Tuning SQL isn’t always easy, and it takes a lot of practice to recognise how any given query can be optimised. One of the most important slides of my SQL training is the one summarising “how to be fast”:
Some of these bullets have already been covered on this blog. For instance, avoiding needless, mandatory work: client code runs queries or parts of queries that aren't really necessary (e.g. selecting too many columns: "needless"), but the database cannot prove they're needless, so it has to execute them anyway: "mandatory".
But as with many other performance related topics, one key message is not to guess, but to measure! Or, in other words, not to optimise prematurely, but to optimise actual problems.
SQL is full of myths
SQL is a 4GL (fourth-generation programming language) and, as such, has always been a cool, convenient way to express data-related constraints and queries. But the declarative nature of the language also often means that programmers are effectively looking into a crystal ball. A lot of people have blogged about a lot of half-true discoveries that might have been correct in some context and at some point in time (this blog is no exception).
For instance:
- Are correlated subqueries slower than their LEFT JOIN equivalents?
- Are derived tables faster than views or common table expressions?
- Is COUNT(*) faster than COUNT(1)?
Tons of myths!
Measure your queries
To bust a myth, if you have good reasons to think that a differently written, but semantically equivalent query might be faster (on your database), you should measure. Don’t even trust any execution plan, because ultimately, what really counts is the wall clock time in your production system.
If you can measure your queries in production, that’s perfect. But often, you cannot – but you don’t always have to. One way to compare two queries with each other is to benchmark them by executing each query hundreds or even thousands of times in a row.
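The skeleton of such a benchmark is simple: run each candidate query many times in a loop, consume all its rows, and compare wall clock times. As a minimal client-side sketch (here in Python with an in-memory SQLite database standing in for your real schema; the table `t` and the two "equivalent" queries are made up purely for illustration):

```python
import sqlite3
import time

def benchmark(conn, sql, repeat=1000):
    """Run the query `repeat` times, consuming all rows, and return total seconds."""
    start = time.perf_counter()
    for _ in range(repeat):
        for _ in conn.execute(sql):
            pass  # consume the entire result set, don't just execute the query
    return time.perf_counter() - start

# Hypothetical toy schema, standing in for your production tables
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(i, "x") for i in range(1000)])

# Two semantically equivalent ways to write the "same" (made-up) query
q1 = "SELECT id FROM t WHERE v = 'x' ORDER BY id"
q2 = "SELECT id FROM t WHERE v IN ('x') ORDER BY id"

# Repeat the whole benchmark several times to smooth out warm-up effects
for run in range(1, 4):
    print(f"Run {run}, q1: {benchmark(conn, q1):.3f}s, q2: {benchmark(conn, q2):.3f}s")
```

The in-database versions shown further down follow exactly this pattern, just in each database's procedural language, which avoids client/server round trips.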
As any technique, benchmarking has pros and cons. Here is a non-exhaustive list:
Pros
- Easy to do (see examples below)
- Easy to reproduce, also on different environments
- Easy to quickly get an idea in terms of orders of magnitude difference
Cons
- Not actually measuring productive situations (no one runs the same query thousands of times in a row, without any other queries in parallel)
- Queries may profit from unrealistic caching due to heavy repetition
- “Real query” might be dynamic, so the “same query” might really manifest itself in dozens of different productive queries
But if you’re fine with the cons above, the pros might outweigh them, for instance, if you want to find out whether a correlated subquery is slower than its LEFT JOIN equivalent for a given query. Note the emphasis here: even if you find out it’s slower for that given query, it might be faster for other queries. Never jump to generalised rules before measuring again!
For instance, consider these two equivalent queries that run on the Sakila database. Both versions try to find those actors whose last name starts with the letter A and counts their corresponding films:
LEFT JOIN

```sql
SELECT first_name, last_name, count(fa.actor_id) AS c
FROM actor a
LEFT JOIN film_actor fa
ON a.actor_id = fa.actor_id
WHERE last_name LIKE 'A%'
GROUP BY a.actor_id, first_name, last_name
ORDER BY c DESC
```
Correlated subquery

```sql
SELECT first_name, last_name, (
  SELECT count(*)
  FROM film_actor fa
  WHERE a.actor_id = fa.actor_id
) AS c
FROM actor a
WHERE last_name LIKE 'A%'
ORDER BY c DESC
```
The result is always the same for both queries: the actors whose last name starts with A, along with their film counts.
The queries have different execution plans on PostgreSQL, Oracle, and SQL Server, as can be seen below:
PostgreSQL LEFT JOIN
(Plan looks “better”)
PostgreSQL correlated subquery
(Plan looks “worse”)
Oracle LEFT JOIN
(Plan looks “more complicated”)
Oracle correlated subquery
(Plan looks “simpler”)
SQL Server LEFT JOIN
(Plan looks “reasonable”)
SQL Server correlated subquery
(Plan looks… geez, where’s my correlated subquery? It’s been transformed to a LEFT JOIN!)
Huh, as you can see, in SQL Server, both queries produce the exact same plan (as they should, because the queries are really equivalent). But not all databases recognise this and/or optimise this. At least, that’s what the estimated plans suggest.
Also, don’t jump to the conclusion that if the cost of one plan is lower, then it’s a better plan than an alternative. Costs can only really be compared between alternative plans for the same query. E.g. in the Oracle example, we had both a HASH JOIN and a NESTED LOOP JOIN in a single plan, because Oracle 12c may collect runtime statistics and switch plans in flight, thanks to the Oracle 12c Adaptive Query Optimization features.
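If you want to see for yourself how a database plans the two variants, every database offers some form of EXPLAIN. Here is a sketch using Python's built-in sqlite3 module (not one of the databases benchmarked below) and a made-up miniature version of the Sakila actor/film_actor tables:

```python
import sqlite3

# Hypothetical miniature stand-in for Sakila's actor/film_actor tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    CREATE INDEX idx_fa_actor ON film_actor (actor_id);
""")
conn.executemany("INSERT INTO actor VALUES (?, ?, ?)",
                 [(1, "PENELOPE", "ANDERSON"), (2, "NICK", "WAHLBERG")])
conn.executemany("INSERT INTO film_actor VALUES (?, ?)", [(1, 1), (1, 2), (2, 3)])

left_join = """
    SELECT first_name, last_name, count(fa.actor_id) AS c
    FROM actor a LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
    WHERE last_name LIKE 'A%'
    GROUP BY a.actor_id, first_name, last_name
    ORDER BY c DESC
"""
subquery = """
    SELECT first_name, last_name,
           (SELECT count(*) FROM film_actor fa WHERE a.actor_id = fa.actor_id) AS c
    FROM actor a
    WHERE last_name LIKE 'A%'
    ORDER BY c DESC
"""

# EXPLAIN QUERY PLAN shows SQLite's estimated plan for each variant
for name, sql in [("LEFT JOIN", left_join), ("Correlated subquery", subquery)]:
    print(name)
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("  ", row)
```

Whether the two plans come out identical (as in SQL Server above) or different (as in PostgreSQL and Oracle) depends entirely on the optimiser at hand.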
But let’s ignore all of this and look at actual execution times, instead:
Benchmarking the alternatives
As always, disclaimer: Some commercial databases do not allow for publishing benchmark results without prior written consent. As I never ask for permission, but always ask for forgiveness, I do not have consent, and I’m thus not publishing actual benchmark results.
I have anonymized the benchmark results by introducing hypothetical, non-comparable units of measurement, so you cannot see that PostgreSQL is totally slower than Oracle and/or SQL Server. And you cannot see that SQL Server’s procedural language is totally uglier than PostgreSQL’s and/or Oracle’s.
Legal people.
Solving problems we wouldn’t have without legal people, in the first place
Enough ranting. Some important considerations:
- Ideally, you’ll run benchmarks directly in the database using a procedural language, rather than, e.g., over JDBC, to avoid the network latency incurred by JDBC calls and other undesired side-effects.
- Repeat the benchmarks several times to account for warm-up effects and other random noise, as your OS / file system may be busy with accidental Scala compilation or Slack UI refreshes.
- Be sure to actually consume the entire result set of each query in a loop, rather than just executing the query. Some databases may optimise for lazy cursor consumption (and possibly early abortion). It would be unfair not to consume the entire result set.
PostgreSQL
```sql
DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 10000;
  rec RECORD;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR i IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, count(fa.actor_id) AS c
        FROM actor a
        LEFT JOIN film_actor fa
        ON a.actor_id = fa.actor_id
        WHERE last_name LIKE 'A%'
        GROUP BY a.actor_id, first_name, last_name
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', i, (clock_timestamp() - v_ts);
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, (
          SELECT count(*)
          FROM film_actor fa
          WHERE a.actor_id = fa.actor_id
        ) AS c
        FROM actor a
        WHERE last_name LIKE 'A%'
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', i, (clock_timestamp() - v_ts);
  END LOOP;
END$$;
```
The result is:
```
INFO:  Run 1, Statement 1: 00:00:01.708257
INFO:  Run 1, Statement 2: 00:00:01.252012
INFO:  Run 2, Statement 1: 00:00:02.33151 -- Slack message received here
INFO:  Run 2, Statement 2: 00:00:01.064007
INFO:  Run 3, Statement 1: 00:00:01.638518
INFO:  Run 3, Statement 2: 00:00:01.149005
INFO:  Run 4, Statement 1: 00:00:01.670045
INFO:  Run 4, Statement 2: 00:00:01.230755
INFO:  Run 5, Statement 1: 00:00:01.81718
INFO:  Run 5, Statement 2: 00:00:01.166089
```
As you can see, in all 5 benchmark executions, the version with the correlated subquery seemed to have outperformed the version with the LEFT JOIN in this case by roughly 60%! As this is PostgreSQL and open source, benchmark results are in actual seconds for 10000 query executions. Neat. Let’s move on to…
Oracle
```sql
SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 10000;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, count(fa.actor_id) AS c
        FROM actor a
        LEFT JOIN film_actor fa
        ON a.actor_id = fa.actor_id
        WHERE last_name LIKE 'A%'
        GROUP BY a.actor_id, first_name, last_name
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    dbms_output.put_line('Run ' || r || ', Statement 1 : ' || (SYSTIMESTAMP - v_ts));
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, (
          SELECT count(*)
          FROM film_actor fa
          WHERE a.actor_id = fa.actor_id
        ) AS c
        FROM actor a
        WHERE last_name LIKE 'A%'
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    dbms_output.put_line('Run ' || r || ', Statement 2 : ' || (SYSTIMESTAMP - v_ts));
  END LOOP;
END;
/
```
Gee, check out the difference now (and remember, these are totally not seconds, but a hypothetical unit of measurement, let’s call them Newtons. Or Larrys. Let’s call them Larrys (great idea, Axel)):
```
Run 1, Statement 1 : 07.721731000
Run 1, Statement 2 : 00.622992000
Run 2, Statement 1 : 08.077535000
Run 2, Statement 2 : 00.666481000
Run 3, Statement 1 : 07.756182000
Run 3, Statement 2 : 00.640541000
Run 4, Statement 1 : 07.495021000
Run 4, Statement 2 : 00.731321000
Run 5, Statement 1 : 07.809564000
Run 5, Statement 2 : 00.632615000
```
Wow, the correlated subquery totally outperformed the LEFT JOIN query by an order of magnitude. This is totally insane. Now, check out…
SQL Server
… beautiful procedural language in SQL Server: Transact-SQL. With nice features like:
- Needing to cast INT values to VARCHAR when concatenating them.
- No indexed loop, only WHILE loop
- No implicit cursor loops (instead: DEALLOCATE!)
Oh well. It’s just for a benchmark. So here goes:
```sql
DECLARE @ts DATETIME;
DECLARE @repeat INT = 10000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy1 VARCHAR;
DECLARE @dummy2 VARCHAR;
DECLARE @dummy3 INT;
DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR
    SELECT first_name, last_name, count(fa.actor_id) AS c
    FROM actor a
    LEFT JOIN film_actor fa
    ON a.actor_id = fa.actor_id
    WHERE last_name LIKE 'A%'
    GROUP BY a.actor_id, first_name, last_name
    ORDER BY c DESC

  SET @s2 = CURSOR FOR
    SELECT first_name, last_name, (
      SELECT count(*)
      FROM film_actor fa
      WHERE a.actor_id = fa.actor_id
    ) AS c
    FROM actor a
    WHERE last_name LIKE 'A%'
    ORDER BY c DESC

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy1, @dummy2, @dummy3;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy1, @dummy2, @dummy3;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  PRINT 'Run ' + CAST(@r AS VARCHAR) + ', Statement 1: '
    + CAST(DATEDIFF(ms, @ts, current_timestamp) AS VARCHAR) + 'ms';

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy1, @dummy2, @dummy3;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy1, @dummy2, @dummy3;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  PRINT 'Run ' + CAST(@r AS VARCHAR) + ', Statement 2: '
    + CAST(DATEDIFF(ms, @ts, current_timestamp) AS VARCHAR) + 'ms';
END;
```
And again, remember, these aren’t seconds. Really. They’re … Kilowatts. Yeah, let’s settle with kilowatts.
```
Run 1, Statement 1: 2626
Run 1, Statement 2: 20340
Run 2, Statement 1: 2450
Run 2, Statement 2: 17910
Run 3, Statement 1: 2706
Run 3, Statement 2: 18396
Run 4, Statement 1: 2696
Run 4, Statement 2: 19103
Run 5, Statement 1: 2716
Run 5, Statement 2: 20453
```
Oh my… Wait a second. Now suddenly, the correlated subquery is roughly a factor of 7… more energy consuming (remember: kilowatts). Who would have thought?
Conclusion
This article won’t explain the differences in execution time between the different databases. There are a lot of reasons why a given execution plan will outperform another. There are also a lot of reasons why the same plan (at least what looks like the same plan) really isn’t because a plan is only a description of an algorithm. Each plan operation can still contain other operations that might still be different.
In summary, we can say that in this case (I can’t stress this enough: this isn’t a general rule, it only explains what happens in this case. Don’t create the next SQL myth!), the correlated subquery and the LEFT JOIN performed in the same order of magnitude on PostgreSQL (the subquery being a bit faster), the correlated subquery drastically outperformed the LEFT JOIN on Oracle, whereas the LEFT JOIN drastically outperformed the correlated subquery on SQL Server (despite the plans having been the same!)
This means:
- Don’t trust your initial judgment
- Don’t trust any historic blog posts claiming that A is faster than B
- Don’t trust execution plans
- Don’t trust this blog post here, because it is using incomparable time scales (seconds vs. newtons vs. kilowatts)
- Don’t fully trust your own benchmarks, because you’re not measuring things as they happen in production
And sadly:
- Even for such a simple query, there’s no optimal query for all databases
(and I haven’t even included MySQL in the benchmarks)
BUT
by measuring two alternative, equivalent queries, you may just get an idea what might perform better for your system in case you do have a slow query somewhere. Perhaps this helps.
And now that you’re all hot on the subject, go book our 2 day SQL training, where we have tons of other interesting, myth busting content!
