Quantcast
Viewing all articles
Browse latest Browse all 426

How to Execute a SQL Query Only if Another SQL Query has no Results

I stumbled upon an interesting question on Stack Overflow recently. A user wanted to query a table for a given predicate. If that predicate returns no rows, they wanted to run another query using a different predicate. Preferably in a single query.

Challenge accepted!

Canonical Idea: Use a Common Table Expression

We’re querying the Sakila database and we’re trying to find films of length 120 minutes. If there are no such films, then let’s find films of length 130 minutes. The following query is formally correct and runs without any adaptations on all of Oracle, PostgreSQL and SQL Server (and probably on other DBs too, as it’s pretty standard):

WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
)

How does it work?

The common table expression (WITH clause) wraps the first query that we want to execute no matter what. We then select from the first query, and use UNION ALL to combine the result with the result of the second query, which we’re executing only if the first query didn’t yield any results (through NOT EXISTS). We’re hoping here that the database will be smart enough to run the existence check on a pre-calculated set from the first subquery, in order to be able to avoid running the second subquery.

Let’s see, which database actually does this.

PostgreSQL

Running EXPLAIN ANALYZE

EXPLAIN ANALYZE
WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
)

… we can see the following plan:

Append  (cost=68.50..137.26 rows=15 width=561) (actual time=0.052..0.300 rows=9 loops=1)
  CTE r
    ->  Seq Scan on film film_1  (cost=0.00..68.50 rows=9 width=394) (actual time=0.047..0.289 rows=9 loops=1)
          Filter: (length = 120)
          Rows Removed by Filter: 991
  ->  CTE Scan on r  (cost=0.00..0.18 rows=9 width=672) (actual time=0.051..0.297 rows=9 loops=1)
  ->  Result  (cost=0.02..68.52 rows=6 width=394) (actual time=0.002..0.002 rows=0 loops=1)
        One-Time Filter: (NOT $1)
        InitPlan 2 (returns $1)
          ->  CTE Scan on r r_1  (cost=0.00..0.18 rows=9 width=0) (actual time=0.000..0.000 rows=1 loops=1)
        ->  Seq Scan on film  (cost=0.00..68.50 rows=6 width=394) (never executed)
              Filter: (length = 130)
Planning time: 0.952 ms
Execution time: 0.391 ms

So, indeed, the database seems to be smart enough to avoid the second query, because the first one does yield 9 rows.

Can we see this in a benchmark as well? In principle, the complete query should take about as much time in a benchmark as the Common Table Expression alone. Here’s the benchmark logic:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 2000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 120
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r,
      (clock_timestamp() - v_ts);
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 120
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r,
      (clock_timestamp() - v_ts);
    RAISE INFO '';
  END LOOP;
END$$;

The result is:

INFO:  Run 1, Statement 1: 00:00:00.310325
INFO:  Run 1, Statement 2: 00:00:00.427744

INFO:  Run 2, Statement 1: 00:00:00.303202
INFO:  Run 2, Statement 2: 00:00:00.33568

INFO:  Run 3, Statement 1: 00:00:00.323699
INFO:  Run 3, Statement 2: 00:00:00.339835

INFO:  Run 4, Statement 1: 00:00:00.301084
INFO:  Run 4, Statement 2: 00:00:00.343838

INFO:  Run 5, Statement 1: 00:00:00.356343
INFO:  Run 5, Statement 2: 00:00:00.359891

As you can see, the second statement is consistently slower by around 5% – 10%. So we can safely say, the second subquery looking for length = 130 is not executed, but there’s still some overhead compared to making a decision in a client application to avoid that second subquery entirely. My guess here is that this is due to PostgreSQL’s Common Table Expression (CTE) being “optimisation fences”, i.e. the CTE is materialised every time. See also:
https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/

What about the inverse case?

In the above benchmark, we’ve measured how much time it takes when the first query succeeds (and the second query should be avoided). What about the inverse case, where the first query doesn’t match any rows and we have to run another query?

Benchmark time!

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 2000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 1200
      ) LOOP
        NULL;
      END LOOP;
      FOR rec IN (
        SELECT * FROM film WHERE length = 130
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r,
      (clock_timestamp() - v_ts);
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 1200
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r,
      (clock_timestamp() - v_ts);
    RAISE INFO '';
  END LOOP;
END$$;

The result is roughly the same:

INFO:  Run 1, Statement 1: 00:00:00.680222
INFO:  Run 1, Statement 2: 00:00:00.696036

INFO:  Run 2, Statement 1: 00:00:00.673141
INFO:  Run 2, Statement 2: 00:00:00.709034

INFO:  Run 3, Statement 1: 00:00:00.626873
INFO:  Run 3, Statement 2: 00:00:00.679469

INFO:  Run 4, Statement 1: 00:00:00.619584
INFO:  Run 4, Statement 2: 00:00:00.639092

INFO:  Run 5, Statement 1: 00:00:00.616275
INFO:  Run 5, Statement 2: 00:00:00.675317

A slight overhead in the single query case.

But what’s this? We didn’t even have an index on the LENGTH column. Let’s add one!

Now, the result is very different. Query 1 succeeds:

INFO:  Run 1, Statement 1: 00:00:00.055835
INFO:  Run 1, Statement 2: 00:00:00.093982

INFO:  Run 2, Statement 1: 00:00:00.038817
INFO:  Run 2, Statement 2: 00:00:00.084092

INFO:  Run 3, Statement 1: 00:00:00.041911
INFO:  Run 3, Statement 2: 00:00:00.078062

INFO:  Run 4, Statement 1: 00:00:00.039367
INFO:  Run 4, Statement 2: 00:00:00.081752

INFO:  Run 5, Statement 1: 00:00:00.039983
INFO:  Run 5, Statement 2: 00:00:00.081227

Query 1 fails:

INFO:  Run 1, Statement 1: 00:00:00.075469
INFO:  Run 1, Statement 2: 00:00:00.081766

INFO:  Run 2, Statement 1: 00:00:00.058276
INFO:  Run 2, Statement 2: 00:00:00.079613

INFO:  Run 3, Statement 1: 00:00:00.060492
INFO:  Run 3, Statement 2: 00:00:00.080672

INFO:  Run 4, Statement 1: 00:00:00.05877
INFO:  Run 4, Statement 2: 00:00:00.07936

INFO:  Run 5, Statement 1: 00:00:00.057584
INFO:  Run 5, Statement 2: 00:00:00.085798

Oracle

In Oracle, I couldn’t find any difference in execution speed (see below). The plan of a combined query also contains an element that prevents the execution of the second subquery. In this case, I’m using the /*+GATHER_PLAN_STATISTICS*/ hint to make sure we get actual execution values / times in our execution plan:

WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT /*+GATHER_PLAN_STATISTICS*/ * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
);

SELECT p.*
FROM (
  SELECT *
  FROM v$sql
  WHERE upper(sql_text) LIKE '%LENGTH = 120%'
  ORDER BY last_active_time DESC
  FETCH NEXT 1 ROW ONLY
) s
CROSS APPLY TABLE(dbms_xplan.display_cursor(
  sql_id => s.sql_id,
  format => 'ALLSTATS LAST'
)) p;
---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      9 |
|   1 |  UNION-ALL          |      |      1 |        |      9 |
|*  2 |   TABLE ACCESS FULL | FILM |      1 |      7 |      9 |
|*  3 |   FILTER            |      |      1 |        |      0 |
|*  4 |    TABLE ACCESS FULL| FILM |      0 |      7 |      0 |
|*  5 |    TABLE ACCESS FULL| FILM |      1 |      2 |      1 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("LENGTH"=120)
   3 - filter( IS NULL)
   4 - filter("LENGTH"=130)
   5 - filter("LENGTH"=120)

While the estimates are off just as in PostgreSQL (an error that can propagate, see conclusion), the actual rows for the second subquery is zero, and the second subquery is run zero times (“Starts”), because we don’t have to really access it at all. Excellent. Exactly what we expected!

Here, I’ve finally created a benchmark that anonymises the results properly by normalising them in order to comply with Oracle’s forbidding of publishing benchmark results. The fastest execution time is simply 1, and the other execution times are multiples of that value:

SET SERVEROUTPUT ON
CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 120
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    INSERT INTO results VALUES (r, 1,
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 120
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    INSERT INTO results VALUES (r, 2,
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;

  FOR rec IN (
    SELECT
      run, stmt,
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(5, 4)) ratio
    FROM results
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run ||
      ', Statement ' || rec.stmt ||
      ' : ' || rec.ratio);
  END LOOP;
END;
/

DROP TABLE results;

The result being (query 1 succeeds, no index):

Run 1, Statement 1 : 1
Run 1, Statement 2 : 1.26901

Run 2, Statement 1 : 1.10218
Run 2, Statement 2 : 1.08792

Run 3, Statement 1 : 1.26038
Run 3, Statement 2 : 1.09426

Run 4, Statement 1 : 1.2245
Run 4, Statement 2 : 1.10829

Run 5, Statement 1 : 1.07164
Run 5, Statement 2 : 1.18562

Or in the inverse case (query 1 fails, no index):

Run 1, Statement 1 : 1
Run 1, Statement 2 : 1.17871

Run 2, Statement 1 : 1.07377
Run 2, Statement 2 : 1.12489

Run 3, Statement 1 : 1.05745
Run 3, Statement 2 : 1.13711

Run 4, Statement 1 : 1.11118
Run 4, Statement 2 : 1.23508

Run 5, Statement 1 : 1.08535
Run 5, Statement 2 : 1.11271

Adding an index doesn’t change much (query 1 succeeds):

Run 1, Statement 1 : 1.20699
Run 1, Statement 2 : 1.28221

Run 2, Statement 1 : 1
Run 2, Statement 2 : 1.21174

Run 3, Statement 1 : 1.0054
Run 3, Statement 2 : 1.2643

Run 4, Statement 1 : 1.0491
Run 4, Statement 2 : 1.31103

Run 5, Statement 1 : 1.02547
Run 5, Statement 2 : 1.23192

Yet, when query 1 fails:

Run 1, Statement 1 : 1.56287
Run 1, Statement 2 : 1.09471

Run 2, Statement 1 : 1.22219
Run 2, Statement 2 : 1.11227

Run 3, Statement 1 : 1.19739
Run 3, Statement 2 : 1.03929

Run 4, Statement 1 : 1.13503
Run 4, Statement 2 : 1

Run 5, Statement 1 : 1.14289
Run 5, Statement 2 : 1.01919

This time, the combined query is a bit faster!

As can be seen, both queries are executed in roughly the same time on Oracle 12c although again the single query seems to be a little bit slower, but not always. Which is an important reminder to do benchmarking properly! Meaning:

  • Repeat benchmarks several times
  • Beware of warmup penalties (the first run is often the slowest)
  • Beware of excessive caching effects in benchmarks
  • Don’t trust performance differences that aren’t significant
  • Don’t compile any Scala code or chat on Slack while benchmarking. Your system should be idle, otherwise
  • Remember to benchmark the right data set. We only have 600 films in this table. What would happen with 60 million films?

SQL Server

Same exercise again:

DECLARE @ts DATETIME;
DECLARE @repeat INT = 2000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy VARCHAR;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

DECLARE @results TABLE (
  run     INT,
  stmt    INT,
  elapsed DECIMAL
);

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR
    SELECT title FROM film WHERE length = 120;

  SET @s2 = CURSOR FOR
    WITH r AS (
      SELECT * FROM film WHERE length = 120
    )
    SELECT title FROM r
    UNION ALL
    SELECT title FROM film
    WHERE length = 130
    AND NOT EXISTS (
      SELECT * FROM r
    );

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  INSERT INTO @results VALUES (@r, 2, DATEDIFF(ms, @ts, current_timestamp));

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  INSERT INTO @results VALUES (@r, 1, DATEDIFF(ms, @ts, current_timestamp));
END;

SELECT 'Run ' + CAST(run AS VARCHAR) +
  ', Statement ' + CAST(stmt AS VARCHAR) +
  ': ' + CAST(CAST(elapsed / MIN(elapsed) OVER() AS DECIMAL(10, 5)) AS VARCHAR)
FROM @results;

The result, this time, is more drastic (no index, query 1 succeeds):

Run 1, Statement 1: 1.07292
Run 1, Statement 2: 1.35000

Run 2, Statement 1: 1.07604
Run 2, Statement 2: 1.40625

Run 3, Statement 1: 1.08333
Run 3, Statement 2: 1.40208

Run 4, Statement 1: 1.09375
Run 4, Statement 2: 1.34375

Run 5, Statement 1: 1.00000
Run 5, Statement 2: 1.46458

There is a 30% – 40% overhead for the CTE solution over the two query solution. If we don’t find any rows in the first query (no index):

Run 1, Statement 1: 1.08256
Run 1, Statement 2: 1.27546

Run 2, Statement 1: 1.16512
Run 2, Statement 2: 1.27778

Run 3, Statement 1: 1.00000
Run 3, Statement 2: 1.26235

Run 4, Statement 1: 1.04167
Run 4, Statement 2: 1.26003

Run 5, Statement 1: 1.05401
Run 5, Statement 2: 1.34259

… then the difference is slightly less drastic but still clear. The reason here is that SQL Server doesn’t avoid the unnecessary subquery:

Image may be NSFW.
Clik here to view.

Too bad! (Note I was using SQL Server 2014. Perhaps in 2016, this optimisation is implemented)

Note, you can trust me that adding an index doesn’t change much in this case.

Conclusion

We’ve seen that we can easily solve the original problem with SQL only: Select some data from a table using predicate A, and if we don’t find any data for predicate A, then try finding data using predicate B from the same table.

Oracle and PostgreSQL can both optimise away the unnecessary query 2 by inserting a “probe” in their execution plans that knows whether the query 2 needs to be executed or not. In Oracle, we’ve even seen a situation where the combined query outperforms two individual queries. SQL Server 2014 surprisingly does not have such an optimisation.

While the performance impact was negligible in all benchmarks (even in SQL Server), we should be careful with these kinds of queries and not entirely rely on the optimiser to “get it right”. In all three databases, the cardinality estimates were off. We’re working with small data sets, but if data sets grow larger, and queries like the above are embedded in more complex queries, then the wrong cardinality estimates can easily produce wrong execution plans (e.g. favouring hash join over nested loop joins because of a high number of estimated rows). An example of this was given in a previous blog post.

Nevertheless, we can get quite far with SQL, without resorting to procedural client languages and if I had conducted my benchmark with a JDBC client instead of procedural blocks directly inside of the database, perhaps the single query would have outperformed the double query case – at least in those cases where query 1 yielded no rows and query 2 had to be executed from a remote client. Probably in Oracle.

Ultimately, I can only repeat myself. Measure! Measure! Measure! There’s no point in guessing. Truth can only be found by measuring actual executions.


Filed under: sql Tagged: Common Table Expression, Oracle, PostgreSQL, sql, SQL Performance, SQL Server Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.
Image may be NSFW.
Clik here to view.

Viewing all articles
Browse latest Browse all 426

Trending Articles