A common problem with dynamic SQL is parsing performance in production. To make matters worse, many developers have no access to production environments, so they are unaware of the problem (even though the topic is hardly new). What exactly is the problem?
Execution plan caches
Most databases these days ship with an execution plan cache (Oracle calls it the cursor cache), where previously parsed SQL statements are stored along with their execution plan(s) for reuse. This is the main reason why bind variables are so important (the other being SQL injection prevention). By using bind variables, we make sure that the database easily recognises an identical SQL statement from a previous execution and can re-execute the previously found execution plan. This is in fact one of my favourite topics from my SQL training.
Let’s see what happens in various databases, if we run the following queries:
-- First, run them with "inline values" or "constant literals"
SELECT first_name, last_name FROM actor WHERE actor_id = 1;
SELECT first_name, last_name FROM actor WHERE actor_id = 2;

-- Then, run the same queries again with bind values
SELECT first_name, last_name FROM actor WHERE actor_id = ?;
SELECT first_name, last_name FROM actor WHERE actor_id = ?;
Note that it doesn’t matter whether the queries are run from JDBC, jOOQ, Hibernate, or the procedural language in the database, e.g. PL/SQL, T-SQL, PL/pgSQL. The result is always the same.
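To make the caching idea concrete, here is a minimal toy model in Python. It assumes a cache keyed by the exact SQL text, which is a simplification and not how any particular database implements its plan cache. The class `ToyPlanCache` and its fields are made up for this illustration.

```python
# Toy model of an execution plan cache keyed by the exact SQL text.
# Illustration only: real databases do far more work than this.

class ToyPlanCache:
    def __init__(self):
        self.plans = {}        # SQL text -> cached "plan"
        self.executions = {}   # SQL text -> number of executions
        self.hard_parses = 0   # how often a plan had to be computed

    def execute(self, sql):
        if sql not in self.plans:
            # Cache miss: the statement must be parsed and planned
            self.hard_parses += 1
            self.plans[sql] = "plan for: " + sql
        self.executions[sql] = self.executions.get(sql, 0) + 1
        return self.plans[sql]


# Inline values: every actor_id produces a different SQL text
literals = ToyPlanCache()
for actor_id in (1, 2):
    literals.execute(
        "SELECT first_name, last_name FROM actor WHERE actor_id = %d" % actor_id)

# Bind values: the value travels separately, so the text never changes
binds = ToyPlanCache()
for actor_id in (1, 2):
    binds.execute("SELECT first_name, last_name FROM actor WHERE actor_id = ?")

print(literals.hard_parses)  # 2
print(binds.hard_parses)     # 1
```

In this model, the two literal queries each need their own plan, whereas the bind variable version is planned once and then reused.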
Let’s run an example
I’ll run the following examples using Oracle only. Other databases behave in a similar fashion.
We’ll run the following script, which includes the above queries, and a query to fetch all the execution plans:
SELECT first_name, last_name FROM actor WHERE actor_id = 1;
SELECT first_name, last_name FROM actor WHERE actor_id = 2;

SET SERVEROUTPUT ON
DECLARE
  v_first_name actor.first_name%TYPE;
  v_last_name  actor.last_name%TYPE;
BEGIN
  FOR i IN 1 .. 2 LOOP
    SELECT first_name, last_name
    INTO v_first_name, v_last_name
    FROM actor
    WHERE actor_id = i;

    dbms_output.put_line(v_first_name || ' ' || v_last_name);
  END LOOP;
END;
/

SELECT s.sql_id, p.*
FROM v$sql s, TABLE (
  dbms_xplan.display_cursor (
    s.sql_id, s.child_number, 'ALLSTATS LAST'
  )
) p
WHERE lower(s.sql_text) LIKE '%actor_id = %';
The output is:
SQL_ID  90rk04nhr45yz, child number 0
-------------------------------------
SELECT FIRST_NAME, LAST_NAME FROM ACTOR WHERE ACTOR_ID = :B1

Plan hash value: 457831946

---------------------------------------------------------
| Id  | Operation                   | Name     | E-Rows |
---------------------------------------------------------
|   0 | SELECT STATEMENT            |          |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| ACTOR    |      1 |
|*  2 |   INDEX UNIQUE SCAN         | PK_ACTOR |      1 |
---------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ACTOR_ID"=:B1)


SQL_ID  283s8m524c9rk, child number 0
-------------------------------------
SELECT first_name, last_name FROM actor WHERE actor_id = 2

Plan hash value: 457831946

---------------------------------------------------------
| Id  | Operation                   | Name     | E-Rows |
---------------------------------------------------------
|   0 | SELECT STATEMENT            |          |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| ACTOR    |      1 |
|*  2 |   INDEX UNIQUE SCAN         | PK_ACTOR |      1 |
---------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ACTOR_ID"=2)


SQL_ID  3mks715670mqw, child number 0
-------------------------------------
SELECT first_name, last_name FROM actor WHERE actor_id = 1

Plan hash value: 457831946

---------------------------------------------------------
| Id  | Operation                   | Name     | E-Rows |
---------------------------------------------------------
|   0 | SELECT STATEMENT            |          |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| ACTOR    |      1 |
|*  2 |   INDEX UNIQUE SCAN         | PK_ACTOR |      1 |
---------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ACTOR_ID"=1)
The plans are always the same, and since we’re accessing primary key values, we’ll always get the same cardinalities, so nothing seems wrong with any individual execution. But notice how the predicate information is slightly different. When querying for a constant value, the predicate includes that value right there, whereas with the bind variable, the plan doesn’t tell us what the predicate value is. This is perfectly expected, because we want to reuse that plan for both executions of the query.
With another query, we can see the number of executions of each statement:
SELECT sql_id, sql_text, executions
FROM v$sql
WHERE sql_id IN (
  '90rk04nhr45yz',
  '283s8m524c9rk',
  '3mks715670mqw'
);

SQL_ID         SQL_TEXT                                                      EXECUTIONS
---------------------------------------------------------------------------------------
90rk04nhr45yz  SELECT FIRST_NAME, LAST_NAME FROM ACTOR WHERE ACTOR_ID = :B1           2
283s8m524c9rk  SELECT first_name, last_name FROM actor WHERE actor_id = 2             1
3mks715670mqw  SELECT first_name, last_name FROM actor WHERE actor_id = 1             1
This is where it gets more interesting. In the second case where we used a bind variable (which was generated by PL/SQL, automatically), we could reuse the statement, cache its plan, and run it twice.
Meh, does it matter?
It matters for two reasons:
- Performance of individual executions
- Performance of your entire system
How this affects individual executions
It seems obvious that when we can cache something, the slight overhead of maintaining the cache is small compared to the gain of not having to redo the work whose result is cached. The work in question here is parsing the SQL statement and creating an execution plan for it. Even if the plan is trivial, as in the above examples, calculating it involves overhead.
This overhead can best be shown in a benchmark, a technique that I also display in my SQL training:
SET SERVEROUTPUT ON

-- Don't run these on production
-- But on your development environment, this guarantees clean caches
ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;

CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
  v_first_name actor.first_name%TYPE;
  v_last_name  actor.last_name%TYPE;
BEGIN

  -- Repeat whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;

    -- Statement 1: the value is concatenated into the SQL string
    FOR i IN 1..v_repeat LOOP
      BEGIN
        EXECUTE IMMEDIATE '
          SELECT first_name, last_name
          FROM actor
          WHERE actor_id = ' || i
        INTO v_first_name, v_last_name;
      EXCEPTION
        -- Please forgive me
        WHEN OTHERS THEN NULL;
      END;
    END LOOP;

    INSERT INTO results VALUES (
      r, 1, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;

    -- Statement 2: the value is passed as a bind variable
    FOR i IN 1..v_repeat LOOP
      BEGIN
        EXECUTE IMMEDIATE '
          SELECT first_name, last_name
          FROM actor
          WHERE actor_id = :i'
        INTO v_first_name, v_last_name
        USING i;
      EXCEPTION
        -- Please forgive me
        WHEN OTHERS THEN NULL;
      END;
    END LOOP;

    INSERT INTO results VALUES (
      r, 2, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;

  FOR rec IN (
    SELECT
      run, stmt,
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(10, 5)) ratio
    FROM results
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run ||
      ', Statement ' || rec.stmt ||
      ' : ' || rec.ratio);
  END LOOP;
END;
/

DROP TABLE results;
As always, on the jOOQ blog, we don’t publish actual execution times to comply with license restrictions on benchmark publications, so we’re only comparing each execution with the fastest execution. This is the result of the above:
Run 1, Statement 1 : 83.39893
Run 1, Statement 2 : 1.1685
Run 2, Statement 1 : 3.02697
Run 2, Statement 2 : 1
Run 3, Statement 1 : 2.72028
Run 3, Statement 2 : 1.03996
Run 4, Statement 1 : 2.70929
Run 4, Statement 2 : 1.00866
Run 5, Statement 1 : 2.71895
Run 5, Statement 2 : 1.02198
We can see that from the second run onwards, the SQL version using a bind variable is consistently between 2.5x and 3x as fast as the one not using the bind variable. This overhead is very significant for trivial queries. It might be a bit less so for more complex queries, where the execution itself takes more time compared to the parsing. But it should be obvious that the overhead is a price we do not want to pay. We want the query and its plan to be cached!
Notice also how the very first execution of the benchmark has a very significant overhead, because all of the 2000 queries will have been encountered for the first time before they’re cached for the second run. That’s a price we’re only paying during the first run, though.
How this affects your entire system
Not only do individual query executions suffer, your entire system does, too. After running the benchmark a few times, these are the execution statistics I’m getting from the Oracle cursor cache:
SELECT
  count(*),
  avg(executions),
  min(executions),
  max(executions)
FROM v$sql
WHERE lower(sql_text) LIKE '%actor_id = %'
AND sql_text NOT LIKE '%v$sql%';
Yielding:
count  avg     min  max
2001   9.9950  5    10000
There are now 2001 queries in my cache: the one with the bind variable, which has been executed 10000 times (the benchmark was repeated 5x, with 2000 executions of the query per run), and 2000 queries with inline values that have each been executed 5x (once per benchmark run).
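The statistics above can be double-checked with a bit of arithmetic. The following sketch only recomputes the count and the average from the benchmark parameters (5 runs, 2000 repetitions). The variable names are made up for this illustration.

```python
runs = 5        # the benchmark was repeated 5x
repeat = 2000   # v_repeat in the benchmark script

# Statement 1: one cached statement per distinct actor_id literal,
# each executed once per benchmark run
literal_statements = repeat
literal_executions = runs

# Statement 2: a single cached statement with a bind variable,
# executed "repeat" times per benchmark run
bind_statements = 1
bind_executions = runs * repeat

count = literal_statements + bind_statements
total = (literal_statements * literal_executions
         + bind_statements * bind_executions)
avg = total / count

print(count)                # 2001
print(round(avg, 4))        # 9.995
print(literal_executions)   # 5 (the minimum)
print(bind_executions)      # 10000 (the maximum)
```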
If instead we run the query 20000 times (and remember, the query run corresponds to the filtered ACTOR_ID), then the result will be vastly different!
Run 1, Statement 1 : 86.85862
Run 1, Statement 2 : 1.13546
Run 2, Statement 1 : 78.39842
Run 2, Statement 2 : 1.01298
Run 3, Statement 1 : 72.45254
Run 3, Statement 2 : 1
Run 4, Statement 1 : 73.78357
Run 4, Statement 2 : 2.24365
Run 5, Statement 1 : 84.89842
Run 5, Statement 2 : 1.143
Oh, my. Why has this happened? Let’s check again the cursor cache stats:
SELECT
  count(*),
  avg(executions),
  min(executions),
  max(executions)
FROM v$sql
WHERE lower(sql_text) LIKE '%actor_id = %'
AND sql_text NOT LIKE '%v$sql%';
Yielding:
count  avg     min  max
15738  3.4144  1    20000
This is a vastly different result. We don’t have all of our 20000 queries in the cursor cache, only some of them. This means that some statements have been purged from the cache to make room for new ones (which is reasonable behaviour for any cache).
But purging them is problematic too, because the way the benchmark was designed, they will reappear in the second, third, fourth, and fifth run, so we should have kept them in the cache. And since we’re executing every query the same number of times, there really wasn’t any way of identifying a “more reasonable” (i.e. rare) query to purge.
Resources in a system are always limited, and so is the cursor cache size. The more distinct queries we’re running in a system, the less they can profit from the cursor cache.
This is not a problem for rarely run queries, including reports, analytics, or some special queries run only by some very few users. But the queries that are being run all the time should always be cached.
I cannot stress enough how serious this can be:
In the above case, what should have been a single query in the cursor cache exploded into 20000 queries, shoving a lot of much more useful queries out of the cache. Not only does this slow down the execution of this particular query, it will purge tons of completely unrelated queries from the cache, thus slowing down the entire system by similar factors. If everyone is slowed down drastically, everyone will start to queue up to have their SQL queries parsed, and you can bring down your entire server with this problem (in the worst case)!
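Both effects can be reproduced in a toy simulation. The sketch below assumes a plain least-recently-used (LRU) cache with a fixed number of entries, which is a simplification (Oracle’s shared pool management is more sophisticated). The functions `run_workload` and `benchmark_statements`, the cache sizes, and the unrelated `film` query are all made up for this illustration.

```python
from collections import OrderedDict


def run_workload(capacity, statements):
    """Replay statements against a toy LRU cache; return hits per statement."""
    cache = OrderedDict()
    hits = {}
    for sql in statements:
        if sql in cache:
            cache.move_to_end(sql)           # mark as most recently used
            hits[sql] = hits.get(sql, 0) + 1
        else:
            cache[sql] = True                # "hard parse" and store
            if len(cache) > capacity:
                cache.popitem(last=False)    # evict least recently used
    return hits


def benchmark_statements(distinct, runs):
    # Same shape as the benchmark: every literal once per run, in order
    return ["... WHERE actor_id = %d" % i
            for _ in range(runs) for i in range(1, distinct + 1)]


# Everything fits: each statement misses once, then hits in runs 2 to 5
fits = run_workload(3000, benchmark_statements(2000, 5))
print(sum(fits.values()))     # 8000

# More distinct statements than cache entries: nothing is ever reused
thrash = run_workload(15000, benchmark_statements(20000, 5))
print(sum(thrash.values()))   # 0

# Collateral damage: an unrelated, properly parameterised query
unrelated = "SELECT title FROM film WHERE film_id = ?"

# On a quiet system, it is parsed once and reused 19 times
quiet = run_workload(500, [unrelated] * 20)
print(quiet[unrelated])       # 19

# Interleaved with a flood of literal statements, it is evicted every time
flooded = []
for i in range(1, 20001):
    flooded.append("... WHERE actor_id = %d" % i)
    if i % 1000 == 0:
        flooded.append(unrelated)
noisy = run_workload(500, flooded)
print(noisy.get(unrelated, 0))  # 0
```

In this model, the well-behaved query pays the full parsing price on every execution, only because some other query floods the cache with single-use statements.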
Workaround
Some databases can be forced to treat constant literals as bind variables. In Oracle, you can specify CURSOR_SHARING = FORCE as a “quick fix”. In SQL Server, it’s called forced parameterization.
But this approach has its own limitations and overhead, as additional parsing work needs to be performed every time to recognise constant literals and replace them with bind variables. This overhead will then apply to all queries!
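Conceptually, such a feature has to scan every incoming statement for literals before it can even look up the cache. The following is a crude regex-based sketch of that idea. It is not how Oracle or SQL Server implement it, it ignores comments, quoted identifiers, and many other cases, and the names `LITERAL` and `normalise` are made up.

```python
import re

# Matches single-quoted string literals (with '' escapes) or plain numbers.
LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalise(sql):
    """Return (sql_with_placeholders, extracted_literals)."""
    values = []

    def replace(match):
        values.append(match.group(0))
        return "?"

    return LITERAL.sub(replace, sql), values


sql_1, values_1 = normalise(
    "SELECT first_name, last_name FROM actor WHERE actor_id = 1")
sql_2, values_2 = normalise(
    "SELECT first_name, last_name FROM actor WHERE actor_id = 2")

# Both statements now share one text, and thus one cache entry
print(sql_1 == sql_2)  # True
print(sql_1)           # SELECT first_name, last_name FROM actor WHERE actor_id = ?
print(values_1, values_2)  # ['1'] ['2']
```

Even in this simplistic form, the extra pass over the SQL string has to be made for every statement, including those that already use bind variables.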
Conclusion
Bind variables are very important for SQL performance. After tons of training people to use them for SQL injection reasons (which is already a good thing), we have now seen how important they are also for performance reasons.
Not using a bind variable for values like IDs, timestamps, names, or anything that is uniformly distributed and has many values in your column will produce the above problem. The exception is columns with only very few distinct values (like true/false flags, codes that encode a given state, etc.), in which case a constant literal can be a reasonable option (follow-up blog post coming soon).
But bind variables should always be your default choice. If you’re using a client-side tool like jOOQ or Hibernate, bind variables tend to be the default, and you’re fine. If you’re using a stored procedural language like PL/SQL or T-SQL, bind variables are generated automatically for static SQL and you’re fine as well (dynamic SQL built by string concatenation, as in the first benchmark statement, is a different story). But if you’re using JDBC or any JDBC wrapper like Spring’s JdbcTemplate, or any other string based API, like JPA’s native query API, then you are on your own again, and you must explicitly take care of using bind variables every time you have variable input.
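The same applies to string based APIs in other languages. As a minimal sketch, here is the difference in Python’s DB-API, using the standard library’s sqlite3 module as a stand-in for a real server (an assumption made only to keep the example self-contained). The function names and the two sample rows are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor ("
    "actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO actor VALUES (?, ?, ?)",
    [(1, "PENELOPE", "GUINESS"), (2, "NICK", "WAHLBERG")])


def find_actor_concatenated(actor_id):
    # Anti-pattern: every actor_id produces a different SQL string
    sql = ("SELECT first_name, last_name FROM actor WHERE actor_id = "
           + str(int(actor_id)))
    return conn.execute(sql).fetchone()


def find_actor_bound(actor_id):
    # One constant SQL string; the value is passed separately as a bind value
    sql = "SELECT first_name, last_name FROM actor WHERE actor_id = ?"
    return conn.execute(sql, (actor_id,)).fetchone()


print(find_actor_concatenated(1))  # ('PENELOPE', 'GUINESS')
print(find_actor_bound(1))         # ('PENELOPE', 'GUINESS')
```

Both functions return the same rows, but only the second one sends the same SQL string to the database for every actor_id.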
And in our next article, we’ll see how bind variables are actually not enough, when using dynamic IN lists, another topic that I borrowed from my SQL training.