ANOTHER SQL Post this week? I got nerd-sniped:
Oooooh, challenge accepted!
So, let’s assume we have a table T
with columns (A, B, C)
like this:
WITH t(a, b, c) AS ( SELECT 'a', 'b', null FROM dual UNION ALL SELECT 'a', null, 'c' FROM dual UNION ALL SELECT 'a', 'b', 'c' FROM dual ) SELECT * FROM t
As expected, this yields:
A B C ----------------- a b a c a b c
Truly exciting.
Now we want to find all those rows that “match” either ('a', 'b', NULL)
or ('a', NULL, 'b')
. Clearly, this should produce the first two rows, right?
A B C ----------------- a b a c
Yes. Now the canonical solution would be to tediously write out the entire predicate as such:
WITH t(a, b, c) AS (...) SELECT * FROM t WHERE (a = 'a' AND b = 'b' AND c IS NULL) OR (a = 'a' AND b IS NULL AND c = 'c')
That’s really boring. Sure, we could have factored out the first, common predicate:
WITH t(a, b, c) AS (...) SELECT * FROM t WHERE a = 'a' AND ( (b = 'b' AND c IS NULL) OR (b IS NULL AND c = 'c') )
That’s certainly better from a performance perspective, but Rafael had a nifty idea. Let’s use row value expressions (tuples) in our predicates:
WITH t(a, b, c) AS (...) SELECT * FROM t WHERE (a, b, c) IN (('a', 'b', NULL), ('a', NULL, 'c'))
Unfortunately this doesn’t yield any results, because nothing is equal to NULL
in SQL (not even NULL
itself). The above query is the same as this one:
WITH t(a, b, c) AS (...) SELECT * FROM t WHERE (a = 'a' AND b = 'b' AND c = NULL /* oops */) OR (a = 'a' AND b = NULL /* oops */ AND c = 'c')
D’oh.
Solutions
The lame one
The canonical solution then would be a really lame (but perfectly valid) one. Encode NULL
to be some “impossible” string value. Rafael suggested yolo
. Fair enough.
This works:
WITH t(a, b, c) AS (...) SELECT * FROM t WHERE (a, NVL(b, 'yolo'), NVL(c, 'yolo')) IN (('a', 'b', 'yolo'), ('a', 'yolo', 'c'))
All we have to do now is to always remember the term yolo
which means “NULL
, but not NULL
thank you SQL”
The hipster one
But wait! SQL is painstakingly inconsistent when it comes to NULL
. See, NULL
really means UNKNOWN
in three-valued logic, and this means, we never know if SQL abides to its own rules.
Come in INTERSECT
. Like UNION
or EXCEPT
(MINUS
) in Oracle, as well as SELECT DISTINCT
, these set operations handle two NULL
values as NOT DISTINCT
. Yes, they’re not equal but also not distinct. Whatever. Just remember: That’s how it is
So, we can write this hipster solution to Rafael’s problem:
WITH t(a, b, c) AS (...) SELECT * FROM t WHERE EXISTS ( SELECT a, b, c FROM dual INTERSECT ( SELECT 'a', 'b', null FROM dual UNION ALL SELECT 'a', null, 'c' FROM dual ) )
We create an intersection of the tuple (a, b, c)
, the left side of Rafael’s IN
predicate, and the desired values on the right side of the IN
predicate, and we’re done.
Clearly less tedious than writing the original predicates, right? (We won’t look into performance this time)
Cheers, and a happy weekend.
Filed under: sql Tagged: EXISTS predicate, in predicate, INTERSECT, null, Oracle, Set operations, sql, Three-valued-logic
