There are some situations where you would like to have at least one (empty) row in your result set in SQL.
Imagine the following situation. We’re querying the Sakila database for actors and their films:
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (actor_id) JOIN film USING (film_id) ORDER BY 1, 2, 3
yielding something like:
+------------+-----------+---------------------+ | FIRST_NAME | LAST_NAME | TITLE | +------------+-----------+---------------------+ | ... | ... | ... | | ADAM | GRANT | SEABISCUIT PUNK | | ADAM | GRANT | SPLENDOR PATTON | | ADAM | GRANT | TADPOLE PARK | | ADAM | GRANT | TWISTED PIRATES | | ADAM | GRANT | WANDA CHAMBER | | ADAM | HOPPER | BLINDNESS GUN | | ADAM | HOPPER | BLOOD ARGONAUTS | | ADAM | HOPPER | CHAMBER ITALIAN | | ... | ... | ... | +------------+-----------+---------------------+
Now, let’s find actors called SUSAN, and in fact, let’s not care if they played in any films (I’ve added them to the Sakila database for the sake of the example):
SELECT actor_id, first_name, last_name, title FROM actor LEFT JOIN film_actor USING (actor_id) LEFT JOIN film USING (film_id) WHERE first_name = 'SUSAN' ORDER BY 1, 2, 3
Interesting, there are now two actors without any films:
+----------+------------+-----------+-----------------+ | ACTOR_ID | FIRST_NAME | LAST_NAME | TITLE | +----------+------------+-----------+-----------------+ | 110 | SUSAN | DAVIS | TROJAN TOMORROW | | 110 | SUSAN | DAVIS | WASH HEAVENLY | | 110 | SUSAN | DAVIS | WORDS HUNTER | | 201 | SUSAN | DAVIS | | | 202 | SUSAN | SMITH | | +----------+------------+-----------+-----------------+
This worked, because I have changed the JOIN
type from INNER JOIN
to LEFT JOIN
. That’s neat. But what if we hadn’t found any actor called SUSAN
? What if we were looking for SUSANNE
instead?
SELECT actor_id, first_name, last_name, title FROM actor LEFT JOIN film_actor USING (actor_id) LEFT JOIN film USING (film_id) WHERE first_name = 'SUSANNE' ORDER BY 1, 2, 3
Empty. Void. Nothing:
+----------+------------+-----------+-----------------+ | ACTOR_ID | FIRST_NAME | LAST_NAME | TITLE | +----------+------------+-----------+-----------------+
That’s fine in most cases, because the way we wrote this query, we’re expecting:
All actors called Susanne and their films, if any
But what if we wanted to have the same behaviour as we got for Films through LEFT JOIN
also with the actors? I.e. if we wanted this, instead (i.e. a collection with 1..N cardinality):
+----------+------------+-----------+-----------------+ | ACTOR_ID | FIRST_NAME | LAST_NAME | TITLE | | | | | | +----------+------------+-----------+-----------------+
What would you need this for? Well, sometimes, we can simply not handle the depressing sadness of emptiness.
How to do this? We need another LEFT JOIN
prepended to the ACTOR
table, but not just to the ACTOR
table itself, we need to prepend it to everything. E.g. like this:
SELECT actor_id, first_name, last_name, title -- This dummy table will always generate exactly one row FROM ( SELECT 1 a ) a -- This is the original query, including the WHERE clause LEFT JOIN ( SELECT * FROM actor LEFT JOIN film_actor USING (actor_id) LEFT JOIN film USING (film_id) WHERE first_name = 'SUSANNE' ) b ON 1 = 1 ORDER BY 1, 2, 3
The above query is guaranteed to produce at least one row, because the left side of the LEFT JOIN
always produces exactly one row, which is joined to every row on the right side, if there is any row on the right side.
Caveats:
- The
WHERE
clause (and potentially other clauses, likeGROUP BY
) must now go inside of the new derived tableB
. Otherwise, we’d be removing that single row fromA
again usingWHERE
. (This is because of the order of SQL operations. We must ensureWHERE
“happens-before”LEFT JOIN
) - The
LEFT JOIN
betweenA
andB
needs anON
clause for syntactic reasons, even if we don’t really need that here. Just put something that is always true (likeTRUE
in PostgreSQL). - Our result now has an additional, useless column
A
, which might bother us, e.g. when usingSELECT *
Alternative: OUTER APPLY
If you’re using SQL Server or Oracle 12c, there’s an even more elegant solution using OUTER APPLY
:
SQL Server
SELECT actor_id, first_name, last_name, title FROM (SELECT 1 a) a OUTER APPLY ( SELECT a.actor_id, a.first_name, a.last_name, f.title FROM actor a LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id LEFT JOIN film f ON fa.film_id = f.film_id WHERE first_name = 'SUSANNE' ) b ORDER BY 1, 2, 3
Oracle 12c
SELECT actor_id, first_name, last_name, title FROM (SELECT 1 a FROM dual) a OUTER APPLY ( SELECT * FROM actor LEFT JOIN film_actor USING (actor_id) LEFT JOIN film USING (film_id) WHERE first_name = 'SUSANNE' ) b ORDER BY 1, 2, 3
While we don’t actually need the nice APPLY
feature here, it just allows us to omit the ON
clause and still have the LEFT OUTER
semantics. Neat ey?
Geek bonus
And if you really want to geek out on this functionality, consider using the dee table from the dum/dee PostgreSQL example. Remember, the dee
table is a table with exactly one row and no columns! This means we can use SELECT *
without getting this dummy row!
SELECT * -- This dummy table will always generate exactly one row FROM dee -- This is the original query, including the WHERE clause LEFT JOIN ( SELECT * FROM actor LEFT JOIN film_actor USING (actor_id) LEFT JOIN film USING (film_id) WHERE first_name = 'SUSANNE' ) b ON 1 = 1 ORDER BY 1, 2, 3
Ahh. Beautiful SQL!
Filed under: sql Tagged: 1..N Cardinality, APPLY, At least one row, LEFT JOIN, LEFT OUTER JOIN, OUTER APPLY, sql
