SQL DISTINCT is not a function

A very common misconception I often encounter with SQL users is the idea that DISTINCT is something like a function, and that it can take parenthesised arguments. Just recently, I’ve seen this Stack Overflow question where the OP was looking for a way to express this in jOOQ:

SELECT DISTINCT (emp.id), emp.fname, emp.name
FROM employee emp;

Notice the parentheses around (emp.id), which look as though this is some special kind of DISTINCT usage, which is akin to a DISTINCT function. The idea is often that:

The behaviour is somewhat different from omitting the parentheses
The performance is faster, because only the ID needs to be considered for distinctness

This is incorrect

These claims are incorrect, of course. There is no semantic or performance difference between the two. The parentheses are merely parentheses around a column expression, in a similar way as you would use parentheses to influence operator precedence. Think of it this way:

SELECT DISTINCT (emp.id + 1) * 2, emp.fname, emp.name
FROM employee emp;

In the above example, we do not apply a “DISTINCT function” to the expression emp.id + 1. We merely placed parentheses around a column expression emp.id + 1 to make sure the addition happens before the multiplication. The DISTINCT operator happens after the projection, always. If SQL had used a more logical syntax, rather than following English grammar (it was originally called Structured English QUEry Language, or SEQUEL), then we would write the OP’s statement like this:

FROM employee
SELECT id, fname, name
DISTINCT

Again, the DISTINCT operation always happens after the projection (SELECT clause content), and is applied to the entirety of the projection. There is no way in standard SQL to apply distinctness only to parts of the projection (there is in PostgreSQL, see further down).

To clarify this a bit more, I recommend reading our previous blog post about the logical order of operations in SQL, and how DISTINCT and ORDER BY are related.

What would it mean anyway?

We can revert the question back to the OP and ask ourselves, what would it mean for a DISTINCT operation to apply to only one column, anyway? Let’s assume this data set:

|id |fname|name|
|---|-----|----|
|1  |A    |A   |
|1  |B    |B   |

If we applied DISTINCT only to the ID column (and didn’t project anything else), clearly, we’d only get one row as a result:

SELECT DISTINCT id FROM employee

|id |
|---|
|1  |

But if we wanted to also project FNAME and NAME, which row would “win”? Would we display the first or the second row, or any random row? The behaviour would be undefined, and SQL doesn’t like undefined behaviour, so this is not possible. The only reasonable application of DISTINCT is always on the entire projection.

Exception: PostgreSQL

Fortunately (or to add more to syntactic confusion: unfortunately), PostgreSQL has implemented an extension to the SQL standard. With DISTINCT ON, it is effectively possible to apply distinctness only to parts of the projection:

WITH emp (id, fname, name) AS (
  VALUES (1, 'A', 'A'),
         (1, 'B', 'B')
)
SELECT DISTINCT ON (id) id, fname, name
FROM emp
ORDER BY id, fname, name

The output is now what the OP desired (but couldn’t use, because they were using MySQL):

|id         |fname|name |
|-----------|-----|-----|
|1          |A    |A    |

I personally don’t like DISTINCT ON. While it is very useful, no doubt, it makes something that is already very difficult to explain to SQL beginners even more complicated. With a “more reasonable” syntax, the query would be written like this:

FROM emp
SELECT id, fname, name
ORDER BY id, fname, name
DISTINCT ON (id)

With this syntactic order of operation, there would be no doubt about the semantics of DISTINCT or DISTINCT ON.

SQL DISTINCT is not a function

This is incorrect

What would it mean anyway?

Exception: PostgreSQL

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112