MySQL… We’ve blogged about MySQL before. Many times. We’ve shown bad ideas implemented in MySQL more than once.
But this beats everything. Check out this Stack Overflow question. It reads: “Why Oracle does not support ‘group by 1,2,3’?”. At first, I thought this user might have been confused because SQL allows for referencing columns by (1-based!) index in ORDER BY clauses:
SELECT first_name, last_name FROM customers ORDER BY 1, 2
The above is equivalent to ORDER BY first_name, last_name. The indexes 1, 2 refer to columns from the projection. This might be useful every now and then to avoid repeating complex column expressions, although it is a bit risky: adding a column to the SELECT clause can silently change the ordering semantics.
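To make the risk concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers table and its rows are made up for illustration, and SQLite is only a convenient stand-in: it resolves integer ORDER BY terms as projection positions, just as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for this illustration
conn.execute("CREATE TABLE customers (id INTEGER, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(3, "Alice", "Zimmer"), (1, "Bob", "Young"), (2, "Alice", "Adams")],
)

# Positional and named ordering agree for this projection
by_position = conn.execute(
    "SELECT first_name, last_name FROM customers ORDER BY 1, 2"
).fetchall()
by_name = conn.execute(
    "SELECT first_name, last_name FROM customers ORDER BY first_name, last_name"
).fetchall()

# Prepending a column shifts the meaning of the indexes:
# ORDER BY 1, 2 now sorts by id, first_name
shifted = conn.execute(
    "SELECT id, first_name, last_name FROM customers ORDER BY 1, 2"
).fetchall()

print(by_position)
print(shifted)
```

The ORDER BY clause text is identical in the first and the last query, yet the rows come back in a different order once id is added in front.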
But this user wanted to use the same syntax for the GROUP BY clause. And this actually works in MySQL! Check out the following query:
SELECT a, b FROM ( SELECT 'a' a, 'b' b, 'c' c UNION ALL SELECT 'a' , 'c' , 'c' UNION ALL SELECT 'a' , 'b' , 'd' ) t GROUP BY 1, 2 ORDER BY 2
The above yields…
| A | B |
|---|---|
| a | b |
| a | c |
But what would this even mean? According to our in-depth explanation of SQL clauses, the projection (SELECT clause) is logically evaluated after the GROUP BY clause. In other words, the columns defined in the SELECT clause are not yet in scope of the GROUP BY clause. Hence, the only reasonable semantics of column indexes would be the index from the table source t. But that’s not the case. Check out this alternative query:
SELECT b, c FROM ( SELECT 'a' a, 'b' b, 'c' c UNION ALL SELECT 'a' , 'c' , 'c' UNION ALL SELECT 'a' , 'b' , 'd' ) t GROUP BY 1, 2 ORDER BY 2
This now yields:
| B | C |
|---|---|
| b | c |
| c | c |
| b | d |
And it’s (probably?) the expected behaviour in MySQL as the documentation states:
Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions. Column positions are integers and begin with 1
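The following sketch reproduces the experiment with Python's built-in sqlite3 module. SQLite is used only because it ships with Python; it appears to resolve integer GROUP BY terms as positions in the SELECT list much like the MySQL documentation describes, but that is an assumption of this sketch and not a statement about MySQL itself. The helper name grouped and the extra ORDER BY tie-breaker are ours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The derived table t with columns a, b, c used in the queries above
DERIVED_TABLE = """
    SELECT 'a' a, 'b' b, 'c' c
    UNION ALL SELECT 'a', 'c', 'c'
    UNION ALL SELECT 'a', 'b', 'd'
"""

def grouped(select_list):
    """Project select_list from t and group by positions 1 and 2.

    ORDER BY 2, 1 adds a tie-breaker that the original queries lack,
    so that the row order is deterministic.
    """
    sql = (
        f"SELECT {select_list} FROM ({DERIVED_TABLE}) t "
        "GROUP BY 1, 2 ORDER BY 2, 1"
    )
    return conn.execute(sql).fetchall()

rows_ab = grouped("a, b")
rows_bc = grouped("b, c")

print(rows_ab)
print(rows_bc)
```

If the indexes referred to the columns of t, both calls would group by (a, b) and return two rows each. The second call returns three rows, so the positions follow the projection.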
The documentation actually treats GROUP BY in a very similar fashion as ORDER BY. For instance, it is possible to specify the ordering direction using GROUP BY only:
MySQL extends the GROUP BY clause so that you can also specify ASC and DESC after columns named in the clause:
SELECT a, COUNT(b) FROM test_table GROUP BY a DESC;
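For comparison, a portable spelling of the same intent keeps grouping and ordering apart. The sketch below again uses Python's sqlite3 module with a made-up test_table; it assumes that GROUP BY a DESC is meant to return the groups sorted by a in descending order. To our knowledge, MySQL 8.0 has since dropped the ASC/DESC modifiers on GROUP BY, so the explicit ORDER BY is the safer form there as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical contents for the test_table from the documentation example
conn.execute("CREATE TABLE test_table (a TEXT, b INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?)",
    [("x", 1), ("y", 2), ("x", 3), ("z", None)],
)

# Grouping and ordering are stated separately instead of GROUP BY a DESC
rows = conn.execute(
    "SELECT a, COUNT(b) FROM test_table GROUP BY a ORDER BY a DESC"
).fetchall()

print(rows)
```

COUNT(b) skips NULL values, which is why the group z is reported with a count of 0.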
While we cannot figure out a reasonable edge-case that breaks this feature, we still think that there is something fishy about it. Given that the SELECT clause is logically evaluated after the table source (FROM, WHERE, GROUP BY, HAVING), allowing the GROUP BY clause to reference it seems to lead to a weird understanding of SQL.
On the other hand, SQL is a very verbose language with little support for declaring reusable objects, short of common table expressions and the WINDOW clause. It is actually a bit surprising that the SQL standards folks would support this WINDOW clause to declare reusable window frames before introducing much more usable “common column expressions”, e.g.:
-- Common column/table expressions:
WITH x AS CASE t1.a WHEN 1 THEN 'a' WHEN 2 THEN 'b' ELSE 'c' END,
     y AS SOME_FUNCTION(t2.a, t2.b)
SELECT x, NVL(y, x)
FROM t1 JOIN t2 ON t1.id = t2.id
GROUP BY x, y
ORDER BY x DESC, y ASC
With common column expressions, reusing column expressions is independent of the SELECT clause itself. In other words, you can reuse column expressions in JOIN clauses, WHERE clauses, GROUP BY clauses, HAVING clauses, etc. without having to actually SELECT them.
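Since the syntax above is hypothetical, the closest thing available today is to name the expression once in a derived table and refer to that name in the outer query. Here is a minimal sketch with Python's sqlite3 module; the table t1, its rows, and the alias u are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for this illustration
conn.execute("CREATE TABLE t1 (id INTEGER, a INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 1), (2, 2), (3, 3), (4, 1)])

# The CASE expression is written once, as x, inside the derived table u.
# The outer query reuses x in WHERE, GROUP BY and ORDER BY by name.
sql = """
    SELECT x, COUNT(*)
    FROM (
        SELECT CASE a WHEN 1 THEN 'a' WHEN 2 THEN 'b' ELSE 'c' END AS x
        FROM t1
    ) u
    WHERE x <> 'b'
    GROUP BY x
    ORDER BY x DESC
"""
rows = conn.execute(sql).fetchall()

print(rows)
```

This avoids repeating the CASE expression, at the price of an extra nesting level, which is exactly the verbosity a common column expression would remove.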
So, to be fair to MySQL, while this feature is a non-feature in its current form, it provides a workaround for SQL’s verbosity.
