MySQL Bad Idea #666

MySQL… We’ve blogged about MySQL before. Many times. We’ve shown bad ideas implemented in MySQL in several earlier posts.

But this beats everything. Check out this Stack Overflow question. It reads: “Why Oracle does not support ‘group by 1,2,3’?”. At first, I thought this user might have been confused because SQL allows for referencing columns by (1-based!) index in ORDER BY clauses:

SELECT first_name, last_name
FROM customers
ORDER BY 1, 2

The above is equivalent to ORDER BY first_name, last_name. The indexes 1, 2 refer to columns from the projection. This might be useful every now and then to avoid repeating complex column expressions, although it is probably a bit risky as you can change ordering semantics when adding a column to the SELECT clause.
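To illustrate that risk, here is a minimal sketch. The inline rows and the `id` column are made up for this example and stand in for a real customers table.

```sql
-- Position 1 is first_name, position 2 is last_name.
-- Sorts as: (Alice, Smith), (Bob, Jones)
SELECT first_name, last_name
FROM (
  SELECT 2 AS id, 'Alice' AS first_name, 'Smith' AS last_name UNION ALL
  SELECT 1      , 'Bob'                , 'Jones'
) customers
ORDER BY 1, 2;

-- After prepending id to the projection, the unchanged ORDER BY 1, 2
-- silently means ORDER BY id, first_name.
-- Sorts as: (1, Bob, Jones), (2, Alice, Smith)
SELECT id, first_name, last_name
FROM (
  SELECT 2 AS id, 'Alice' AS first_name, 'Smith' AS last_name UNION ALL
  SELECT 1      , 'Bob'                , 'Jones'
) customers
ORDER BY 1, 2;
```

Nothing in the ORDER BY clause changed between the two queries, yet the sort order did.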

But this user wanted to use the same syntax for the GROUP BY clause. And this actually works in MySQL! Check out the following query:

SELECT a, b
FROM (
  SELECT 'a' a, 'b' b, 'c' c UNION ALL
  SELECT 'a'  , 'c'  , 'c'   UNION ALL
  SELECT 'a'  , 'b'  , 'd'
) t
GROUP BY 1, 2
ORDER BY 2

See an example on SQLFiddle

The above yields…

| A | B |
|---|---|
| a | b |
| a | c |

But what would this even mean? According to our in-depth explanation of SQL clauses, the projection (SELECT clause) is logically evaluated after the GROUP BY clause. In other words, the columns defined in the SELECT clause are not yet in scope of the GROUP BY clause. Hence, the only reasonable semantics of column indexes would be the index from the table source t. But that’s not the case. Check out this alternative query:

SELECT b, c
FROM (
  SELECT 'a' a, 'b' b, 'c' c UNION ALL
  SELECT 'a'  , 'c'  , 'c'   UNION ALL
  SELECT 'a'  , 'b'  , 'd'
) t
GROUP BY 1, 2
ORDER BY 2

See an example on SQLFiddle

This now yields:

| B | C |
|---|---|
| b | c |
| c | c |
| b | d |
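To make the dependence on the projection explicit, here is a sketch of the name-based equivalent of the ordinal query that yields the B, C result above. It assumes the projection is b, c, as the result headers suggest.

```sql
-- With SELECT b, c, the positions in GROUP BY 1, 2 resolve to b and c,
-- and ORDER BY 2 resolves to c. Written out with names:
SELECT b, c
FROM (
  SELECT 'a' a, 'b' b, 'c' c UNION ALL
  SELECT 'a'  , 'c'  , 'c'   UNION ALL
  SELECT 'a'  , 'b'  , 'd'
) t
GROUP BY b, c
ORDER BY c;
```

Unlike the ordinal form, this version keeps its meaning if someone later reorders or extends the SELECT list.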

And it’s (probably?) the expected behaviour in MySQL as the documentation states:

Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions. Column positions are integers and begin with 1

The documentation actually treats GROUP BY in much the same way as ORDER BY. For instance, it is possible to specify the ordering direction using GROUP BY only:

MySQL extends the GROUP BY clause so that you can also specify ASC and DESC after columns named in the clause:

SELECT a, COUNT(b) 
FROM test_table GROUP BY a DESC;
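For comparison, a sketch of how the same intent can be expressed without relying on the MySQL extension, by keeping grouping and ordering in separate clauses. The inline rows are invented for illustration and stand in for test_table.

```sql
-- Grouping and ordering are stated separately, so the query does not
-- depend on GROUP BY accepting ASC or DESC.
-- Sorts as: (2, 2), (1, 1)
SELECT a, COUNT(b)
FROM (
  SELECT 1 AS a, 'x' AS b UNION ALL
  SELECT 2     , 'y'      UNION ALL
  SELECT 2     , 'z'
) test_table
GROUP BY a
ORDER BY a DESC;
```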

While we cannot figure out a reasonable edge-case that breaks this feature, we still think that there is something fishy about it. Given that the SELECT clause is logically evaluated after the table source (FROM, WHERE, GROUP BY, HAVING), allowing the GROUP BY clause to reference it seems to encourage a weird understanding of SQL.

On the other hand, SQL is a very verbose language with little support for declaring reusable objects, short of common table expressions and the WINDOW clause. It is actually a bit surprising that the SQL standards folks would support this WINDOW clause to declare reusable window frames before introducing much more usable “common column expressions”, e.g.:

-- Common column/table expressions:
WITH x AS CASE t1.a 
          WHEN 1 THEN 'a'
          WHEN 2 THEN 'b'
                 ELSE 'c'
          END, 
     y AS SOME_FUNCTION(t2.a, t2.b)
SELECT x, NVL(y, x)
FROM t1 JOIN t2 ON t1.id = t2.id
GROUP BY x, y
ORDER BY x DESC, y ASC

With common column expressions, reusing column expressions is independent of the SELECT clause itself. In other words, you can reuse column expressions in JOIN clauses, WHERE clauses, GROUP BY clauses, HAVING clauses, etc. without having to actually SELECT them.
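Until something like that exists, the closest approximation we know of is to name the expressions once in a derived table and reuse the names in the outer query. The following is only a sketch under assumptions: the inline rows are made up, UPPER stands in for the hypothetical SOME_FUNCTION, and COALESCE replaces the Oracle-specific NVL.

```sql
SELECT x, COALESCE(y, x) AS y_or_x
FROM (
  -- Name each column expression exactly once
  SELECT CASE t1.a
           WHEN 1 THEN 'a'
           WHEN 2 THEN 'b'
                  ELSE 'c'
         END AS x,
         UPPER(t2.b) AS y  -- placeholder for SOME_FUNCTION(t2.a, t2.b)
  FROM (
    SELECT 1 AS id, 1 AS a UNION ALL
    SELECT 2      , 3
  ) t1
  JOIN (
    SELECT 1 AS id, 'p' AS b UNION ALL
    SELECT 2      , NULL
  ) t2 ON t1.id = t2.id
) t
-- Reuse the names without repeating the expressions
GROUP BY x, y
ORDER BY x DESC, y ASC;
```

The price is an extra nesting level, and the names are still not available in the inner JOIN or WHERE clauses, which is exactly the gap that common column expressions would close.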

So, to be fair to MySQL, while this feature is a non-feature in its current form, it provides a workaround for SQL’s verbosity.


Filed under: sql Tagged: common column expressions, Common Table Expressions, mysql, MySQL bad idea, sql, sql standard, sql syntax