Why does MySQL allow "group by" queries WITHOUT aggregate functions?

2021-11-20 00:00:00 sql mysql standards-compliance ansi-sql

Surprise -- this is a perfectly valid query in MySQL:

select X, Y from someTable group by X

If you tried this query in Oracle or SQL Server, you’d get the natural error message:

Column 'Y' is invalid in the select list because it is not contained in 
either an aggregate function or the GROUP BY clause.

So how does MySQL determine which Y to show for each X? It just picks one. From what I can tell, it just picks the first Y it finds. The rationale seems to be: if Y is neither inside an aggregate function nor in the GROUP BY clause, then specifying "select Y" in your query makes no sense to begin with. Therefore, I as the database engine will return whatever I want, and you'll like it.
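
If the goal is one particular Y per X, that can be stated explicitly instead of relying on whichever row the server happens to pick. A minimal sketch, reusing someTable, X and Y from the query above. MIN is only one possible choice of aggregate, and ANY_VALUE() is a MySQL-specific function (available from 5.7.5), so the second query will not run on other engines:

```sql
-- Deterministic: always the smallest Y for each X.
SELECT X, MIN(Y) AS Y
FROM someTable
GROUP BY X;

-- MySQL 5.7.5+ only: explicitly accept an arbitrary Y for each X.
SELECT X, ANY_VALUE(Y) AS Y
FROM someTable
GROUP BY X;
```

The first form is accepted by any engine that enforces the GROUP BY rule. The second keeps the "pick any one" behaviour but makes it visible in the query text.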

There’s even a MySQL configuration parameter to turn off this "looseness". http://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by
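
A rough sketch of how that setting can be inspected and switched on for the current session. Note the assumption in the second statement: it replaces the whole mode list with this single mode, which is fine for a quick experiment but probably not what you want on a configured server. As far as I know, ONLY_FULL_GROUP_BY is part of the default sql_mode from MySQL 5.7.5 onward.

```sql
-- Show the SQL modes currently in effect for this session.
SELECT @@SESSION.sql_mode;

-- Enable strict GROUP BY checking for this session only.
-- Caution: this overwrites any other modes that were set.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';
```

With the mode enabled, the "select X, Y from someTable group by X" query from the top of the question should be rejected (error 1055) instead of returning an arbitrary Y.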

This article even mentions how MySQL has been criticized for being ANSI-SQL non-compliant in this regard. http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html

My question is: Why was MySQL designed this way? What was their rationale for breaking with ANSI-SQL?

Recommended answer

I believe that it was to handle the case where grouping by one field would imply other fields are also being grouped:

SELECT user.id, user.name, COUNT(post.owner_id) AS posts 
FROM user 
  LEFT OUTER JOIN post ON post.owner_id=user.id 
GROUP BY user.id

In this case the user.name will always be unique per user.id, so there is convenience in not requiring the user.name in the GROUP BY clause (although, as you say, there is definite scope for problems)
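
For comparison, a sketch of the strict form of the same query, with every non-aggregated column listed in GROUP BY. The table and column names are the ones from the example above. The schema is not shown in the original, so that user.id is the primary key of user is an assumption. If it is, then as far as I know MySQL 5.7.5 and later also accept the shorter GROUP BY user.id form under ONLY_FULL_GROUP_BY, because user.name is functionally dependent on the key.

```sql
-- Strict form: every selected non-aggregate column is grouped.
-- COUNT(post.owner_id) ignores the NULLs produced by the LEFT JOIN,
-- so users without any posts are reported with 0.
SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user
  LEFT OUTER JOIN post ON post.owner_id = user.id
GROUP BY user.id, user.name;
```

Since user.name is fixed for each user.id, adding it to the GROUP BY does not change the groups; it only spells out what the shorter form leaves implicit. Some other engines treat user as a reserved word, so the table name may need quoting there.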
