MySQL - 选择不在 Group By 中的列

2021-11-20 00:00:00 group-by mysql

我正在尝试向预先存在的应用程序添加功能,但我遇到了如下所示的 MySQL 视图:

I'm trying to add features to a preexisting application and I came across a MySQL view something like this:

SELECT
     AVG(table_name.col1),
     AVG(table_name.col2),
     AVG(table_name.col3),
     table_name.personID,
     table_name.col4
FROM table_name
GROUP BY table_name.personID;

好的,所以有一些聚合函数.您可以选择 personID,因为您是按它分组的.但它也选择了不在聚合函数中且不属于 GROUP BY 子句的列.这怎么可能???它是否只是选择一个随机值,因为每个组的值肯定不是唯一的?

OK so there's a few aggregate functions. You can select personID because you're grouping by it. But it also is selecting a column that is not in an aggregate function and is not a part of the GROUP BY clause. How is this possible??? Does it just pick a random value because the values definitely aren't unique per group?

我来自哪里(MSSQL Server),这是一个错误.有人可以向我解释这种行为以及为什么它在 MySQL 中是允许的吗?

Where I come from (MSSQL Server), that's an error. Can someone explain this behavior to me and why it's allowed in MySQL?

推荐答案

确实,此功能允许一些不明确的查询,并以静默方式返回具有从该列中选取的任意值的结果集.在实践中,它往往是首先物理存储的组内行的值.

It's true that this feature permits some ambiguous queries, and silently returns a result set with an arbitrary value picked from that column. In practice, it tends to be the value from the row within the group that is physically stored first.

如果您只选择在功能上依赖于 GROUP BY 条件中的列的列,这些查询就不会产生歧义.换句话说,如果定义组的每个值只能有一个模糊"列的不同值,则没有问题.此查询在 Microsoft SQL Server(和 ANSI SQL)中是非法的,即使它在逻辑上不会导致歧义:

These queries aren't ambiguous if you only choose columns that are functionally dependent on the column(s) in the GROUP BY criteria. In other words, if there can be only one distinct value of the "ambiguous" column per value that defines the group, there's no problem. This query would be illegal in Microsoft SQL Server (and ANSI SQL), even though it cannot logically result in ambiguity:

SELECT AVG(table1.col1), table1.personID, persons.col4
FROM table1 JOIN persons ON (table1.personID = persons.id)
GROUP BY table1.personID;

此外,MySQL 有一个 SQL 模式使其按照标准运行:ONLY_FULL_GROUP_BY

Also, MySQL has an SQL mode to make it behave per the standard: ONLY_FULL_GROUP_BY

FWIW,SQLite 也允许这些不明确的 GROUP BY 子句,但它从组中的最后行中选择值.

FWIW, SQLite also permits these ambiguous GROUP BY clauses, but it chooses the value from the last row in the group.

至少在我测试的版本中是这样.任意意味着 MySQL 或 SQLite 将来可能会改变它们的实现,并有一些不同的行为.因此,在这种模棱两可的情况下,您不应该依赖当前的行为.最好将您的查询重写为确定性而不是模棱两可.这就是 MySQL 5.7 现在默认启用 ONLY_FULL_GROUP_BY 的原因.

At least in the version I tested. What it means to be arbitrary is that either MySQL or SQLite could change their implementation in the future, and have some different behavior. You should therefore not rely on the behavior staying they way it is currently in ambiguous cases like this. It's better to rewrite your queries to be deterministic and not ambiguous. That's why MySQL 5.7 now enables ONLY_FULL_GROUP_BY by default.

相关文章