Which row's fields are returned when grouping with MySQL?

2021-12-27 00:00:00 group-by mysql sql-order-by

I have a MySQL table with the fields id and string. The ids are unique. The strings are varchars and are non-unique.

I perform the following query:

SELECT id, string, COUNT( * ) AS frequency
FROM table
GROUP BY string
ORDER BY frequency DESC, id ASC

Question

Assume the table contains three rows with identical string values and ids 1, 2, and 3.

  1. Which id is going to be returned (1, 2, or 3)?
  2. Which id is this query going to ORDER BY (the same one that is returned? ... see question 1)?
  3. Can you control which id is returned / used for ordering? E.g. return the largest id, or the first id from a GROUP.

What I'm ultimately trying to do is get the frequency of occurrence of identical strings and order by that frequency, highest to lowest. On a frequency tie, I want to order by id, with the smallest id from each group being the one returned and used for ordering. I made the situation more generic to figure out how MySQL handles it.

Recommended answer

Which id is going to be returned (1, 2, or 3)?

A: For all the records that have the same string, the server will choose whichever id it wants (most likely the fastest one to fetch, which is unpredictable). To cite the official documentation:

The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

More information in this link.

Which id is this query going to ORDER BY (the same one that is returned? ... see question 1)?

There is little point in working out the sort order, because you cannot predict which id you are going to get. Sorting happens after the server has picked a value for each group, so the result is most likely sorted by that same unpredictable id.

Can you control which id is returned / used for ordering? E.g. return the largest id, or the first id from a GROUP.

With a plain, non-aggregated id column you should assume at this point that you can't. Read the documentation again.
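What you can control is an aggregate over the group. Here is a minimal sketch of the asker's stated goal (frequency descending, ties broken by the smallest id in each group) using MIN(id) instead of the bare id. To keep it self-contained it runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL. The table name items and the sample rows are assumptions made for illustration. The query itself is standard SQL, so it should behave the same way in MySQL.

```python
import sqlite3

# SQLite in memory stands in for MySQL; `items` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, string VARCHAR(50))")
conn.executemany(
    "INSERT INTO items (id, string) VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "a"), (4, "b"), (5, "b"), (6, "c"), (7, "c"), (8, "c")],
)

# MIN(id) is an aggregate, so its value is well defined for every group,
# unlike a bare `id` that is neither grouped nor aggregated.
query = """
    SELECT MIN(id) AS first_id, string, COUNT(*) AS frequency
    FROM items
    GROUP BY string
    ORDER BY frequency DESC, first_id ASC
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 'a', 3), (6, 'c', 3), (4, 'b', 2)]
```

Swapping MIN(id) for MAX(id) would return and sort by the largest id of each group instead.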

To make things even clearer: you can't predict the result of an improperly used GROUP BY clause. The main issue with MySQL is that it allows you to use GROUP BY in a non-standard way, but you need to know how to make use of that feature. The point of it is to group by fields whose values you know determine the other selected fields. For example:

SELECT id, name, COUNT( * ) AS frequency
FROM table
GROUP BY id

Here, you know name has a single value within each group, because id functionally determines name. So you know the result is valid. If you also grouped by name, this query would be more standard, but it would perform slightly worse in MySQL.
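For comparison, the more standard form lists every selected non-aggregated column in GROUP BY. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL. The table name people and its rows are made up for the example. Because id is unique, each group holds exactly one row and name is fully determined.

```python
import sqlite3

# SQLite in memory stands in for MySQL; `people` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name VARCHAR(50))")
conn.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "ann")],
)

# Every selected column is either grouped (id, name) or aggregated (COUNT),
# so no value in the result is left for the server to pick arbitrarily.
standard_query = """
    SELECT id, name, COUNT(*) AS frequency
    FROM people
    GROUP BY id, name
    ORDER BY id
"""
grouped = conn.execute(standard_query).fetchall()
print(grouped)  # [(1, 'ann', 1), (2, 'bob', 1), (3, 'ann', 1)]
```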

As a final note, in my experience the values returned by those non-standard queries for the selected, non-grouped fields are usually the ones you would get by applying the GROUP BY and then an ORDER BY on that field. That is why it so often seems to work. However, if you keep testing you will eventually find that this happens only about 95% of the time, and you cannot rely on that number.
