Using ORDER BY and GROUP BY together

2021-11-20 00:00:00 sql group-by mysql greatest-n-per-group

My table looks like this (and I'm using MySQL):

m_id | v_id | timestamp
------------------------
6    |   1  | 1333635317
34   |   1  | 1333635323
34   |   1  | 1333635336
6    |   1  | 1333635343
6    |   1  | 1333635349

My goal is to return each m_id once, ordered by its highest timestamp.

The result should be:

m_id | v_id | timestamp
------------------------
6    |   1  | 1333635349
34   |   1  | 1333635336

I wrote this query:

SELECT * FROM table GROUP BY m_id ORDER BY timestamp DESC

However, the result is:

m_id | v_id | timestamp
------------------------
34   |   1  | 1333635323
6    |   1  | 1333635317

I think this happens because it performs the GROUP BY first and only then orders the results.

Any ideas? Thank you.

Recommended answer

One way to do this that correctly uses group by:

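-- "table" is the placeholder name from the question; TABLE is a reserved
-- word in MySQL, so substitute the real table name (or quote it with backticks).
select l.*
from table l
inner join (
  -- latest timestamp for each m_id
  select
    m_id, max(timestamp) as latest
  from table
  group by m_id
) r
  on l.timestamp = r.latest and l.m_id = r.m_id
order by l.timestamp desc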

How this works:

  • the subquery selects the latest timestamp for each distinct m_id
  • the outer query selects only the rows from table that match a row from the subquery (a join in which no columns are selected from the second table, so that it acts purely as a filter, is known as a "semijoin", in case you were curious)
  • the matching rows are ordered by timestamp, newest first
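
To see the three steps above on the sample data from the question, here is a minimal sketch in Python. It makes two assumptions that are not in the original post: an in-memory SQLite database stands in for MySQL, and the table is given the hypothetical name messages, since TABLE is a reserved word. The query itself is the same join as in the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `messages` is a
# hypothetical table name used in place of the question's placeholder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (m_id INTEGER, v_id INTEGER, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (6, 1, 1333635317),
        (34, 1, 1333635323),
        (34, 1, 1333635336),
        (6, 1, 1333635343),
        (6, 1, 1333635349),
    ],
)

query = """
SELECT l.*
FROM messages AS l
INNER JOIN (
    -- step 1: latest timestamp per m_id
    SELECT m_id, MAX(timestamp) AS latest
    FROM messages
    GROUP BY m_id
) AS r
    -- step 2: keep only the rows that carry that latest timestamp
    ON l.m_id = r.m_id AND l.timestamp = r.latest
-- step 3: newest first
ORDER BY l.timestamp DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (6, 1, 1333635349)
# (34, 1, 1333635336)
```

One thing to keep in mind with this approach: if two rows for the same m_id share the same maximum timestamp, the join matches both, so that m_id would appear twice in the result.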
