MySQL SELECT 按组最频繁

如何获取 MySQL 中每个标签最常出现的类别?理想情况下,我想模拟一个聚合函数来计算 mode列.

How do I get the most frequently occurring category for each tag in MySQL? Ideally, I would want to simulate an aggregate function that would calculate the mode of a column.

SELECT 
  t.tag 
  , s.category 
FROM tags t 
LEFT JOIN stuff s 
USING (id) 
ORDER BY tag;

+------------------+----------+
| tag              | category |
+------------------+----------+
| automotive       |        8 |
| ba               |        8 |
| bamboo           |        8 |
| bamboo           |        8 |
| bamboo           |        8 |
| bamboo           |        8 |
| bamboo           |        8 |
| bamboo           |       10 |
| bamboo           |        8 |
| bamboo           |        9 |
| bamboo           |        8 |
| bamboo           |       10 |
| bamboo           |        8 |
| bamboo           |        9 |
| bamboo           |        8 |
| banana tree      |        8 |
| banana tree      |        8 |
| banana tree      |        8 |
| banana tree      |        8 |
| bath             |        9 |
+-----------------------------+

推荐答案

SELECT t1.*
FROM (SELECT tag, category, COUNT(*) AS count
      FROM tags INNER JOIN stuff USING (id)
      GROUP BY tag, category) t1
LEFT OUTER JOIN 
     (SELECT tag, category, COUNT(*) AS count
      FROM tags INNER JOIN stuff USING (id)
      GROUP BY tag, category) t2
  ON (t1.tag = t2.tag AND (t1.count < t2.count 
      OR t1.count = t2.count AND t1.category < t2.category))
WHERE t2.tag IS NULL
ORDER BY t1.count DESC;

<小时>

我同意这对于单个 SQL 查询来说太过分了.任何在子查询中使用 GROUP BY 都会让我畏缩.您可以使用视图使其看起来更简单:


I agree this is kind of too much for a single SQL query. Any use of GROUP BY inside a subquery makes me wince. You can make it look simpler by using views:

CREATE VIEW count_per_category AS
    SELECT tag, category, COUNT(*) AS count
    FROM tags INNER JOIN stuff USING (id)
    GROUP BY tag, category;

SELECT t1.*
FROM count_per_category t1
LEFT OUTER JOIN count_per_category t2
  ON (t1.tag = t2.tag AND (t1.count < t2.count 
      OR t1.count = t2.count AND t1.category < t2.category))
WHERE t2.tag IS NULL
ORDER BY t1.count DESC;

但它基本上在幕后做同样的工作.

But it's basically doing the same work behind the scenes.

您评论说您可以在应用程序代码中轻松执行类似的操作.那你为什么不这样做呢?执行更简单的查询以获取每个类别的计数:

You comment that you could do a similar operation easily in application code. So why don't you do that? Do the simpler query to get the counts per category:

SELECT tag, category, COUNT(*) AS count
FROM tags INNER JOIN stuff USING (id)
GROUP BY tag, category;

并在应用程序代码中对结果进行排序.

And sort through the result in application code.

相关文章