MySQL order by before group by

2021-11-20 00:00:00 group-by mysql wordpress sql-order-by

I need to find the latest post for each author and then group the results so that I get only a single latest post per author.

SELECT wp_posts.* FROM wp_posts
        WHERE wp_posts.post_status='publish'
        AND wp_posts.post_type='post'
        GROUP BY wp_posts.post_author           
        ORDER BY wp_posts.post_date DESC

This groups the output correctly, so I get only one post per author, but the results are ordered after they have been grouped rather than before they are selected.

Recommended answer

select wp_posts.* from wp_posts
where wp_posts.post_status='publish' and wp_posts.post_type='post'
group by wp_posts.post_author
having wp_posts.post_date = MAX(wp_posts.post_date) /* only the last post for each author */
order by wp_posts.post_date desc

After some comments, I have decided to add some additional information.

The company I work for also uses Postgres and, above all, SQL Server. These databases do not allow such queries, so I know there is another way to do this (I give a solution below). You should also know what you are doing if you do not group by every column in the projection or wrap it in an aggregate function. Otherwise, leave it alone!

I chose the solution above because this is a specific question. Tom wants to get the most recent post for each author on a WordPress site. In my view, it is negligible for this analysis whether an author publishes more than one post per second; WordPress should even prevent that through its duplicate-post spam detection. I know from personal experience that such a dirty GROUP BY brings a really significant performance benefit in MySQL. If you know what you are doing, you can do it! I have such dirty GROUP BYs in applications I am professionally responsible for. There I have tables with several million rows where the queries take 5-15 seconds instead of 100+ seconds.

This may be useful on some of the pros and cons: http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/debunking-group-by-myths.html

SELECT
    wp_posts.*
FROM 
    wp_posts
    JOIN 
    (
        SELECT
            g.post_author,
            MAX(g.post_date) AS post_date
        FROM wp_posts as g
        WHERE
            g.post_status='publish'
            AND g.post_type='post'
        GROUP BY g.post_author
    ) as t 
    ON wp_posts.post_author = t.post_author AND wp_posts.post_date = t.post_date

ORDER BY wp_posts.post_date

But if an author has more than one post in the same second, you will get more than one row for that author instead of only the last one.
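
A minimal sketch of that tie behavior, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained and runnable). The table is reduced to the columns the join query touches, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE wp_posts (
        ID          INTEGER PRIMARY KEY,
        post_author INTEGER,
        post_date   TEXT,
        post_status TEXT,
        post_type   TEXT
    )
    """
)
# Hypothetical sample data: every row is a published post.
conn.executemany(
    "INSERT INTO wp_posts VALUES (?, ?, ?, 'publish', 'post')",
    [
        (1, 10, "2021-01-01 09:00:00"),
        (2, 10, "2021-03-01 09:00:00"),  # author 10: one clear latest post
        (3, 20, "2021-02-01 12:00:00"),
        (4, 20, "2021-02-01 12:00:00"),  # author 20: two posts in the same second
    ],
)

# Join each post against the per-author maximum post_date.
LATEST_PER_AUTHOR = """
    SELECT wp_posts.ID, wp_posts.post_author, wp_posts.post_date
    FROM wp_posts
    JOIN (
        SELECT g.post_author, MAX(g.post_date) AS post_date
        FROM wp_posts AS g
        WHERE g.post_status = 'publish' AND g.post_type = 'post'
        GROUP BY g.post_author
    ) AS t
      ON wp_posts.post_author = t.post_author
     AND wp_posts.post_date = t.post_date
    ORDER BY wp_posts.post_date, wp_posts.ID
"""

rows = conn.execute(LATEST_PER_AUTHOR).fetchall()
for row in rows:
    print(row)
```

With this data, author 10 comes back once (ID 2), while author 20 comes back twice (IDs 3 and 4), because both rows match the maximum post_date.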

Now you can spin the wheel again and take the post with the highest ID. Even then, it is not guaranteed that you really get the last one.
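
One possible way to write that second pass is sketched here, again with Python's sqlite3 module standing in for MySQL (an assumption for runnability). The inner join finds the rows at each author's maximum post_date, a second GROUP BY keeps the highest ID among them, and the outer join fetches that single row. The aliases `p`, `w`, `g`, `t`, and `latest` and the sample rows are invented for illustration. As noted above, a higher ID does not prove a later post, so this only makes the result deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE wp_posts (
        ID          INTEGER PRIMARY KEY,
        post_author INTEGER,
        post_date   TEXT,
        post_status TEXT,
        post_type   TEXT
    )
    """
)
# Hypothetical sample data; author 20 has two posts with identical timestamps.
db.executemany(
    "INSERT INTO wp_posts VALUES (?, ?, ?, 'publish', 'post')",
    [
        (1, 10, "2021-01-01 09:00:00"),
        (2, 10, "2021-03-01 09:00:00"),
        (3, 20, "2021-02-01 12:00:00"),
        (4, 20, "2021-02-01 12:00:00"),
    ],
)

# Step 1 (alias t): latest post_date per author.
# Step 2 (alias latest): among rows at that date, keep the highest ID per author.
# Step 3 (alias p): fetch the winning row for each author.
LATEST_WITH_TIEBREAK = """
    SELECT p.ID, p.post_author, p.post_date
    FROM wp_posts AS p
    JOIN (
        SELECT w.post_author, MAX(w.ID) AS ID
        FROM wp_posts AS w
        JOIN (
            SELECT g.post_author, MAX(g.post_date) AS post_date
            FROM wp_posts AS g
            WHERE g.post_status = 'publish' AND g.post_type = 'post'
            GROUP BY g.post_author
        ) AS t
          ON w.post_author = t.post_author
         AND w.post_date = t.post_date
        WHERE w.post_status = 'publish' AND w.post_type = 'post'
        GROUP BY w.post_author
    ) AS latest
      ON p.ID = latest.ID
    ORDER BY p.post_date DESC
"""

latest_rows = db.execute(LATEST_WITH_TIEBREAK).fetchall()
for row in latest_rows:
    print(row)
```

Each author now appears exactly once; for author 20 the row with ID 4 is kept because it has the higher ID of the two tied posts.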
