MySQL order by before group by

2021-11-20 00:00:00 group-by mysql sql-order-by

There are plenty of similar questions on here, but I don't think any of them answers the question adequately.

I'll continue from the current most popular question and use its example, if that's alright.

The task in this instance is to get the latest post for each author in the database.

The example query produces unusable results, as it's not always the latest post that is returned.

SELECT wp_posts.* FROM wp_posts
    WHERE wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
    GROUP BY wp_posts.post_author           
    ORDER BY wp_posts.post_date DESC

The current accepted answer is:

SELECT
    wp_posts.*
FROM wp_posts
WHERE
    wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
HAVING wp_posts.post_date = MAX(wp_posts.post_date) -- intended: ONLY THE LAST POST FOR EACH AUTHOR
ORDER BY wp_posts.post_date DESC

Unfortunately, this answer is plainly wrong and in many cases produces less stable results than the original query.
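
To see why the HAVING version misbehaves, it can help to look at the two values it compares. The sketch below assumes the ONLY_FULL_GROUP_BY SQL mode is disabled (otherwise MySQL rejects the bare post_date column); picked_post_date and latest_post_date are illustrative aliases, not part of the original queries.

```sql
-- Assumption: ONLY_FULL_GROUP_BY is disabled for this session.
SELECT
    wp_posts.post_author,
    wp_posts.post_date      AS picked_post_date,  -- value from whichever row MySQL chose for the group
    MAX(wp_posts.post_date) AS latest_post_date   -- actual latest date for the author
FROM wp_posts
WHERE
    wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author;
```

Whenever the two columns differ for an author, the HAVING condition would be false, so that author drops out of the result entirely rather than showing an older post.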

My best solution is to use a subquery of the form:

SELECT wp_posts.* FROM 
(
    SELECT * 
    FROM wp_posts
    ORDER BY wp_posts.post_date DESC
) AS wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author 

My question is a simple one: is there any way to order rows before grouping without resorting to a subquery?

Edit: This question was a continuation from another question and the specifics of my situation are slightly different. You can (and should) assume that there is also a wp_posts.id that is a unique identifier for that particular post.

Recommended answer

Using an ORDER BY in a subquery is not the best solution to this problem.

The best way to get the latest post for each author is to use a subquery that returns max(post_date) per author, and then join that back to your table on both post_author and the max date.

The solution should be:

SELECT p1.* 
FROM wp_posts p1
INNER JOIN
(
    SELECT max(post_date) MaxPostDate, post_author
    FROM wp_posts
    WHERE post_status='publish'
       AND post_type='post'
    GROUP BY post_author
) p2
  ON p1.post_author = p2.post_author
  AND p1.post_date = p2.MaxPostDate
WHERE p1.post_status='publish'
  AND p1.post_type='post'
order by p1.post_date desc

If you have the following sample data:

CREATE TABLE wp_posts
    (`id` int, `title` varchar(6), `post_date` datetime, `post_author` varchar(3))
;

INSERT INTO wp_posts
    (`id`, `title`, `post_date`, `post_author`)
VALUES
    (1, 'Title1', '2013-01-01 00:00:00', 'Jim'),
    (2, 'Title2', '2013-02-01 00:00:00', 'Jim')
;

The subquery is going to return the following max date and author:

MaxPostDate         | post_author
2013-02-01 00:00:00 | Jim

Then, since you are joining that back to the table on both values, you will get the full details of that post.

See SQL Fiddle with Demo.
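
As a rough sketch of that join against the sample table: the sample wp_posts has no post_status or post_type columns, so the version below leaves those filters out. That omission is an assumption made only to keep the example runnable on the sample data.

```sql
-- Simplified form of the solution query for the two-row sample table.
-- The post_status / post_type filters are omitted because the sample
-- table does not define those columns.
SELECT p1.*
FROM wp_posts p1
INNER JOIN
(
    -- One row per author: that author's latest post_date
    SELECT MAX(post_date) AS MaxPostDate, post_author
    FROM wp_posts
    GROUP BY post_author
) p2
  ON p1.post_author = p2.post_author
  AND p1.post_date = p2.MaxPostDate
ORDER BY p1.post_date DESC;
```

With the two sample rows, only the row with id 2 ('Title2') matches both join conditions, so that is the single row returned for Jim.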

To expand on my comments about using a subquery to accurately return this data:

MySQL does not force you to GROUP BY every column that you include in the SELECT list (at least when the ONLY_FULL_GROUP_BY SQL mode is not enabled). As a result, if you only GROUP BY one column but return 10 columns in total, there is no guarantee that the other column values come from the row you expect for the post_author that is returned. If a column is not in the GROUP BY, MySQL chooses which value to return.
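
Whether MySQL accepts such a query depends on the session's SQL mode. The ONLY_FULL_GROUP_BY mode, which is enabled by default as of MySQL 5.7.5, makes the server reject it. A minimal sketch for checking this against the sample table follows; note that the SET statement replaces any other modes in the session, which is a simplification for illustration.

```sql
-- Show the SQL modes active for the current session
SELECT @@SESSION.sql_mode;

-- Assumption: for this experiment it is acceptable to replace the
-- session's mode list with just ONLY_FULL_GROUP_BY.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- title is neither in the GROUP BY nor wrapped in an aggregate, so with
-- the mode enabled MySQL is expected to reject this statement instead of
-- picking an arbitrary title per author.
SELECT post_author, title
FROM wp_posts
GROUP BY post_author;
```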

Using the subquery with the aggregate function will guarantee that the correct author and post are returned every time.
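
One caveat worth sketching: if an author has two posts with exactly the same post_date, the join on the max date returns both rows. Assuming wp_posts.id is unique, as the question's edit states, a second aggregation can break the tie by taking the highest id. The aliases d, t, and MaxId are illustrative, and choosing the highest id is an arbitrary tie-break rule.

```sql
SELECT p1.*
FROM wp_posts p1
INNER JOIN
(
    -- Among each author's posts that share the latest date, keep one id
    SELECT MAX(p.id) AS MaxId
    FROM wp_posts p
    INNER JOIN
    (
        -- Latest published post date per author
        SELECT post_author, MAX(post_date) AS MaxPostDate
        FROM wp_posts
        WHERE post_status='publish'
          AND post_type='post'
        GROUP BY post_author
    ) d
      ON p.post_author = d.post_author
      AND p.post_date = d.MaxPostDate
    WHERE p.post_status='publish'
      AND p.post_type='post'
    GROUP BY p.post_author
) t
  ON p1.id = t.MaxId
ORDER BY p1.post_date DESC;
```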

As a side note, while MySQL allows you to use an ORDER BY in a subquery and allows you to apply a GROUP BY that does not cover every column in the SELECT list, this behavior is not allowed in other databases, including SQL Server.
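
For a form that relies on neither of those MySQL extensions, one option is a window function. This is a different technique from the join above, not part of the original answer, and it assumes MySQL 8.0 or later (or another database with ROW_NUMBER() support, such as SQL Server). The names ranked and rn are illustrative.

```sql
SELECT ranked.*          -- note: the result also includes the helper column rn
FROM
(
    SELECT p.*,
           -- Number each author's posts from newest to oldest;
           -- id breaks ties between posts with the same date
           ROW_NUMBER() OVER (
               PARTITION BY p.post_author
               ORDER BY p.post_date DESC, p.id DESC
           ) AS rn
    FROM wp_posts p
    WHERE p.post_status = 'publish'
      AND p.post_type = 'post'
) ranked
WHERE ranked.rn = 1      -- keep only the newest post per author
ORDER BY ranked.post_date DESC;
```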
