SQL GROUP BY versus DISTINCT

Why would someone use GROUP BY rather than DISTINCT when the query does no aggregation?

Also, does anyone know the performance considerations for GROUP BY versus DISTINCT in MySQL and SQL Server? I'm guessing that SQL Server has the better optimizer and that the two might be close to equivalent there, but in MySQL I expect a significant performance advantage for DISTINCT.

I'm interested in DBA answers.

Bill's post is interesting, but not applicable. Let me be more specific...

select a, b, c
from table x
group by a, b, c

versus

select distinct a, b, c
from table x

Recommended answer

A little (VERY little) empirical data from MS SQL Server, on a couple of random tables from our DB.

For the pattern:

SELECT col1, col2 FROM table GROUP BY col1, col2

SELECT DISTINCT col1, col2 FROM table 

When there was no covering index for the query, both forms produced the following query plan:

|--Sort(DISTINCT ORDER BY:([table].[col1] ASC, [table].[col2] ASC))
   |--Clustered Index Scan(OBJECT:([db].[dbo].[table].[IX_some_index]))

When there was a covering index, both produced:

|--Stream Aggregate(GROUP BY:([table].[col1], [table].[col2]))
   |--Index Scan(OBJECT:([db].[dbo].[table].[IX_some_index]), ORDERED FORWARD)

So, from that very small sample, SQL Server certainly treats both the same.
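
As a quick sanity check that the two forms return the same rows, here is a minimal sketch. It is not part of the measurements above: it uses Python's built-in sqlite3 module rather than SQL Server or MySQL, and the table x, its columns and the sample data are made up for illustration. It checks result equivalence only and says nothing about either engine's plans or performance.

```python
import sqlite3


def distinct_vs_group_by(rows):
    """Run both query forms over the same data and return their sorted results."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table standing in for "table x" from the question.
        conn.execute("CREATE TABLE x (a INTEGER, b INTEGER, c TEXT)")
        conn.executemany("INSERT INTO x (a, b, c) VALUES (?, ?, ?)", rows)
        grouped = conn.execute(
            "SELECT a, b, c FROM x GROUP BY a, b, c"
        ).fetchall()
        distinct = conn.execute("SELECT DISTINCT a, b, c FROM x").fetchall()
    finally:
        conn.close()
    # Neither query guarantees an order, so sort before comparing.
    return sorted(grouped), sorted(distinct)


# Made-up sample data containing duplicate rows.
sample = [(1, 1, "p"), (1, 1, "p"), (1, 2, "q"), (2, 1, "p"), (1, 2, "q")]
grouped, distinct = distinct_vs_group_by(sample)
print(grouped == distinct)
print(grouped)
```

To compare plans on a real server instead, the same pair of queries can be run through that engine's own plan display, for example EXPLAIN in MySQL or SET SHOWPLAN_TEXT ON in SQL Server.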
