查找与 Min/Max 关联的行，没有内部循环

2021-09-10 00:00:00 sql tsql sql-server

我有一个关于 T-SQL 和 SQL Server 的问题.

I have a question related to T-SQL and SQL Server.

假设我有一个包含 2 列的 Orders 表:

Let's say I have a table Orders with 2 columns:

ProductId int
CustomerId int
日期日期时间

我想要每个产品的第一个订单的日期，所以我执行这种类型的查询:

I want the date of the first order for every product, so I perform this type of query:

SELECT ProductId, MIN(Date) AS FirstOrder FROM Orders GROUP BY ProductId

我在 ProductId 上有一个索引，包括列 CustomerId 和 Date 以加快查询速度 (IX_Orders>).查询计划看起来像是对 IX_Orders 的非聚集索引扫描，然后是流聚合(由于索引没有排序).

I have an index on ProductId, including the columns CustomerId and Date to speed up the query (IX_Orders). The query plan looks like a non-clustered index scan on IX_Orders, followed by a stream aggregate (no sort thanks to the index).

现在我的问题是我还想检索与每个产品的第一个订单相关联的 CustomerId(产品 26 于 25 日星期二由客户 12 首次订购).棘手的部分是我不希望在执行计划中有任何内部循环，因为这意味着表中每个 ProductId 的额外读取，这是非常低效的.

Now my problem is that I also want to retrieve the CustomerId associated with the first order for each product (Product 26 was first ordered on Tuesday 25, by customer 12). The tricky part is that I don't want any inner loop in the execution plan, because it would mean an additional read per ProductId in the table, which is highly inefficient.

这应该可以使用相同的非聚集索引扫描，然后是流聚合，但是我似乎找不到可以做到这一点的查询.有什么想法吗?

This should just be possible using the same non-clustered index scan, followed by stream aggregates, however I can't seem to find a query that would do that. Any idea?

谢谢

推荐答案

这将处理具有重复日期的产品:

this will handle products that have duplicate dates:

DECLARE @Orders table (ProductId int ,CustomerId int ,Date datetime ) INSERT INTO @Orders VALUES (1,1,'20090701') INSERT INTO @Orders VALUES (2,1,'20090703') INSERT INTO @Orders VALUES (3,1,'20090702') INSERT INTO @Orders VALUES (1,2,'20090704') INSERT INTO @Orders VALUES (4,2,'20090701') INSERT INTO @Orders VALUES (1,3,'20090706') INSERT INTO @Orders VALUES (2,3,'20090704') INSERT INTO @Orders VALUES (4,3,'20090702') INSERT INTO @Orders VALUES (5,5,'20090703') --duplicate dates for product #5 INSERT INTO @Orders VALUES (5,1,'20090703') --duplicate dates for product #5 INSERT INTO @Orders VALUES (5,5,'20090703') --duplicate dates for product #5 ;WITH MinOrders AS (SELECT o.ProductId, o.CustomerId, o.Date ,row_number() over(partition by o.ProductId order by o.ProductId,o.CustomerId) AS RankValue FROM @Orders o INNER JOIN (SELECT ProductId ,MIN(Date) MinDate FROM @Orders GROUP BY ProductId ) dt ON o.ProductId=dt.ProductId AND o.Date=dt.MinDate ) SELECT m.ProductId, m.CustomerId, m.Date FROM MinOrders m WHERE m.RankValue=1 ORDER BY m.ProductId, m.CustomerId

这将返回相同的结果，只需使用与上述代码相同的声明和插入:

this will return the same results, just use the same declare and inserts as the above code:

;WITH MinOrders AS (SELECT o.ProductId, o.CustomerId, o.Date ,row_number() over(partition by o.ProductId order by o.ProductId,o.CustomerId) AS RankValue FROM @Orders o ) SELECT m.ProductId, m.CustomerId, m.Date FROM MinOrders m WHERE m.RankValue=1 ORDER BY m.ProductId, m.CustomerId

您可以尝试每个版本，看看哪个版本运行得更快...

You can try out each version to see which will run faster...

相关文章