Why is using a temporary table faster than a nested query?

We are trying to optimize some of our queries.

One query is doing the following:

SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
INTO [#Gadget]
FROM task t

SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID) as Client 
FROM [#Gadget]
order by CASE WHEN Date IS NULL THEN 1 ELSE 0 END , Date ASC

DROP TABLE [#Gadget]

(I have removed the complex subquery. I don't think it's relevant, other than to explain why this query has been done as a two-stage process.)

I thought it would be far more efficient to merge this down into a single query using subqueries as:

SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID)
FROM
(
    SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
    FROM task t
) as sub    
order by CASE WHEN Date IS NULL THEN 1 ELSE 0 END , Date ASC

This would give the optimizer better information to work out what was going on and avoid any temporary tables. I assumed it should be faster.

But it turns out it is a lot slower. 8 seconds vs. under 5 seconds.

I can't work out why this would be the case, as everything I know about databases implies that subqueries should always be faster than temporary tables.

What am I missing?

Edit --

From what I have been able to see from the query plans, both are largely identical, except that the temporary table version has an extra "Table Insert" operation with a cost of 18%.

Obviously, as it has two queries, the cost of the Sort Top N is a lot higher in the second query than the cost of the Sort in the subquery method, so it is difficult to compare the costs directly.

Everything I can see from the plans would indicate that the subquery method would be faster.

Recommended answer

Obviously, SQL Server is choosing the wrong query plan. Yes, that can happen; I've had exactly the same scenario as you a few times.

The problem is that optimizing a query (you mention a "complex subquery") is a non-trivial task: if you have n tables, there are roughly n! possible join orders (10 tables already give 10! = 3,628,800), and that's just the beginning. So it's quite plausible that running (a) your inner query first and (b) your outer query second is a good way to go, but that SQL Server cannot deduce this in reasonable time.

What you can do is to help SQL Server. As Dan Tow writes in his great book "SQL Tuning", the key is usually the join order, going from the most selective to the least selective table. Using common sense (or the method described in his book, which is a lot better), you could determine which join order would be most appropriate and then use the FORCE ORDER query hint.
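
As a rough sketch of what that hint looks like in T-SQL: the client table, its DisplayName column, and the filter value below are hypothetical, since the real schema and subquery were not shown. With OPTION (FORCE ORDER), SQL Server keeps the join order exactly as written in the FROM clause, so the table you judge to be most selective goes first.

```sql
-- Hypothetical schema: "client" and "DisplayName" are assumptions for illustration only.
SELECT t.TaskID, t.Name AS Task, c.DisplayName AS Client
FROM client AS c                 -- assumed to be the more selective table, so it is listed first
INNER JOIN task AS t
    ON t.ClientID = c.ClientID
WHERE c.ClientID = 42            -- hypothetical selective filter
OPTION (FORCE ORDER);            -- preserve the join order as written above
```

The hint removes the optimizer's freedom to reorder joins, so it is worth comparing timings with and without it before keeping it.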

Anyway, every query is unique; there is no "magic button" to make SQL Server faster. If you really want to find out what is going on, you need to look at (or show us) the query plans of your queries. SET STATISTICS IO shows other interesting data: it tells you how much (costly) HDD access your query produces.
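
A minimal sketch of how you might collect those numbers for each variant. The SELECT below is only a simplified stand-in for your real queries, because the complex subquery was not shown. SET STATISTICS TIME is an addition not mentioned above; it reports CPU and elapsed time alongside the I/O figures.

```sql
SET STATISTICS IO ON;    -- report logical, physical, and read-ahead reads per table
SET STATISTICS TIME ON;  -- report parse/compile and execution times

-- Placeholder: run the temp-table version and the subquery version here, one at a time.
SELECT TOP 500 t.TaskID, t.Name AS Task, t.ClientID
FROM task AS t
ORDER BY t.TaskID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The results appear in the Messages output. Physical and read-ahead reads are pages fetched from disk, while logical reads are pages served from the buffer cache, so comparing these counts between the two variants shows where the extra work is done.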
