Is there a performance difference between a CTE, a sub-query, a temporary table, and a table variable?
In this excellent SO question, the differences between CTEs and sub-queries were discussed.
I would like to ask specifically: in what circumstances is each of the following more efficient/faster?
- CTE
- Sub-query
- Temporary table
- Table variable
Traditionally, I've used lots of temp tables in developing stored procedures, as they seem more readable than lots of intertwined sub-queries.
Non-recursive CTEs encapsulate sets of data very well and are very readable, but are there specific circumstances where one can say they will always perform better? Or is it a case of always having to fiddle around with the different options to find the most efficient solution?
EDIT
I've recently been told that in terms of efficiency, temporary tables are a good first choice as they have an associated histogram i.e. statistics.
Recommended answer
SQL is a declarative language, not a procedural one. That is, you construct a SQL statement to describe the results you want; you are not telling the SQL engine how to do the work.
As a general rule, it is a good idea to let the SQL engine and its optimizer find the best query plan. Many person-years of effort go into developing a SQL engine, so let the engineers do what they know how to do.
Of course, there are situations where the query plan is not optimal. In those cases you can use query hints, restructure the query, update statistics, use temporary tables, add indexes, and so on to get better performance.
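As a rough illustration of a few of those levers in T-SQL, here is a minimal sketch. The table `dbo.Orders`, its columns, and the index name are hypothetical and not taken from the question; whether any of these steps helps depends on the actual plan.

```sql
-- Hypothetical schema: dbo.Orders(CustomerId INT, Amount DECIMAL(10, 2))

-- Refresh the statistics the optimizer uses for its row estimates.
UPDATE STATISTICS dbo.Orders;

-- Add an index that supports a frequent filter on CustomerId.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId) INCLUDE (Amount);

-- Query hint: ask for a freshly compiled plan for this execution.
SELECT CustomerId, SUM(Amount) AS TotalAmount
FROM dbo.Orders
WHERE CustomerId = 42
GROUP BY CustomerId
OPTION (RECOMPILE);
```

Comparing the actual execution plan before and after each change is the only reliable way to tell whether it made a difference.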
As for your question: the performance of CTEs and subqueries should, in theory, be the same, since both provide the same information to the query optimizer. One difference is that a CTE used more than once could easily be identified and calculated once, with the results stored and read multiple times. Unfortunately, SQL Server does not seem to take advantage of this basic optimization (you might call it common subquery elimination).
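To make the equivalence concrete, here is a minimal T-SQL sketch. The table `dbo.Orders` and its columns are hypothetical, not from the question. Both statements express the same query, once as a CTE and once as a derived table, so I would expect the optimizer to treat them alike, though that is worth confirming in the execution plans.

```sql
-- Hypothetical schema: dbo.Orders(CustomerId INT, Amount DECIMAL(10, 2))

-- Version 1: the aggregation written as a CTE.
WITH CustomerTotals AS (
    SELECT CustomerId, SUM(Amount) AS TotalAmount
    FROM dbo.Orders
    GROUP BY CustomerId
)
SELECT CustomerId, TotalAmount
FROM CustomerTotals
WHERE TotalAmount > 1000;

-- Version 2: the same logic written as a subquery (derived table).
SELECT CustomerId, TotalAmount
FROM (
    SELECT CustomerId, SUM(Amount) AS TotalAmount
    FROM dbo.Orders
    GROUP BY CustomerId
) AS CustomerTotals
WHERE TotalAmount > 1000;
```

If `CustomerTotals` were referenced twice in Version 1, the aggregation would typically appear twice in the plan, which is the missing optimization described above.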
Temporary tables are a different matter, because you are providing more guidance on how the query should be run. One major difference is that the optimizer can use statistics from the temporary table to establish its query plan, which can result in performance gains. Also, if you have a complicated CTE (subquery) that is used more than once, storing it in a temporary table will often give a performance boost, because the query is executed only once.
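A minimal sketch of that pattern, again using a hypothetical `dbo.Orders` table that is not from the question: the aggregation runs once, the result is stored in a temp table, and the stored rows are then read twice.

```sql
-- Hypothetical schema: dbo.Orders(CustomerId INT, Amount DECIMAL(10, 2))

-- Run the expensive aggregation once and keep the result in a temp table.
SELECT CustomerId, SUM(Amount) AS TotalAmount
INTO #CustomerTotals
FROM dbo.Orders
GROUP BY CustomerId;

-- Optional: an index on the temp table can help the later reads.
CREATE CLUSTERED INDEX IX_CustomerTotals ON #CustomerTotals (CustomerId);

-- Read the stored result twice without recomputing the aggregation:
-- customers whose total is above the average total.
SELECT t.CustomerId, t.TotalAmount
FROM #CustomerTotals AS t
WHERE t.TotalAmount > (SELECT AVG(TotalAmount) FROM #CustomerTotals);

DROP TABLE #CustomerTotals;
```

The trade-off is the cost of writing the intermediate rows to tempdb, so this tends to pay off when the stored query is expensive and reused, and less so for a cheap query used once.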
The answer to your question is that you need to experiment to get the performance you expect, particularly for complex queries that are run on a regular basis. In an ideal world, the query optimizer would find the perfect execution path. Although it often does, you may be able to find a way to get better performance.