内联表值 UDF 能否胜过 SELECT 列列表中的等效标量 UDF?

2021-09-15 00:00:00 user-defined-functions sql-server

这个问题源于SQLServer:为什么避免使用表- 有价值的用户定义函数?.我开始在一些评论中提出问题，对我评论的回复偏离了主题.

This question grew out of SQLServer: Why avoid Table-Valued User Defined Functions?. I began asking questions in some of the comments, and the replies to my comments moved off topic.

这样您就不必阅读整个讨论:我从未听说过用户定义函数 (UDF) 很慢，或者应该避免使用.上面引用的问题中发布了一些链接，以说明它们很慢.我还是没明白，求个例子.贴了个例子，性能差别很大.

So that you don't have to read the entire discussion: I had never heard it said that user defined functions (UDF) were slow, or to be avoided. Some links were posted in the question referenced above to illustrate that they were slow. I still didn't get it, and asked for an example. An example was posted, and the performance difference was huge.

我不可能是唯一没有意识到可能存在如此大的性能差异的人.我觉得这个事实应该分成一个新的问题和答案，以提高被发现的机会.这就是问题".请不要关闭，因为我想给回答者时间来发布答案.

I can't be the only person who did not realize there could be such a large performance difference. I felt this fact should be separated into a new question and answer, to improve its chances of being found. This here is the "question". Please don't close yet, as I'd like to give the answerer time to post the answer.

当然，其他人也应该发布答案或示例.我特别感谢任何能帮助我理解为什么性能差异如此巨大的东西.

Of course, others should also post answers or examples, as well. I'd especially appreciate anything that would help me understand why the performance difference is so huge.

另请注意，我不是在谈论在 WHERE 子句中使用 UDF.我知道这会如何阻止优化器完成其工作.当原始 UDF 是 SELECT 列列表的一部分时，我对性能差异特别感兴趣.

Note also that I'm not talking about the use of a UDF in a WHERE clause. I'm aware of how this can prevent the optimizer from doing its job. I'm specifically interested in differences in performance when the original UDF was part of the SELECT column list.

推荐答案

为了进行基准测试，让我们创建一个 100 万行的表:

For benchmarking let's create a table with 1M rows:

CREATE TABLE dbo.Numbers(n INT NOT NULL PRIMARY KEY) GO DECLARE @i INT; SET @i = 1; INSERT INTO dbo.Numbers(n) SELECT 1; WHILE @i<1024000 BEGIN INSERT INTO dbo.Numbers(n) SELECT n + @i FROM dbo.Numbers; SET @i = @i * 2; END; GO

运行简单的内联添加:

SELECT COUNT(*) FROM( SELECT n,n+1 AS ValuePlusOne FROM dbo.Numbers ) AS t WHERE ValuePlusOne>0 CPU time = 15 ms, elapsed time = 122 ms. (1 row(s) affected) Table 'Numbers'. Scan count 1, logical reads 3521, physical reads 3, read-ahead reads 3498, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 406 ms, elapsed time = 951 ms.

创建一个标量 UDF，将一个整数加一，并运行 100 万次:

Create a scalar UDF that just adds one to an integer, and run it 1M times:

CREATE FUNCTION dbo.[AddOne] ( @value int ) RETURNS int AS BEGIN DECLARE @Result int SELECT @Result = @value + 1 RETURN @Result END GO SELECT COUNT(*) FROM( SELECT n,dbo.AddOne(n) AS ValuePlusOne FROM dbo.Numbers ) AS t WHERE ValuePlusOne>0 CPU time = 15 ms, elapsed time = 122 ms. (1 row(s) affected) Table 'Numbers'. Scan count 1, logical reads 3521, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 108313 ms, elapsed time = 295072 ms.

创建一个内联 UDF，它和添加一样快，并运行 100 万次:

Create an inline UDF, which is just as fast as just adding, and run that 1M times:

CREATE FUNCTION dbo.[AddOneInline] ( @value int ) RETURNS TABLE AS RETURN(SELECT @value + 1 AS ValuePlusOne) GO SELECT COUNT(*) FROM( SELECT ValuePlusOne FROM dbo.Numbers CROSS APPLY dbo.[AddOneInline](n) ) AS t WHERE ValuePlusOne>0 SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 35 ms. (1 row(s) affected) Table 'Numbers'. Scan count 1, logical reads 3521, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 391 ms, elapsed time = 403 ms.

标量 UDF 与内联 UDF 的性能差异很明显.

The difference in performance of a scalar UDF vs. an inline one is obvious.

相关文章