Why are SQL built-in functions faster than UDFs?

2021-09-10 00:00:00 user-defined-functions tsql sql-server

Although this is a fairly subjective question, I feel it is worth sharing on this forum.

In my experience, whenever I create a UDF (even a simple one) and use it in my SQL, performance drops drastically. When I use SQL's built-in functions instead, they run much faster. Conversion, logical, and string functions are clear examples of that.

So my question is: why are SQL built-in functions faster than UDFs? It would also help if someone could explain how I can estimate or reason about a function's cost, either mathematically or logically.

Solution

This is a well-known issue with scalar UDFs in SQL Server.

They are not inlined into the plan and calling them adds overhead compared with having the same logic inline.
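Whether a given scalar UDF can be inlined depends on the SQL Server version. As a hedged sketch: on SQL Server 2019 and later (an assumption about the reader's version; the answer does not state one), the catalog view sys.sql_modules exposes an is_inlineable column that can be checked once a function such as dbo.F1 (defined further down) exists. On earlier versions these columns do not exist and scalar UDFs are never inlined.

```sql
-- Assumes SQL Server 2019+ and that dbo.F1 has already been created.
-- is_inlineable = 1 means the optimizer may inline the scalar UDF;
-- inline_type   = 1 means inlining is currently enabled for it.
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
       OBJECT_NAME(m.object_id)        AS function_name,
       m.is_inlineable,
       m.inline_type
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.F1');
```

If the query returns no row, the function does not exist in the current database.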

The following takes just under 2 seconds on my machine:

WITH T10(N) AS 
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) --10 rows                                    
, T(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
           FROM T10 a, T10 b, T10 c, T10 d, T10 e, T10 f, T10 g)  -- 10 million rows
SELECT MAX(N - N)
FROM T
OPTION (MAXDOP 1)

Creating the simple scalar UDF

CREATE FUNCTION dbo.F1 (@N BIGINT)
RETURNS BIGINT 
WITH SCHEMABINDING
AS
BEGIN
RETURN (@N - @N)
END

and changing the query to use MAX(dbo.F1(N)) instead of MAX(N - N) takes around 26 seconds with STATISTICS TIME OFF and 37 seconds with it ON.

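That is an average of about 2.6 μs / 3.7 μs per call across the 10 million function calls, or roughly 2.4 μs / 3.5 μs more per call than the 2-second version with the logic inline.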
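Because turning STATISTICS TIME on inflates the measurement itself, one alternative is to capture wall-clock time around the statement. The following is a minimal sketch, assuming dbo.F1 from above already exists; the variable names @start and @result are illustrative and not part of the original answer. Assigning the aggregate to a variable keeps client round-trips out of the timing.

```sql
-- Wall-clock timing sketch; @start and @result are illustrative names.
DECLARE @start  DATETIME2(7) = SYSDATETIME();
DECLARE @result BIGINT;

WITH T10(N) AS
(
    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(x)
) -- 10 rows
, T(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
           FROM T10 a, T10 b, T10 c, T10 d, T10 e, T10 f, T10 g) -- 10 million rows
SELECT @result = MAX(dbo.F1(N))  -- swap in MAX(N - N) to time the inline version
FROM T
OPTION (MAXDOP 1);

SELECT @result AS result,
       DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS elapsed_ms;
```

Dividing the difference between the two elapsed times by the 10 million rows gives the per-call overhead on your own hardware.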

Running the Visual Studio profiler shows that the vast majority of the time is spent under UDFInvoke. The names of the methods in the call stack give some idea of what the additional overhead consists of (copying parameters, executing statements, setting up the security context).

Moving the logic into an inline table-valued function

CREATE FUNCTION dbo.F2 (@N BIGINT)
RETURNS TABLE
RETURN(SELECT @N - @N AS X)

And rewriting the query as

SELECT MAX(X)
FROM Nums
CROSS APPLY dbo.F2(N)

executes as fast as the original query that does not use any functions.
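The rewritten query reads from Nums, which is not defined in the answer. As a sketch, assuming Nums stands for the same 10-million-row set as the T CTE used earlier, and that dbo.F2 has been created as shown, a self-contained version could look like this (the aliases v and F are illustrative):

```sql
-- Self-contained variant of the CROSS APPLY query.
-- Assumption: the answer's Nums is equivalent to the 10-million-row CTE T.
WITH T10(N) AS
(
    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(x)
) -- 10 rows
, T(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
           FROM T10 a, T10 b, T10 c, T10 d, T10 e, T10 f, T10 g) -- 10 million rows
SELECT MAX(F.X)
FROM T
CROSS APPLY dbo.F2(T.N) AS F  -- the inline TVF body is expanded into the plan
OPTION (MAXDOP 1);
```

Comparing the execution plan of this query with the MAX(N - N) version is a quick way to confirm that the function body has been expanded rather than invoked per row.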
