在 SQL Server 中,while 循环需要很长时间

2021-09-10 00:00:00 tsql sql-server

我有一个循环,它检查表 tmp13descript 列的所有名称和日期,并将它们作为单独的行存储在其他表中 (tmp14).问题是while循环执行了很长时间.我不知道如何让它运行得更快.

I have a loop that checks all the names and dates from descript column from table tmp13 and stores them as individual rows in other table (tmp14). The problem is that the while loop is executing for a long time. I don't know how to make it run faster.

这是我的代码,检查名称和日期的 descript 列.Descript 是一个文本列,可以有多个名称和日期.我想将这些名称和日期存储在单独的行中.

Here is my code checking the descript columns for names and date. Descript is a text column and can have multiple names and dates. I want to store those names and dates in separate rows.

DECLARE @Id INT
DECLARE @count INT
DECLARE @product_num INT
DECLARE @REQUESTED VARCHAR(50)
DECLARE @FirstDate VARCHAR(255)
DECLARE @RequestedBy VARCHAR(255)

DECLARE @name NVARCHAR(256)
DECLARE @date NVARCHAR(256)
DECLARE @desc NVARCHAR(256)

DECLARE @dateposition INT
DECLARE @nameposition INT
DECLARE @nameend INT

SELECT @count = MAX(id) 
FROM #TMP13

SET @id = 1;

WHILE (@id <= @count)
BEGIN
    SELCET @desc = descript FROM #TMP13 WHERE Id = @Id
    SELECT @product_num = p_Num FROM #TMP13 WHERE Id = @Id
    SELECT @REQUESTED = REQUESTED FROM #TMP13 WHERE Id = @Id
    SELECT @FirstDate = DATE1 FROM #TMP13 WHERE Id = @Id
    SELECT @RequestedBy = BY1 FROM #TMP13 WHERE Id = @Id


while (patindex('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@desc) > 0)
begin
    set @dateposition = patindex('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9]%',@desc)
    set @date = SUBSTRING(@desc,@dateposition,10)

    set @nameposition = CHARINDEX('-', @desc)+2
    set @nameend = CHARINDEX(' ', @desc, @nameposition)+1
    set @name = SUBSTRING(@desc,@nameposition,@nameend-@nameposition)

    insert into #TMP14 
    values (@Id,@product_num,@REQUESTED, @FirstDate ,@RequestedBY, @date, @name)

    set @desc = SUBSTRING(@desc,@nameend,1024)  
end

set @id = @id + 1;

end

select * from #tmp14;

---样本表

CREATE TABLE #Tmp13(
p_Num             INTEGER  NOT NULL PRIMARY KEY 
REQUESTED          varchar(50),
DATE1            VARCHAR(50),   
BY1              VARCHAR(50),
DESCRIPT         TEXT

);

INSERT INTO #tmp13( p_Num , REQUESTED, DATE1, BY1 , DESCRIPT) VALUES 
(100,'John','5/30/2017','James','05/30/2017 12:25am Eastern Standard Time - 
Mjames reported changes in the pages 05/30/2017 10:35AM JRachael agreed to 
work on the report and report to James 05/30/2017 10:00 AM James reports 
errors in page.',NULL);

INSERT INTO Table_Tasks(WO_NUM,Opendate,ClosedDate,Note) VALUES 
(200,'John','6/1/2017','Rachael','06/1/2017 3:20PM Eastern Standard Time -  
Rsubramaniam reported phone is not functional 06/1/2017 4:00PM Service took 
the phone and replaced it with new one');

OUTPUT

 Id  product_num REQUESTED FirstDate  RequestedBY date         name    date  
                                                                        Name 

 1   100          John      5/30/2017  james      5/30/2017 mjames  5/30/2017 jRachael

推荐答案

While 循环会很慢,因为当您使用过程逻辑在声明式编程语言中进行开发时会发生这种情况.为此,您可以使用 patternSplitCM.您的示例数据存在问题,但这应该可以满足您的需求.

While loops will be slow because that's what happens when you use procedural logic to develop in a declarative programming language. For this you can use patternSplitCM. There are issues with your sample data but this should get you what you need.

SELECT t.p_Num,
       t.REQUESTED,
       t.date1,
       t.BY1,
       parsedDate1 = MAX(CASE WHEN s.ItemNumber = 1 THEN s.Item END),
       parsedDate2 = MAX(CASE WHEN s.ItemNumber > 1 THEN s.Item END)
FROM   #Tmp13 AS t
CROSS 
APPLY    dbo.patternSplitCM(t.DESCRIPT, '[0-9/]') AS s
WHERE    s.[Matched] = 1 AND TRY_CAST(s.item AS DATE) IS NOT NULL
GROUP BY t.p_Num,t.REQUESTED, t.date1,t.BY1

结果:

p_Num       REQUESTED    date1        BY1       parsedDate1  parsedDate2
----------- ------------ ------------ --------- ------------ ------------
100         John         5/30/2017    James     05/30/2017   05/30/2017
200         John         6/1/2017     Rachael   06/1/2017    06/1/2017

最后,为了获得最佳性能,您希望 p_Num、REQUESTED、DATE1、BY1 列上的索引支持 GROUP BY 子句.索引看起来像:

Lastly, for optimal performance you want an index on the columns p_Num,REQUESTED,DATE1,BY1 to support the GROUP BY clause. The index would look something like:

CREATE UNIQUE NONCLUSTERED INDEX uq_xxx ON #Tmp13(p_Num,REQUESTED,DATE1,BY1) 
--INCLUDE (DESCRIPT); -- if you can change this to VARCHAR(8000) or VARCHAR(max)

另请注意,使用并行执行计划可能会运行得更快.为此,您可以使用 TRACEFLAG 8649 或 Adam Machanic 的 make_parallel().

Also note that this will likely run faster with a parallel execution plan. For that you can use TRACEFLAG 8649 or Adam Machanic's make_parallel().

相关文章