Mark consecutive non-zero rows as distinct partitions?

2021-09-25 00:00:00 sql window-functions sql-server

Suppose we have this simple schema and data:

-- IF EXISTS (SQL Server 2016 or later) avoids an error when #builds does not exist yet
DROP TABLE IF EXISTS #builds;
CREATE TABLE #builds (
    Id INT IDENTITY(1,1) NOT NULL,
    StartTime INT,
    IsPassed BIT
)
INSERT INTO #builds (StartTime, IsPassed) VALUES
(1, 1),
(7, 1),
(10, 0),
(15, 1),
(21, 1),
(26, 0),
(34, 0),
(44, 0),
(51, 1),
(60, 1)

-- Duration is 0 when IsPassed = 1, otherwise the gap to the next row's StartTime
SELECT StartTime, IsPassed, NextStartTime,
    CASE IsPassed WHEN 1 THEN 0 ELSE NextStartTime - StartTime END AS Duration
FROM (
    SELECT
        LEAD(StartTime) OVER (ORDER BY StartTime) AS NextStartTime,
        StartTime, IsPassed
    FROM #builds
) x
ORDER BY StartTime;

It produces the following result set:

StartTime   IsPassed    NextStartTime   Duration
1           1           7               0
7           1           10              0
10          0           15              5
15          1           21              0
21          1           26              0
26          0           34              8
34          0           44              10
44          0           51              7
51          1           60              0
60          1           NULL            0

I need to sum the consecutive non-zero Duration values and show each total at the StartTime of the first row in its batch. I.e., I need to get to this:

StartTime   Duration
10          5
26          25

I just can't figure out how to do it.

PS: The real table contains many more rows, of course.

Recommended answer

This is a gaps-and-islands problem: each run of consecutive rows with the same IsPassed value has to be assigned to its own group. One way to do that is to subtract ROW_NUMBER() partitioned by IsPassed from ROW_NUMBER() over the entire table, both ordered by StartTime; the difference stays constant within a run. You can then SUM the Duration values for each group where IsPassed = 0 and take MIN(StartTime) to get the StartTime of the first row of the group:

WITH CTE AS (
  -- Attach the next row's StartTime to every row
  SELECT StartTime, IsPassed,
         LEAD(StartTime) OVER (ORDER BY StartTime) AS NextStartTime
  FROM #builds
),
CTE2 AS (
  -- grp is constant within each run of consecutive rows sharing the same IsPassed
  SELECT StartTime, IsPassed, NextStartTime,
         CASE IsPassed WHEN 1 THEN 0 ELSE NextStartTime - StartTime END AS Duration,
         ROW_NUMBER() OVER (ORDER BY StartTime) -
         ROW_NUMBER() OVER (PARTITION BY IsPassed ORDER BY StartTime) AS grp
  FROM CTE
)
-- One output row per run of IsPassed = 0 rows
SELECT MIN(StartTime) AS StartTime, SUM(Duration) AS Duration
FROM CTE2
WHERE IsPassed = 0
GROUP BY grp
ORDER BY MIN(StartTime);

Output:

StartTime   Duration
10          5
26          25
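
To see why the row-number difference separates the runs, it can help to look at the intermediate values. The query below is only an illustrative sketch: it assumes the #builds temp table from the question still exists in the session, and the column aliases rn_all and rn_same are made-up names that do not appear in the original answer.

```sql
-- Assumes #builds from the question is populated in the current session.
-- rn_all and rn_same are hypothetical aliases used only for this illustration.
SELECT StartTime, IsPassed,
       -- position of the row in the whole table
       ROW_NUMBER() OVER (ORDER BY StartTime) AS rn_all,
       -- position of the row among rows with the same IsPassed value
       ROW_NUMBER() OVER (PARTITION BY IsPassed ORDER BY StartTime) AS rn_same,
       -- the difference only changes when a run is interrupted
       ROW_NUMBER() OVER (ORDER BY StartTime)
     - ROW_NUMBER() OVER (PARTITION BY IsPassed ORDER BY StartTime) AS grp
FROM #builds
ORDER BY StartTime;

-- Result for the sample data:
-- StartTime  IsPassed  rn_all  rn_same  grp
-- 1          1         1       1        0
-- 7          1         2       2        0
-- 10         0         3       1        2
-- 15         1         4       3        1
-- 21         1         5       4        1
-- 26         0         6       2        4
-- 34         0         7       3        4
-- 44         0         8       4        4
-- 51         1         9       5        4
-- 60         1         10      6        4
```

Note that grp = 4 appears both for the IsPassed = 0 run starting at 26 and for the IsPassed = 1 run starting at 51, so grp is only unique within one IsPassed value. The answer's query is safe because it filters on IsPassed = 0 before grouping; without that filter you would need GROUP BY IsPassed, grp.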

Demo on dbfiddle
