SQL Server 2014:根据来自第三个表的值配对来自 2 个表的行

2021-09-10 00:00:00 tsql sql-server

我有 2 个表,其中包含一段时间内的类型化事件.

第一个表 #T1 包含的事件总是出现在第二个表 #T2 中的事件之前.

第三个表 #E 包含为事件定义分别出现在 #T1#T2 中的值的记录.>

示例数据:

创建表#T1(事件时间戳日期时间,VehicleId int,EventId varchar(50),事件值 varchar(50));创建表#T2(事件时间戳日期时间,VehicleId int,EventId varchar(50),事件值 varchar(50));创建表#E(EventId varchar(50),FirstValue int,LastValue varchar(50));INSERT INTO #T1(EventTimestamp, VehicleId, EventId, EventValue)值 (GETDATE(), 1, 'TwigStatus', '12'),(GETDATE(), 2, 'SafeProtectEvent', '5')INSERT INTO #T2(EventTimestamp, VehicleId, EventId, EventValue)值 (DATEADD(second, 30, GETDATE()), 1, 'TwigStatus', '7'),(DATEADD(second, 30, GETDATE()), 2, 'SafeProtectEvent', '6')插入 #E(EventId, FirstValue, LastValue)值 ('TwigStatus', '12', '7'),('SafeProtectEvent', '5', '6')声明 @EventId varchar(50) = 'TwigStatus';声明@FirstValue varchar(50) = '12';声明@LastValue varchar(50) = '7';使用命令 AS(选择首先,最后,EventNr = ROW_NUMBER() OVER (ORDER BY first)从(选择第一个 = t1.EventTimestamp,最后一个 = t2.EventTimestamp,rn = ROW_NUMBER() OVER (PARTITION BY t1.VehicleId ORDER BY t2.EventTimestamp)从#T1 t1内部联接#T2 t2 ON t2.EventTimestamp >t1.EventTimestampAND t2.EventValue = @LastValue在哪里t1.EventId = @EventId AND t1.EventValue = @FirstValue) ids在哪里rn = 1)选择t.VehicleId, o.first, o.last, t.EventId, t.EventValue从#T2吨内部联接ord o ON t.EventTimestamp >= o.firstAND t.EventTimestamp <= o.last;WHERE t.EventId = @EventId;删除表#E;删除表#T1;删除表#T2;

基本上,对于表 E 中的记录,您会看到对于 eventID 'TwigStatus',值 '12' 应首先出现在表 T1 中,然后是表 T2 中的下一个值 '7'.定义了第二个事件序列.

VehicleId 列是表 T1 和 T2 之间的链接.

我需要计算表 T1 和 T2 中两个匹配事件之间的延迟.

为了简单起见,我还没有使用表 E,我使用的是包含预定义值的变量并且我正在返回时间戳.

但是上面查询的结果;

VehicleId first last EventId EventValue1 2020-09-15 16:00:37.670 2020-09-15 16:01:07.670 TwigStatus 72 2020-09-15 16:00:37.670 2020-09-15 16:01:07.670 SafeProtectEvent 6

这不是我所期望的,因为现在应该过滤掉 EventId 'SafeProtectEvent'.

所以我有两个问题:

  1. 如何避免在实际查询中显示第二个事件.
  2. 如何处理表 E 的内容并摆脱变量来处理事件序列.


编辑 1:问题 1 通过对查询添加限制来解决(见上文)

解决方案

下面的更新/新版本 - 现在允许 T1 中的行而不匹配 T2 中的行.

根据对以下评论的讨论,我更新了此建议.

此代码替换了从 DECLARE @EventIdSELECT 语句结尾的所有内容.

逻辑如下——对于T1中的每一行...

  1. 确定 T1 中该行的时间边界(在其 EventTimestamp 与该车辆的 T1 中的下一个 EventTimestamp 之间;如果没有下一个事件,则为未来 1 天)
  2. 在 T2 中找到匹配的行,其中匹配"意味着 a) 相同的 VehicleId,b) 相同的 EventId,c) EventValue 受#E 中可能性的限制,并且 d) 发生在 T1 的时间边界内
  3. 查找这些行中的第一行(如果有)
  4. 将 EventDelay 计算为两个时间戳之间的时间

<代码>;与 t1 AS(选择车辆 ID,事件时间戳,事件 ID,事件值,COALESCE(LEAD(EventTimestamp, 1) OVER (PARTITION BY VehicleID ORDER BY EventTimestamp), DATEADD(day, 1, getdate())) AS NextT1_EventTimeStamp从#T1),订单AS(选择 t1.VehicleId,t1.EventTimestamp AS 首先,t2.EventTimestamp AS 最后,t1.EventId,t2.EventValue,ROW_NUMBER() OVER (PARTITION BY t1.VehicleId, t1.EventTimestamp, t1.EventId ORDER BY t2.EventTimestamp) AS rn从 t1LEFT OUTER JOIN #E AS e ON t1.EventId = e.EventIdAND t1.EventValue = e.FirstValue左外连接#T2 AS t2 ON t1.VehicleID = t2.VehicleIDAND t1.EventID = t2.EventIDAND t2.eventId = e.EventIdAND t2.EventValue = e.LastValueAND t2.EventTimestamp >t1.EventTimestampAND t2.EventTimestamp 

不断增长的DB<>fiddle 更新以及原始帖子和以前的建议.

I have 2 tables that contains typed events over time.

The first table #T1 contains events that always comes before events in the second table #T2.

A third table #E contains records that defines for an event the values that comes in #T1 and #T2 respectively.

Sample data:

CREATE TABLE #T1 
(
    EventTimestamp DateTime, 
    VehicleId int, 
    EventId varchar(50), 
    EventValue varchar(50)
);

CREATE TABLE #T2 
(
    EventTimestamp DateTime, 
    VehicleId int, 
    EventId varchar(50), 
    EventValue varchar(50)
);

CREATE TABLE #E 
(
     EventId varchar(50), 
     FirstValue int, 
     LastValue varchar(50)
);

INSERT INTO #T1(EventTimestamp, VehicleId , EventId, EventValue)
VALUES (GETDATE(), 1, 'TwigStatus', '12'),
       (GETDATE(), 2, 'SafeProtectEvent', '5')

INSERT INTO #T2(EventTimestamp, VehicleId , EventId, EventValue)
VALUES (DATEADD(second, 30, GETDATE()), 1, 'TwigStatus', '7'),
       (DATEADD(second, 30, GETDATE()), 2, 'SafeProtectEvent', '6')

INSERT INTO #E(EventId, FirstValue, LastValue)
VALUES ('TwigStatus', '12', '7'),
       ('SafeProtectEvent', '5', '6')

DECLARE @EventId varchar(50)  = 'TwigStatus';
DECLARE @FirstValue varchar(50) = '12';
DECLARE @LastValue varchar(50) = '7';

WITH ord AS 
(
    SELECT
        first, last,
        EventNr = ROW_NUMBER() OVER (ORDER BY first) 
    FROM
        (SELECT 
             first = t1.EventTimestamp, last = t2.EventTimestamp,
             rn = ROW_NUMBER() OVER (PARTITION BY t1.VehicleId ORDER BY t2.EventTimestamp) 
        FROM
            #T1 t1
        INNER JOIN 
            #T2 t2 ON t2.EventTimestamp > t1.EventTimestamp  
                   AND t2.EventValue = @LastValue
        WHERE 
            t1.EventId = @EventId AND t1.EventValue = @FirstValue) ids
    WHERE 
        rn = 1
)
SELECT
    t.VehicleId, o.first, o.last, t.EventId, t.EventValue 
FROM
    #T2 t
INNER JOIN 
    ord o ON t.EventTimestamp >= o.first 
          AND t.EventTimestamp <= o.last;
WHERE t.EventId = @EventId;    

DROP TABLE #E;
DROP TABLE #T1;
DROP TABLE #T2;

Basically, for a record in table E you see that for eventID 'TwigStatus' the value '12' should come first in table T1 and then '7' should be next in table T2. There is a second event sequence that is defined.

The VehicleId column is the link between the tables T1 and T2.

I need to compute the delay between two matching events in table T1 and T2.

To start simple, I do not use the table E yet, I'm using variables that contains predefined values and I'm returning timestamps.

But the result of the query above;

VehicleId   first                   last                    EventId             EventValue
1           2020-09-15 16:00:37.670 2020-09-15 16:01:07.670 TwigStatus          7
2           2020-09-15 16:00:37.670 2020-09-15 16:01:07.670 SafeProtectEvent    6

Is not what I'm expecting because the EventId 'SafeProtectEvent' Should be filtered out for now.

So I have 2 questions:

  1. How to avoid the second event to show with the actual query.
  2. How to work with the content of the table E and get rid of variables to process event sequences.


EDIT 1: Problem 1 Solved by adding a restriction on the query (see above)

解决方案

Update/new version below - now allows rows in T1 without matching rows in T2.

Based on discussion on comments below, I have updated this suggestion.

This code replaces everything from the DECLARE @EventId to the end of that SELECT statement.

Logic is as follows - for each row in T1 ...

  1. Determine the time boundaries for that row in T1 (between its EventTimestamp, and the next EventTimestamp in T1 for that vehicle; or 1 day in the future if there is no next event)
  2. Find the matching rows in T2, where 'matching' means a) same VehicleId, b) same EventId, c) EventValue is limited by possibilities in #E, and d) occurs within the time boundaries of T1
  3. Find the first of these rows, if available
  4. Calculate EventDelay as the times between the two timestamps

; WITH t1 AS
        (SELECT     VehicleId, 
                    EventTimestamp, 
                    EventId,
                    EventValue,
                    COALESCE(LEAD(EventTimestamp, 1) OVER (PARTITION BY VehicleID ORDER BY EventTimestamp), DATEADD(day, 1, getdate())) AS NextT1_EventTimeStamp
            FROM    #T1
        ),
   ord AS
        (SELECT     t1.VehicleId, 
                    t1.EventTimestamp AS first, 
                    t2.EventTimestamp AS last,
                    t1.EventId, 
                    t2.EventValue,
                    ROW_NUMBER() OVER (PARTITION BY t1.VehicleId, t1.EventTimestamp, t1.EventId ORDER BY t2.EventTimestamp) AS rn
            FROM    t1
                    LEFT OUTER JOIN #E  AS e ON t1.EventId = e.EventId 
                                       AND t1.EventValue = e.FirstValue
                    LEFT OUTER JOIN #T2 AS t2 ON t1.VehicleID = t2.VehicleID 
                                       AND t1.EventID = t2.EventID
                                       AND t2.eventId = e.EventId 
                                       AND t2.EventValue = e.LastValue
                                       AND t2.EventTimestamp > t1.EventTimestamp
                                       AND t2.EventTimestamp < NextT1_EventTimeStamp
        )
    SELECT      VehicleId, first, last, EventId, EventValue,
                DATEDIFF(second, first, last) AS EventDelay
        FROM    ord
        WHERE   rn = 1

The ever-growing DB<>fiddle has the latest updates, as well as original posts and previous suggestions.

相关文章