SQL - 如果满足使用多个先前列的条件,则 LAG 以获取先前值
我有一个由以下人员创建的表:
I have a table created by:
CREATE TABLE #test_table
(
id INT
,EventName VARCHAR(50)
,HomeTeam VARCHAR(25)
,Metric INT
)
INSERT INTO #test_table VALUES
(1, 'Team A vs Team B', 'Team A', 5),
(2, 'Team A vs Team B', 'Team A', 7),
(3, 'Team C vs Team D', 'Team C', 6),
(4, 'Team Z vs Team A', 'Team Z', 8),
(5, 'Team A vs Team B', 'Team A', 9),
(6, 'Team C vs Team D', 'Team C', 3),
(7, 'Team C vs Team D', 'Team C', 1),
(8, 'Team E vs Team F', 'Team E', 2)
结果:
id EventName HomeTeam Metric
------------------------------------------
1 Team A vs Team B Team A 5
2 Team A vs Team B Team A 7
3 Team C vs Team D Team C 6
4 Team Z vs Team A Team Z 8
5 Team A vs Team B Team A 9
6 Team C vs Team D Team C 3
7 Team C vs Team D Team C 1
8 Team E vs Team F Team E 2
A 想要计算一个新列 PreviousMetricN
,其中 N 可以是 1, 2, 3, ... 这显示了 Metric
的前一个值,但前提是HomeTeam
参与了之前的活动.例如:
A want to calculate a new column PreviousMetricN
where N can be 1, 2, 3, ... which shows the previous value for Metric
, but only if the HomeTeam
was involved in the previous event. For example:
id EventName HomeTeam Metric PreviousMetric1 PreviousMetric2
------------------------------------------------------------------------
1 Team A vs Team B Team A 5 NULL NULL
2 Team A vs Team B Team A 7 5 NULL
3 Team C vs Team D Team C 6 NULL NULL
4 Team Z vs Team A Team Z 8 NULL NULL
5 Team A vs Team B Team A 9 8 7
6 Team C vs Team D Team C 3 6 NULL
7 Team C vs Team D Team C 1 3 6
8 Team E vs Team F Team E 2 NULL NULL
我一直在尝试使用 PARTITION BY
子句中的新分组变量的 LAG
变体,例如
I have been trying variations of LAG
with a new grouping variable in the PARTITION BY
clause such as
LAG(Metric) OVER(Partition by (CASE WHEN CHARINDEX(HomeTeam, EventName)>0 THEN 1 ELSE 0 END) ORDER BY id)
但没有任何成功.这怎么办?
but without any success. How can this be done?
我也在这里为熊猫问过这个问题:Pandas shift - 如果满足多个条件,则获取之前的值
I've also asked this question for Pandas here: Pandas shift - get previous value if multiple conditions satisfied
推荐答案
我在这里看不到使用窗口函数和单次扫描表格的答案.我们可以在单次扫描中执行此查询,如下所示:
I see no answer here that uses window functions and a single scan of the table. We can do this query in a single scan as follows:
让我们假设您在另一列中有 AwayTeam
.
Let us assume you have the AwayTeam
in another column.
如果你还没有这个并且你想从 EventData
中解析它:
我们可以使用:SUBSTRING(EventData, CHARINDEX(' vs ', EventData) + 4)
我敦促您遵循适当的规范化并将其创建为表格中的适当列.
If you don't have this yet and you wanted to parse it out of
EventData
:
We could use:SUBSTRING(EventData, CHARINDEX(' vs ', EventData) + 4)
I urge you to follow proper normalization and create this as a proper column in your table.
我们的算法是这样运行的:
Our algorithm runs like this:
- 使用
CROSS APPLY
将两个团队相乘(逆透视)作为单独的行 - 使用
LAG
计算之前的Metric
,按合并后的Team
列进行分区 - 过滤掉翻倍的行,这样我们的原始行只有一行
- Multiply out (unpivot) the two teams as separate rows, using
CROSS APPLY
- Calculate the previous
Metric
s usingLAG
, partitioning by the mergedTeam
column - Filter back down the doubled up rows, so that we only get a single row for each of our original ones
SELECT id, HomeTeam, AwayTeam, Metric, Prev1, Prev2, Prev3
FROM (
SELECT *
,Prev1 = LAG(Metric, 1) OVER (PARTITION BY v.Team ORDER BY id)
,Prev2 = LAG(Metric, 2) OVER (PARTITION BY v.Team ORDER BY id)
,Prev3 = LAG(Metric, 3) OVER (PARTITION BY v.Team ORDER BY id)
-- more of these ......
FROM test_table
CROSS APPLY (VALUES (HomeTeam, 1),(AwayTeam, 0)) AS v(Team,IsHome)
) AS t
WHERE IsHome = 1
-- ORDER BY id --if necessary
重要的是,我们无需使用多种不同的排序、分区或排序,也无需使用自联接即可完成此操作.只需一次扫描.
Importantly, we can do this without the use of multiple different sorts, partitions or ordering, and without the use of a self-join. Just a single scan.
结果:
id | 家庭团队 | 客队 | 公制 | 上一页 | 上一个 | 上一个 |
---|---|---|---|---|---|---|
1 | A队 | B 队 | 5 | (空) | (空) | (空) |
2 | A队 | B 队 | 7 | 5 | (空) | (空) |
3 | C 组 | 团队 D | 6 | (空) | (空) | (空) |
4 | Z 团队 | A队 | 8 | (空) | (空) | (空) |
5 | A队 | B 队 | 9 | 8 | 7 | 5 |
6 | C 组 | 团队 D | 3 | 6 | (空) | (空) |
7 | C 组 | 团队 D | 1 | 3 | 6 | (空) |
8 | 团队 E | F 团队 | 2 | (空) | (空) | (空) |
相关文章