SQL 挑战/谜题:给定堆栈跟踪 - 如何在每个时间点找到顶部元素?

2021-12-28 00:00:00 sql hive teradata oracle sql-server
  • 我在现实生活中的用例是合并嵌套范围.我画了一些草图,然后我看到了堆栈 PUSH 和 POP 操作的开始和结束范围之间的相似之处.我明白解决这个问题也会解决原来的问题.
  • My real life use-case was to merge nested ranges. I've drew some sketches and then I saw the resemblance between ranges starting and ending to stack PUSH and POP operations. I understood that solving this problem will also solve the original problem.
  • op 列实际上可以从问题中删除.当 val 为 NULL 时,它是一个 POP 操作,否则它是一个 PUSH 操作.
  • The op column can actually be removed from the question. When val is NULL then it is a POP operation otherwise it is a PUSH operation.

一个表,stack_trace,包含以下列:

A table, stack_trace ,contains the following columns:

  • i - 表示时间点的整数值.
  • op - 2 种可能的操作:I(in"/push")和 O(out"/pop").
  • val - 由in"/push"操作插入的值,或者out"/pop"操作插入的值.

  • i - Integer value that represents a point in time.
  • op - 2 possible operations: I ("in"/"push") and O ("out"/"pop").
  • val - The value inserted by the "in"/"push" operation or NULL for "out"/"pop" operation.

目标是找出堆栈顶部的值,在每个时间点 (i).

The goal is to find what was the value at the top of the stack, at each point in time (i).

例如

(NULL 值在此处表示为空格)

(NULL values are represented here as empty spaces)

数据:

i   op  val 
--  --  --  
1   I   A   
2   I   B   
3   O
4   I   C
5   O    
6   O   

要求的结果:

i   top_of_stack_val
--  ----------------
1   A
2   B
3   A
4   C
5   A
6   

要求

  • 解决方案应该是单个 SQL 查询(子查询很好).
  • 只允许使用以下子句:SELECT、FROM、WHERE、GROUP BY、HAVING、ORDER BY.
  • 不允许使用WITH 子句(CTE - 公共表表达式).
  • 使用 T-SQL、PL/SQL 等.不允许.
  • 不允许使用 UDF(用户定义函数).
  • 不允许使用变量.
  • Requirements

    • The solution should be a single SQL query (sub-queries are fine).
    • Only the following clauses are allowed: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.
    • The use of WITH clause (CTE - Common Table Expression) is not allowed.
    • The use of T-SQL, PL/SQL etc. is not allowed.
    • The use of UDF (User Defined Functions) is not allowed.
    • The use of variables is not allowed.
    • create table stack_trace
      (
          i       int
         ,op      char(1)
         ,val     char(1)
      )
      ;
      
      insert into stack_trace (i,op,val) values (1,'I','A');
      insert into stack_trace (i,op,val) values (2,'I','B');
      insert into stack_trace (i,op,val) values (3,'I','C');
      insert into stack_trace (i,op,val) values (4,'I','D');
      insert into stack_trace (i,op,val) values (5,'I','E');
      insert into stack_trace (i,op)     values (6,'O');
      insert into stack_trace (i,op)     values (7,'O');
      insert into stack_trace (i,op)     values (8,'O');
      insert into stack_trace (i,op,val) values (9,'I','F');
      insert into stack_trace (i,op)     values (10,'O');
      insert into stack_trace (i,op,val) values (11,'I','G');
      insert into stack_trace (i,op,val) values (12,'I','H');
      insert into stack_trace (i,op)     values (13,'O');
      insert into stack_trace (i,op)     values (14,'O');
      insert into stack_trace (i,op,val) values (15,'I','I');
      insert into stack_trace (i,op,val) values (16,'I','J');
      insert into stack_trace (i,op,val) values (17,'I','K');
      insert into stack_trace (i,op,val) values (18,'I','L');
      insert into stack_trace (i,op,val) values (19,'I','M');
      insert into stack_trace (i,op)     values (20,'O');
      insert into stack_trace (i,op,val) values (21,'I','N');
      insert into stack_trace (i,op)     values (22,'O');
      insert into stack_trace (i,op,val) values (23,'I','O');
      insert into stack_trace (i,op)     values (24,'O');
      insert into stack_trace (i,op,val) values (25,'I','P');
      insert into stack_trace (i,op)     values (26,'O');
      insert into stack_trace (i,op)     values (27,'O');
      insert into stack_trace (i,op,val) values (28,'I','Q');
      insert into stack_trace (i,op,val) values (29,'I','R');
      insert into stack_trace (i,op)     values (30,'O');
      insert into stack_trace (i,op)     values (31,'O');
      insert into stack_trace (i,op)     values (32,'O');
      insert into stack_trace (i,op)     values (33,'O');
      insert into stack_trace (i,op)     values (34,'O');
      insert into stack_trace (i,op)     values (35,'O');
      insert into stack_trace (i,op,val) values (36,'I','S');
      insert into stack_trace (i,op)     values (37,'O');
      insert into stack_trace (i,op)     values (38,'O');
      insert into stack_trace (i,op,val) values (39,'I','T');
      insert into stack_trace (i,op,val) values (40,'I','U');
      insert into stack_trace (i,op)     values (41,'O');
      insert into stack_trace (i,op,val) values (42,'I','V');
      insert into stack_trace (i,op,val) values (43,'I','W');
      insert into stack_trace (i,op,val) values (44,'I','X');
      insert into stack_trace (i,op)     values (45,'O');
      insert into stack_trace (i,op)     values (46,'O');
      insert into stack_trace (i,op,val) values (47,'I','Y');
      insert into stack_trace (i,op)     values (48,'O');
      insert into stack_trace (i,op)     values (49,'O');
      insert into stack_trace (i,op,val) values (50,'I','Z');
      insert into stack_trace (i,op)     values (51,'O');
      insert into stack_trace (i,op)     values (52,'O');
      

      要求的结果

      i   top_of_stack_val
      --  ----------------
      1   A
      2   B
      3   C
      4   D
      5   E
      6   D
      7   C
      8   B
      9   F
      10  B
      11  G
      12  H
      13  G
      14  B
      15  I
      16  J
      17  K
      18  L
      19  M
      20  L
      21  N
      22  L
      23  O
      24  L
      25  P
      26  L
      27  K
      28  Q
      29  R
      30  Q
      31  K
      32  J
      33  I
      34  B
      35  A
      36  S
      37  A
      38  
      39  T
      40  U
      41  T
      42  V
      43  W
      44  X
      45  W
      46  V
      47  Y
      48  V
      49  T
      50  Z
      51  T
      52  
      

      推荐答案

      就我个人而言,我怀疑您最终会找到可以在所有 SQL Server、Teradata、Postgres 和 Oracle 中使用并且具有可接受的性能的 SQL,如果桌子很大.

      Personally I doubt that you will end up finding SQL that you can just use in all of SQL Server, Teradata, Postgres, and Oracle and that has acceptable performance if the table is at all large.

      SQL Server 解决方案(demo)如下

      A SQL Server solution (demo) would be as follows

      SELECT i,
             SUBSTRING(MAX(FORMAT(i, 'D10') + val) OVER (PARTITION BY Pos ORDER BY i 
                                                           ROWS UNBOUNDED PRECEDING), 11, 8000)
      FROM   (SELECT st.*,
                     sum(CASE WHEN op = 'I' THEN 1 ELSE -1 END) 
                        OVER (ORDER BY i ROWS UNBOUNDED PRECEDING) AS pos
              FROM   stack_trace st) t1
      ORDER  BY i; 
      

相关文章