Get the top 1 row of each group

I have a table from which I want to get the latest entry for each group. Here's the table:

DocumentStatusLogs table

|ID| DocumentID | Status | DateCreated |
| 2| 1          | S1     | 7/29/2011   |
| 3| 1          | S2     | 7/30/2011   |
| 6| 1          | S1     | 8/02/2011   |
| 1| 2          | S1     | 7/28/2011   |
| 4| 2          | S2     | 7/30/2011   |
| 5| 2          | S3     | 8/01/2011   |
| 6| 3          | S1     | 8/02/2011   |

The table will be grouped by DocumentID and sorted by DateCreated in descending order. For each DocumentID, I want to get the latest status.

My preferred output:

| DocumentID | Status | DateCreated |
| 1          | S1     | 8/02/2011   |
| 2          | S3     | 8/01/2011   |
| 3          | S1     | 8/02/2011   |

  • Is there an aggregate function that returns only the top row of each group? See the pseudocode GetOnlyTheTop below:

    SELECT
      DocumentID,
      GetOnlyTheTop(Status),
      GetOnlyTheTop(DateCreated)
    FROM DocumentStatusLogs
    GROUP BY DocumentID
    ORDER BY DateCreated DESC
    

  • If no such function exists, is there any way I can achieve the output I want?

    Please see the parent table for more information:

    Current Documents table

    | DocumentID | Title  | Content  | DateCreated |
    | 1          | TitleA | ...      | ...         |
    | 2          | TitleB | ...      | ...         |
    | 3          | TitleC | ...      | ...         |
    

    Should the parent table be like this so that I can easily access its status?

    | DocumentID | Title  | Content  | DateCreated | CurrentStatus |
    | 1          | TitleA | ...      | ...         | s1            |
    | 2          | TitleB | ...      | ...         | s3            |
    | 3          | TitleC | ...      | ...         | s1            |
    

    UPDATE: I just learned how to use "apply", which makes it easier to address such problems.
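
    For reference, here is a minimal sketch of what an APPLY-based query for this problem could look like. It assumes SQL Server, and it assumes the parent table is named Documents (a hypothetical name; substitute the real one). The ID tie-breaker is also an assumption, namely that a higher ID means a later entry.

    ```sql
    -- Sketch only: Documents is an assumed name for the parent table.
    SELECT d.DocumentID,
           latest.Status,
           latest.DateCreated
    FROM Documents AS d
    CROSS APPLY
    (
        -- For each parent row, fetch its single newest log row
        SELECT TOP (1) l.Status, l.DateCreated
        FROM DocumentStatusLogs AS l
        WHERE l.DocumentID = d.DocumentID
        ORDER BY l.DateCreated DESC, l.ID DESC  -- assumed tie-breaker
    ) AS latest
    ```

    CROSS APPLY drops documents that have no log rows at all; OUTER APPLY would keep them with NULL status columns.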

    Recommended answer

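    ;WITH cte AS
    (
       -- Number the rows within each DocumentID, newest DateCreated first
       SELECT *,
             ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
       FROM DocumentStatusLogs
    )
    -- Keep only the newest row of each group
    SELECT *
    FROM cte
    WHERE rn = 1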
    

    If you expect 2 entries per day, then this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead.
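
    As a sketch of the DENSE_RANK variant described above: rows that tie on the latest DateCreated all receive rank 1, so every one of them is returned. The column alias rnk is just an illustrative name.

    ```sql
    ;WITH cte AS
    (
       -- Tied rows share the same rank, so ties on the newest date all get 1
       SELECT *,
             DENSE_RANK() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rnk
       FROM DocumentStatusLogs
    )
    SELECT *
    FROM cte
    WHERE rnk = 1
    ```

    If you would rather keep exactly one row per document but make the choice deterministic, one option is to stay with ROW_NUMBER and add a tie-breaker to its ORDER BY, for example ORDER BY DateCreated DESC, ID DESC. That assumes a higher ID means a later entry, which the table shown here does not guarantee.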

    As for normalised or not, it depends on whether you want to:

    • maintain status in 2 places
    • preserve status history
    • ...

    As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or you could drop this status history table.
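
    A minimal sketch of such a trigger, assuming SQL Server, a parent table named Documents with a CurrentStatus column (both names are assumptions based on the question's proposed layout), and the hypothetical trigger name below. It handles inserts only; updates and deletes on the log table would need similar handling.

    ```sql
    -- Sketch only: Documents, CurrentStatus and the trigger name are assumed.
    CREATE TRIGGER trg_DocumentStatusLogs_AfterInsert
    ON DocumentStatusLogs
    AFTER INSERT
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Refresh the parent's status for every document touched by this insert
        UPDATE d
        SET CurrentStatus = latest.Status
        FROM Documents AS d
        CROSS APPLY
        (
            SELECT TOP (1) l.Status
            FROM DocumentStatusLogs AS l
            WHERE l.DocumentID = d.DocumentID
            ORDER BY l.DateCreated DESC, l.ID DESC  -- assumed tie-breaker
        ) AS latest
        WHERE d.DocumentID IN (SELECT i.DocumentID FROM inserted AS i);
    END;
    ```

    Recomputing the latest row from the log table, rather than copying the inserted status directly, keeps the parent correct even when a multi-row insert or a back-dated entry arrives.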
