Optimal way to concatenate/aggregate strings

I'm looking for a way to aggregate strings from different rows into a single row. I want to do this in many different places, so having a function to facilitate it would be nice. I've tried solutions using COALESCE and FOR XML, but they just don't cut it for me.

String aggregation would do something like this:

id | Name                    Result: id | Names
-- - ----                            -- - -----
1  | Matt                            1  | Matt, Rocks
1  | Rocks                           2  | Stylus
2  | Stylus

I've taken a look at CLR-defined aggregate functions as a replacement for COALESCE and FOR XML, but apparently SQL Azure does not support CLR-defined objects, which is a pain for me because I know that being able to use them would solve a whole lot of problems.

Is there any possible workaround, or a similarly optimal method (it might not be as optimal as CLR, but hey, I'll take what I can get), that I can use to aggregate my strings?

Recommended answer

Solution

The definition of optimal can vary, but here's how to concatenate strings from different rows using regular Transact-SQL, which should work fine in Azure.

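;WITH Partitioned AS
(
    -- Number the names within each ID and record how many names each ID has.
    SELECT 
        ID,
        Name,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
        COUNT(*) OVER (PARTITION BY ID) AS NameCount
    FROM dbo.SourceTable
),
Concatenated AS
(
    -- Anchor member: start each ID's list with its first name.
    -- nvarchar(max) is spelled out because CAST(... AS nvarchar) without a
    -- length defaults to nvarchar(30) and would truncate longer lists.
    SELECT 
        ID, 
        CAST(Name AS nvarchar(max)) AS FullName, 
        Name, 
        NameNumber, 
        NameCount 
    FROM Partitioned 
    WHERE NameNumber = 1

    UNION ALL

    -- Recursive member: append the next name (by NameNumber) for the same ID.
    SELECT 
        P.ID, 
        CAST(C.FullName + ', ' + P.Name AS nvarchar(max)), 
        P.Name, 
        P.NameNumber, 
        P.NameCount
    FROM Partitioned AS P
        INNER JOIN Concatenated AS C 
                ON P.ID = C.ID 
                AND P.NameNumber = C.NameNumber + 1
)
-- Keep only the row in which every name of the ID has been appended.
-- The default recursion limit is 100; for IDs with more names than that,
-- append OPTION (MAXRECURSION 0) to this statement.
SELECT 
    ID,
    FullName
FROM Concatenated
WHERE NameNumber = NameCount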

Explanation

The approach boils down to three steps:

  1. Number the rows using OVER and PARTITION BY, grouping and ordering them as needed for the concatenation. The result is the Partitioned CTE. We also keep the count of rows in each partition so that the results can be filtered later.

  2. Using a recursive CTE (Concatenated), iterate through the row numbers (the NameNumber column), appending each Name value to the FullName column.

  3. Filter out all results except the one with the highest NameNumber in each partition.
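
To make the recursion easier to follow, here is a minimal sketch that runs the same two CTEs but skips the final filter, so every intermediate row is visible. It is only an illustration: the inline Source CTE is a hypothetical stand-in for dbo.SourceTable, and the cast to nvarchar(max) is my assumption to avoid the 30-character default of a length-less nvarchar cast.

```sql
-- Self-contained sketch: the sample rows come from an inline VALUES list
-- (a hypothetical stand-in for dbo.SourceTable), so no table has to exist.
;WITH Source AS
(
    SELECT ID, Name
    FROM (VALUES (3, N'Foo'), (3, N'Bar'), (3, N'Baz')) AS V (ID, Name)
),
Partitioned AS
(
    -- Step 1: number the rows per ID and keep the per-ID row count.
    SELECT
        ID,
        Name,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
        COUNT(*) OVER (PARTITION BY ID) AS NameCount
    FROM Source
),
Concatenated AS
(
    -- Step 2, anchor member: the first name of each ID.
    SELECT ID, CAST(Name AS nvarchar(max)) AS FullName, NameNumber, NameCount
    FROM Partitioned
    WHERE NameNumber = 1

    UNION ALL

    -- Step 2, recursive member: append the name with the next row number.
    SELECT P.ID, CAST(C.FullName + N', ' + P.Name AS nvarchar(max)), P.NameNumber, P.NameCount
    FROM Partitioned AS P
        INNER JOIN Concatenated AS C
                ON P.ID = C.ID
                AND P.NameNumber = C.NameNumber + 1
)
-- Step 3 is deliberately left out so that every intermediate row is returned.
SELECT ID, NameNumber, NameCount, FullName
FROM Concatenated
ORDER BY ID, NameNumber;

-- Rows returned:
-- ID  NameNumber  NameCount  FullName
-- 3   1           3          Bar
-- 3   2           3          Bar, Baz
-- 3   3           3          Bar, Baz, Foo
```

Only the last row satisfies NameNumber = NameCount, which is why the full query keeps exactly one row per ID.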

Please keep in mind that, to make this query predictable, one has to define both the grouping (in your scenario, rows with the same ID are concatenated) and the sorting (I assumed that you simply want the strings sorted alphabetically before concatenation).
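
As a small illustration of why the ordering matters, the sketch below numbers the same three names in two different ways. The inline VALUES list and the column aliases AlphabeticalNumber and ReverseNumber are hypothetical; only the ORDER BY inside OVER differs between the two columns.

```sql
-- The row number decides the position of each name in the concatenated list,
-- so changing ORDER BY inside OVER changes the final string.
SELECT
    ID,
    Name,
    ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name)      AS AlphabeticalNumber,
    ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name DESC) AS ReverseNumber
FROM (VALUES (3, N'Foo'), (3, N'Bar'), (3, N'Baz')) AS V (ID, Name)
ORDER BY ID, AlphabeticalNumber;
```

Concatenating by AlphabeticalNumber would give "Bar, Baz, Foo", while concatenating by ReverseNumber would give "Foo, Baz, Bar".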

I've quickly tested the solution on SQL Server 2012 with the following data:

INSERT dbo.SourceTable (ID, Name)
VALUES 
(1, 'Matt'),
(1, 'Rocks'),
(2, 'Stylus'),
(3, 'Foo'),
(3, 'Bar'),
(3, 'Baz')
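
The INSERT above assumes that dbo.SourceTable already exists. The answer does not show its definition, so the following is only a guess at a minimal table that would accept this data; the column types are my assumption.

```sql
-- Hypothetical definition: the column types are assumptions,
-- not taken from the original answer.
CREATE TABLE dbo.SourceTable
(
    ID   int          NOT NULL,
    Name nvarchar(50) NOT NULL
);
```

Run it once before the INSERT, and drop the table afterwards if it was created only for this test.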

The query result:

ID          FullName
----------- ------------------------------
2           Stylus
3           Bar, Baz, Foo
1           Matt, Rocks
