在 SQL Server 中将行数据转换为列
今天我的同事要求我将数据从垂直暂存表转换为水平表.我的意思是将行转换为列.我使用 PIVOT 并解决了它.但是如果数据字段重复出现,我会遇到移动数据的麻烦.
Today I was asked by my colleague to transform data from a vertical staging table into a horizontal table. I mean transform rows to column. I used PIVOT and resolved it. But got into situation where I am getting trouble to move data if the data field repeats itself.
这是我正在处理的测试数据:
Here is the test data that I am working on:
CREATE TABLE STAGING
(
ENTITYID INT,
PROPERTYNAME VARCHAR(25),
PROPERTYVALUE VARCHAR(25)
)
INSERT INTO STAGING VALUES (1, 'NAME', 'DONNA')
INSERT INTO STAGING VALUES (1, 'SPOUSE', 'HENRY')
INSERT INTO STAGING VALUES (1, 'CHILD', 'JACK')
INSERT INTO STAGING VALUES (2, 'CHILD', 'KAYALA')
我使用 PIVOT 将行数据显示为列:
I used PIVOT to show row data as columns:
SELECT * FROM
(SELECT ENTITYID, PROPERTYNAME, PROPERTYVALUE FROM STAGING) AS T
PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN (NAME, SPOUSE, CHILD)) AS T2
输出是:
ENTITYID NAME SPOUSE CHILD
1 DONNA HENRY JACK
2 NULL NULL KAYALA
但他希望输出类似于:
ENTITYID NAME SPOUSE CHILD CHILD
1 DONNA HENRY JACK KAYALA
底线是可以有多个 CHILD 属性进入临时表.我们需要考虑这一点,并将所有 CHILDREN 移到列中.
bottom line is that there can be more than one CHILD attribute coming into the staging table. And we need to consider this and move all the CHILDREN to columns.
这可能吗?
推荐答案
您可以在属性名中添加一个行号,这样您就可以为所欲为:
You can add a row number to the propertyname that will allow you to do what you want:
SELECT * FROM
(
SELECT ENTITYID
, PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5))
,PROPERTYVALUE
FROM #STAGING
) AS T
PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN (NAME1, SPOUSE1, CHILD1, CHILD2, CHILD3, CHILD4, CHILD5)) AS T2
我在这里假设 ENTITYID 将孩子与父母联系起来,即同一个人的所有孩子的 ENTITYID 为 1,但您的示例显示 Kayala 为 2.
I'm assuming here that the ENTITYID ties the children to the parent, ie all children for the same person have ENTITYID of 1, but your example shows a 2 for Kayala.
这是一个演示:SQL Fiddle
如果你只想要 CHILD 字段的数字,你可以这样写:
If you only want the numbers for the CHILD fields you could put this:
PROPERTYNAME = CASE WHEN PROPERTYNAME LIKE '%CHILD%' THEN PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5)) ELSE PROPERTYNAME END
然后从您的 IN() 语句的其他字段中删除该数字.
Then remove the number from the other fields in your IN() statement.
额外问题 - 动态执行上述操作:我们不想假设人们只有一个配偶或 2.3 个孩子,所以我们动态地做这件事:
Bonus Question- Do the above dynamically: We don't want to assume that people only have one spouse or 2.3 children, so we do the whole bit dynamically:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
SELECT @cols = STUFF((SELECT ',' + PROPERTYNAME
FROM (SELECT DISTINCT PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5))
FROM STAGING )sub
ORDER BY CASE WHEN PROPERTYNAME LIKE '%NAME%' THEN 1
WHEN PROPERTYNAME LIKE '%SPOUSE%' THEN 2
WHEN PROPERTYNAME LIKE '%CHILD%' THEN 3
ELSE 4
END
,RIGHT(PROPERTYNAME,1)
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
SET @query = 'SELECT * FROM
(
SELECT ENTITYID, PROPERTYNAME = PROPERTYNAME + CAST(ROW_NUMBER() OVER(PARTITION BY ENTITYID, PROPERTYNAME ORDER BY PROPERTYVALUE) AS VARCHAR(5)),PROPERTYVALUE
FROM STAGING
) AS T
PIVOT (MAX(PROPERTYVALUE) FOR PROPERTYNAME IN ('+@cols+')) AS T2
'
EXEC(@query)
注意:顺序只适用于配偶 1-9 和孩子 1-9,您可以调整以适应,但无论如何都是任意的.
Note: The ordering will only work for spouses 1-9 and children 1-9, you can adjust that to suit, but it's arbitrary anyway.
相关文章