Get a substring from a column and perform GROUP BY and COUNT

2022-01-22 00:00:00 pivot sql group-by sql-server substring

I have a table that stores data about a large number of files: language, unique ID, file path, and so on. The first two letters of each unique ID give the asset type, and I want to extract that substring. I then want to group these asset types by language and count how many of each type every language has. Ideally, the result is a table with a language column followed by one column for each substring (asset type).

I tried writing a large switch statement, but it isn't very reliable, and I was told LINQ might be better. I don't have much experience with LINQ or SQL. I have a couple of SQL queries that each get me one part of the desired result, and I was hoping someone with more experience might know how to combine them into one statement.

SELECT 
  LCID,
  SUBSTRING(AssetID,1,2)  
FROM [table]

This gets me the correct substrings, but I get multiple rows for each language. Is there any way to group the same languages into one column and then count how many of each type there are? Thanks.

Recommended answer

Sounds like you want a COUNT and a GROUP BY:

SELECT 
  SUBSTRING(AssetID,1,2), 
  COUNT(*) Total
FROM [table]
GROUP BY SUBSTRING(AssetID,1,2)

You did not specify which database you are using, but in SQL Server, if LCID is in your SELECT list, you will also need to include it in your GROUP BY clause.

If the LCID value is unique for each row, grouping by it puts every row in its own group, so you will get multiple records for each AssetID prefix. For that reason, I removed LCID from the query above.

If it is not unique, then you can use:

SELECT LCID, 
  SUBSTRING(AssetID,1,2), 
  COUNT(*) Total
FROM [table]
GROUP BY LCID, SUBSTRING(AssetID,1,2)
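
To make the shape of that result concrete, here is a small sketch run against made-up data. The table variable @Files and its rows are assumptions for illustration only, not the asker's real table, and LEFT(AssetID, 2) is used as an equivalent of SUBSTRING(AssetID,1,2).

```sql
-- Hypothetical sample data; names and values are invented for illustration.
DECLARE @Files TABLE (LCID INT, AssetID VARCHAR(20));

INSERT INTO @Files (LCID, AssetID) VALUES
  (1033, 'HA0001'),
  (1033, 'HA0002'),
  (1033, 'HT0001'),
  (1036, 'HA0003'),
  (1036, 'FX0001');

-- One row per (language, asset type) pair, with its count.
SELECT LCID,
  LEFT(AssetID, 2) AS AssetType,  -- first two characters of the ID
  COUNT(*) AS Total
FROM @Files
GROUP BY LCID, LEFT(AssetID, 2)
ORDER BY LCID, AssetType;

-- Result:
-- LCID  AssetType  Total
-- 1033  HA         2
-- 1033  HT         1
-- 1036  FX         1
-- 1036  HA         1
```

Each language still spans several rows here, one per asset type, which is why a pivot is needed to get one column per type.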

Based on the edits that you made, you want a PIVOT, which transforms the data from rows into columns. For a PIVOT you will use:

select LCID, HA, HT, HP, FH, FX
from
(
  SELECT LCID, 
    SUBSTRING(AssetID,1,2) AssetID
  FROM [table]
) src
pivot
(
  count(AssetID)
  for AssetID in (HA, HT, HP, FH, FX) -- place more values here
) piv
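
If PIVOT is not available or feels opaque, conditional aggregation is a common alternative that produces the same layout when the asset types are known in advance. The sketch below is not part of the original answer; it assumes the same hypothetical @Files sample data and only three asset types.

```sql
-- Hypothetical sample data; names and values are invented for illustration.
DECLARE @Files TABLE (LCID INT, AssetID VARCHAR(20));

INSERT INTO @Files (LCID, AssetID) VALUES
  (1033, 'HA0001'),
  (1033, 'HA0002'),
  (1033, 'HT0001'),
  (1036, 'HA0003'),
  (1036, 'FX0001');

-- One row per language; each SUM counts the rows whose prefix matches.
SELECT LCID,
  SUM(CASE WHEN LEFT(AssetID, 2) = 'HA' THEN 1 ELSE 0 END) AS HA,
  SUM(CASE WHEN LEFT(AssetID, 2) = 'HT' THEN 1 ELSE 0 END) AS HT,
  SUM(CASE WHEN LEFT(AssetID, 2) = 'FX' THEN 1 ELSE 0 END) AS FX
FROM @Files
GROUP BY LCID
ORDER BY LCID;

-- Result:
-- LCID  HA  HT  FX
-- 1033  2   1   0
-- 1036  1   0   1
```

This form also works in databases that have no PIVOT operator, at the cost of one CASE expression per asset type.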

If the values that you want to transform into columns are unknown, you will need to use dynamic SQL similar to this:

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(SUBSTRING(AssetID,1,2)) 
                    from [table]
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT LCID, ' + @cols + ' from 
             (
                SELECT LCID, 
                  SUBSTRING(AssetID,1,2) AssetID
                FROM [table]
            ) x
            pivot 
            (
                count(AssetID)
                for AssetID in (' + @cols + ')
            ) p '

execute(@query)
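
On SQL Server 2017 or later, the column list can be built with STRING_AGG instead of the STUFF / FOR XML PATH idiom. The following is a sketch under that version assumption, not the original answer's code. The temp table #Files and its rows are hypothetical; a temp table is used rather than a table variable because the dynamic statement runs in its own scope.

```sql
-- Hypothetical sample data; names and values are invented for illustration.
CREATE TABLE #Files (LCID INT, AssetID VARCHAR(20));

INSERT INTO #Files (LCID, AssetID) VALUES
  (1033, 'HA0001'),
  (1033, 'HA0002'),
  (1033, 'HT0001'),
  (1036, 'HA0003'),
  (1036, 'FX0001');

DECLARE @cols NVARCHAR(MAX), @query NVARCHAR(MAX);

-- Build a list such as [FX],[HA],[HT] from the distinct prefixes.
-- QUOTENAME brackets each value so it is safe to use as a column name.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(AssetType) AS NVARCHAR(MAX)), N',')
               WITHIN GROUP (ORDER BY AssetType)
FROM (SELECT DISTINCT LEFT(AssetID, 2) AS AssetType FROM #Files) AS t;

-- AssetID is the counted value, AssetType supplies the column names,
-- and LCID is the remaining column that the pivot groups by.
SET @query = N'
SELECT LCID, ' + @cols + N'
FROM (
  SELECT LCID, AssetID, LEFT(AssetID, 2) AS AssetType
  FROM #Files
) AS src
PIVOT (
  COUNT(AssetID) FOR AssetType IN (' + @cols + N')
) AS piv
ORDER BY LCID;';

EXEC sys.sp_executesql @query;

DROP TABLE #Files;
```

Keeping the counted column (AssetID) separate from the pivoted column (AssetType) makes it easier to see which column plays which role in the PIVOT clause.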
