SQL 返回一组键列中非键列的唯一组合

2021-09-10 00:00:00 sql tsql sql-server

在 SQL Server 中,我有一个目标表 STAGNG_PA_BK_FEED_REVNU_SUM,它有以下 7 列构成其唯一索引.我想使用主要来自 STAGNG_PA_BK_FEED_REVNU_DTL 表的数据填充此表.目标表中的数据包含一个金额字段 REVNU_AMT,该字段通过对所有其他选定/非聚合列进行分组来聚合.

In SQL Server, I have a target table STAGNG_PA_BK_FEED_REVNU_SUM, that has the following 7 columns that make up its unique index. I want to populate this table using data primarily from the STAGNG_PA_BK_FEED_REVNU_DTL table. The data in the target table includes an amount field, REVNU_AMT, that is aggregated by grouping on all of the other selected/non aggregated columns.

因为我们正在选择添加属性"不包含在目标表的唯一键中的列,如果我们为键分组内的属性列获得多个不同的值组合,则插入将失败.发生这种情况时,我希望能够识别共享相同主键但具有不同属性值组合的所有源记录.换句话说,我希望能够生成完整源记录的报告,以便业务用户可以识别插入时导致唯一键冲突的违规记录.

Because we are selecting addition "property" columns that are not included in the unique key of the target table, the insert will fail if we get more than one distinct combination of values for the property columns within a key grouping. When this happens, I want to be able to identify the all of the source records that share the same primary key but have different property value combinations. In other words, I want to be able to produce a report of the complete source records so that the business users can identify the offending records that cause the unique key violation when inserting.

INSERT INTO dbo.STAGNG_PA_BK_FEED_REVNU_SUM
    (
    --Keys of target table
    RBT_YR_DT
  , MLR_SRC_SYS_CD
  , LGL_ENTTY_CD
  , CLIENT_ID
  , CLIENT_ACCT_NUM
  , BEN_PLAN_ID
  , CLIENT_CNTRCT_ST_CD

    --Properties of target table
  , MLR_EXTRT_SYS_CD
  , PA_LGL_ENTTY_CD
  , COA_CO_CD
  , COMMRCL_BUS_IND
  , COA_SITUS_ST_CD
  , ASGND_SITUS_STE_IND
  , CLIENT_TY_CD
  , MLR_SEG_CD
  
  --Fact
  
  , REVNU_AMT
    
    )
SELECT
    --values for keys of target table
    D.RBT_YR_DT
  , D.MLR_SRC_SYS_CD
  , D.LGL_ENTTY_CD
  , D.CLIENT_ID
  , D.CLIENT_ACCT_NUM
  , D.BEN_PLAN_ID
  , J.SITUS_STE_CD   AS CLIENT_CNTRCT_ST_CD

    --values for properties of target table
  , D.MLR_EXTRT_SYS_CD
  , D.PA_LGL_ENTTY_CD
  , D.COA_CO_CD
  , D.COMMRCL_BUS_IND
  , D.COA_SITUS_ST_CD
  , 'N'              AS ASGND_SITUS_STE_IND   
  , D.CLIENT_TY_CD
  , D.MLR_SEG_CD

  --Fact
  ,SUM(D.REVNU_AMT) AS REVNU_AMT
  
FROM
    dbo.STAGNG_PA_BK_FEED_REVNU_DTL D
        INNER JOIN JE_NT_STE_MAP J
            ON D.COA_SITUS_ST_CD = J.CONTRACT_SITUS_STATE
GROUP BY
    --PK
    D.RBT_YR_DT
  , D.MLR_SRC_SYS_CD
  , D.LGL_ENTTY_CD
  , D.CLIENT_ID
  , D.CLIENT_ACCT_NUM
  , D.BEN_PLAN_ID
  , J.SITUS_STE_CD

    --Properties
    -- Must be unique distinct group of value within the key grouping else key violation on target will result
  , D.MLR_EXTRT_SYS_CD
  , D.PA_LGL_ENTTY_CD
  , D.COA_CO_CD
  , D.COMMRCL_BUS_IND
  , D.COA_SITUS_ST_CD
  , D.CLIENT_TY_CD
  , D.MLR_SEG_CD

更新

我之前考虑过使用 Jaime 的方法将所有属性值连接成一个值,以便在 HAVING 子句中应用 DISTINCT 动词.

Update

I previously considered using Jaime's approach of concatenating all of the property values into a single value so the DISTINCT verb could be applied in the HAVING clause.

根据他的回答,我将其修改为显示报告源表中需要检查和更正的所有记录,以防止重复键违规,这是我的目标,而不仅仅是显示重复的非键价值.

Taking his answer, I modified it to display a report all of the records in the source table that need to be examined and corrected in order to prevent the duplicate key violation, which was my goal, not just display the duplicated non key values.

WITH DUPS AS
(
    SELECT
        --values for keys of target table
        D.RBT_YR_DT
      , D.MLR_SRC_SYS_CD
      , D.LGL_ENTTY_CD
      , D.CLIENT_ID
      , D.CLIENT_ACCT_NUM
      , D.BEN_PLAN_ID
      , J.SITUS_STE_CD              AS CLIENT_CNTRCT_ST_CD
    FROM
        dbo.STAGNG_PA_BK_FEED_REVNU_DTL D
            INNER JOIN JE_NT_STE_MAP J
                ON D.COA_SITUS_ST_CD = J.CONTRACT_SITUS_STATE
    GROUP BY
        --PK
        D.RBT_YR_DT
      , D.MLR_SRC_SYS_CD
      , D.LGL_ENTTY_CD
      , D.CLIENT_ID
      , D.CLIENT_ACCT_NUM
      , D.BEN_PLAN_ID
      , J.SITUS_STE_CD
    HAVING
        COUNT(DISTINCT
        CONVERT(VARCHAR(MAX), D.MLR_EXTRT_SYS_CD) + '-' +
        CONVERT(VARCHAR(MAX), D.PA_LGL_ENTTY_CD) + '-' +
        CONVERT(VARCHAR(MAX), D.COA_CO_CD) + '-' +
        CONVERT(VARCHAR(MAX), D.COMMRCL_BUS_IND) + '-' +
        CONVERT(VARCHAR(MAX), D.COA_SITUS_ST_CD) + '-' +
        CONVERT(VARCHAR(MAX), D.CLIENT_TY_CD) + '-' +
        CONVERT(VARCHAR(MAX), D.MLR_SEG_CD)
        ) > 1
)
SELECT
        --Keys
        D.RBT_YR_DT
      , D.MLR_SRC_SYS_CD
      , D.LGL_ENTTY_CD
      , D.CLIENT_ID
      , D.CLIENT_ACCT_NUM
      , D.BEN_PLAN_ID
      , D.COA_SITUS_ST_CD

      --Properties that have dups with a key groups
      , D.MLR_EXTRT_SYS_CD
      , D.PA_LGL_ENTTY_CD
      , D.COA_CO_CD
      , D.COMMRCL_BUS_IND
      , D.COA_SITUS_ST_CD
--      , 'N'              AS ASGND_SITUS_STE_IND   
      , D.CLIENT_TY_CD
      , D.MLR_SEG_CD

FROM
    STAGNG_PA_BK_FEED_REVNU_DTL D
        INNER JOIN JE_NT_STE_MAP J
                ON D.COA_SITUS_ST_CD = J.CONTRACT_SITUS_STATE
    inner join DUPS ON 

        DUPS.RBT_YR_DT           = D.RBT_YR_DT
    AND DUPS.MLR_SRC_SYS_CD      = D.MLR_SRC_SYS_CD
    AND DUPS.LGL_ENTTY_CD        = D.LGL_ENTTY_CD
    AND DUPS.CLIENT_ID           = D.CLIENT_ID
    AND DUPS.CLIENT_ACCT_NUM     = D.CLIENT_ACCT_NUM
    AND DUPS.BEN_PLAN_ID         = D.BEN_PLAN_ID
    AND DUPS.CLIENT_CNTRCT_ST_CD = J.SITUS_STE_CD

order by 1,2,3,4,5,6,7,8,9,10,11,12,13,14

虽然这可能在实践中起作用,但我发现它有点笨拙,最后我可能会显示具有重复的非键的记录,该非键出现了另一个键组但只出现过一次,所以最终报告可能不准确.

While this may work in practice I find it a little kludgy and in the end I may be displaying records that had a duplicate non-key that appeared another key grouping but only appeared there once, so the end report may not be accurate.

推荐答案

这样的事情可以帮助您找到具有 1 个以上不同属性组合的 PK

Something like this could help you to find those PK with more than 1 different properties combination

SELECT
   --values for keys of target table
   D.RBT_YR_DT
  ,D.MLR_SRC_SYS_CD
  ,D.LGL_ENTTY_CD
  ,D.CLIENT_ID
  ,D.CLIENT_ACCT_NUM
  ,D.BEN_PLAN_ID
  ,J.SITUS_STE_CD   AS CLIENT_CNTRCT_ST_CD
FROM dbo.STAGNG_PA_BK_FEED_REVNU_DTL D
  INNER JOIN JE_NT_STE_MAP J
    ON D.COA_SITUS_ST_CD = J.CONTRACT_SITUS_STATE
GROUP BY
   --PK
   D.RBT_YR_DT
  ,D.MLR_SRC_SYS_CD
  ,D.LGL_ENTTY_CD
  ,D.CLIENT_ID
  ,D.CLIENT_ACCT_NUM
  ,D.BEN_PLAN_ID
  ,J.SITUS_STE_CD
HAVING
    COUNT(DISTINCT
      CONVERT(VARCHAR(MAX),D.MLR_EXTRT_SYS_CD) + '-' +
      CONVERT(VARCHAR(MAX),D.PA_LGL_ENTTY_CD) + '-' +
      CONVERT(VARCHAR(MAX),D.COA_CO_CD) + '-' +
      CONVERT(VARCHAR(MAX),D.COMMRCL_BUS_IND) + '-' +
      CONVERT(VARCHAR(MAX),D.COA_SITUS_ST_CD) + '-' +
      CONVERT(VARCHAR(MAX),'N' AS ASGND_SITUS_STE_IND) + '-' +
      CONVERT(VARCHAR(MAX),D.CLIENT_TY_CD) + '-' +
      CONVERT(VARCHAR(MAX),D.MLR_SEG_CD)
    ) > 1

相关文章