T-SQL: "Merge" two rows, or "Rekey" all FK relationships

2021-09-10 00:00:00 sql tsql sql-server

I have a production database where occasionally redundant rows in a single table need to be "Merged".

Let's assume that both rows in this table have identical values, except their IDs.

Table "PrimaryStuff"
ID | SomeValue
1  | "I have value"
2  | "I have value"
3  | "I am different"

Let's also assume that a number of related tables exist. Because duplicates were created in the "PrimaryStuff" table, rows are often created in these child tables that SHOULD all be related to a single record in the PrimaryStuff table. The number and names of these tables are not under my control and have to be discovered dynamically at runtime. I.e., I don't know the names or even the number of related records, as other people may edit the database without my knowledge.

Table "ForeignStuff"
ID | PrimaryStuffId | LocalValue
1  | 1              | "I have the correct FK"
2  | 1              | "I have the correct FK"
3  | 2              | "I should get pointed to an FK of 1"

To resolve the duplication of PrimaryStuff rows 1 and 2, I want every related table to change its FK values from 2 to 1, and then delete PrimaryStuff row 2. This SHOULD be trivial: if PrimaryStuff row 1 didn't exist, I could just update the primary key of row 2 to 1 and the change would cascade out. I can't do that, because it would create a duplicate key in PrimaryStuff's unique index.

Feel free to ask questions and I'll try to clear up anything that's confusing.

Recommended answer

First, let's get a list of the rows that need to be updated (as I understand it, you want the lowest ID to replace all the higher IDs):

SELECT MIN(ID) OVER (PARTITION BY SomeValue) AS FirstID,
       ID,
       SomeValue
FROM PrimaryStuff

We can drop the rows where FirstID and ID match, since those already point at the row we are keeping:

SELECT FirstID, ID
FROM
(
  SELECT MIN(ID) OVER (PARTITION BY SomeValue) AS FirstID,
         ID,
         SomeValue
  FROM PrimaryStuff
) T
WHERE FirstID != ID

Now we have a change list. We can use it in an UPDATE statement, either by putting it in a temp table or by wrapping it in a CTE, as I did below:

WITH ChangeList AS
(
  -- One row per duplicate: ID is the row to retire, FirstID is the row to keep.
  SELECT FirstID, ID
  FROM
  (
    SELECT MIN(ID) OVER (PARTITION BY SomeValue) AS FirstID,
           ID
    FROM PrimaryStuff
  ) T
  WHERE FirstID != ID
)
UPDATE ForeignStuff
SET PrimaryStuffId = ChangeList.FirstID
FROM ForeignStuff
-- Match on the foreign key column, not on ForeignStuff's own ID.
JOIN ChangeList ON ForeignStuff.PrimaryStuffId = ChangeList.ID
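
The question also asks for the duplicate PrimaryStuff rows to be deleted once nothing points at them. A minimal sketch of that step, assuming every child table has already been repointed (otherwise the foreign key constraints will reject the delete), could reuse the same change list:

```sql
-- Sketch: remove the retired duplicates after all child rows reference FirstID.
WITH ChangeList AS
(
  SELECT FirstID, ID
  FROM
  (
    SELECT MIN(ID) OVER (PARTITION BY SomeValue) AS FirstID,
           ID
    FROM PrimaryStuff
  ) T
  WHERE FirstID != ID
)
DELETE ps
FROM PrimaryStuff AS ps
JOIN ChangeList ON ps.ID = ChangeList.ID
```

For the sample data, the change list contains the single pair FirstID = 1, ID = 2, so only PrimaryStuff row 2 is removed. Running the update and the delete inside one transaction keeps the data consistent if either step fails.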

NB - Code not tested, might have typos.
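
The question says the child tables are not known in advance. One possible way to handle that, sketched here and equally untested, is to read the referencing columns from the catalog views and generate one UPDATE per foreign key. The sketch assumes SQL Server 2017 or later (for STRING_AGG), that the parent table lives in the dbo schema, and that every foreign key pointing at it is single-column. The #ChangeList temp table and the variable names are illustrative choices, not part of the original answer.

```sql
SET XACT_ABORT ON;   -- roll the whole transaction back on any error
BEGIN TRANSACTION;

-- Materialize the change list so the dynamic SQL below can see it.
SELECT FirstID, ID
INTO #ChangeList
FROM
(
  SELECT MIN(ID) OVER (PARTITION BY SomeValue) AS FirstID, ID
  FROM dbo.PrimaryStuff
) T
WHERE FirstID != ID;

-- Build one UPDATE per column that references dbo.PrimaryStuff.
DECLARE @sql nvarchar(max);

SELECT @sql = STRING_AGG(
         CAST(N'UPDATE c SET c.' + QUOTENAME(col.name) + N' = cl.FirstID FROM '
              + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name)
              + N' AS c JOIN #ChangeList AS cl ON c.' + QUOTENAME(col.name)
              + N' = cl.ID;' AS nvarchar(max)),
         N' ')
FROM sys.foreign_key_columns AS fkc
JOIN sys.tables  AS t   ON t.object_id = fkc.parent_object_id
JOIN sys.columns AS col ON col.object_id = fkc.parent_object_id
                       AND col.column_id = fkc.parent_column_id
WHERE fkc.referenced_object_id = OBJECT_ID(N'dbo.PrimaryStuff');

PRINT @sql;   -- review the generated statements before trusting them

IF @sql IS NOT NULL
  EXEC sys.sp_executesql @sql;

-- With every child repointed, the duplicates can be deleted.
DELETE ps
FROM dbo.PrimaryStuff AS ps
JOIN #ChangeList AS cl ON ps.ID = cl.ID;

COMMIT TRANSACTION;
DROP TABLE #ChangeList;
```

A child table that has its own unique constraint including the foreign key column could reject the generated UPDATE, so it is worth trying this against a copy of the database first.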