是否可以使用单个 UPDATE SQL 语句执行多个更新?

假设我有一个表 tbl,其中包含 id 和 title 列.我需要更改标题列的所有值:

Let's say I have a table tbl with columns id and title. I need to change all values of title column:

  1. 从a-1"到a1",
  2. 从a.1"到a1",
  3. 从b-1"到b1",
  4. 从b.1"到b1".

现在,我正在执行两个 UPDATE 语句:

Right now, I'm performing two UPDATE statements:

UPDATE tbl SET title='a1' WHERE title IN ('a-1', 'a.1')
UPDATE tbl SET title='b1' WHERE title IN ('b-1', 'b.1')

这根本不是问题,如果表很小,并且单个语句在不到一秒的时间内完成,您只需要执行几条语句.

This isn't at all a problem, if the table is small, and the single statement completes in less than a second and you only need a few statements to execute.

您可能是客套话了 - 我有一个巨大的表要处理(一个语句在大约 90 秒内完成),而且我有大量的更新要执行.

You probably guested it - I have a huge table to deal with (one statement completes in about 90 seconds), and I have a huge number of updates to perform.

那么,是否可以合并更新以便它只扫描表一次?或者,在这种情况下,有更好的处理方法.

So, is it possible to merge the updates so it would only scan the table once? Or perhaps, there's a better way to deal with in a situation like this.

请注意,我正在使用的真实数据以及我必须执行的数据更改并不是那么简单 - 字符串更长并且它们不遵循任何模式(它是用户数据,所以不能做任何假设 - 它可以是任何东西).

Note, that the real data I'm working with and the changes to the data I have to perform are not really that simple - the strings are longer and they don't follow any pattern (it is user data, so no assumptions can be made - it can be anything).

推荐答案

在更一般的情况下,可能有数百个映射到每个新值,您将创建一个单独的旧值和新值表,然后在 UPDATE 语句中使用它.在 SQL 的一种方言中:

In a more general case, where there could be many hundreds of mappings to each of the new values, you would create a separate table of the old and new values, and then use that in the UPDATE statement. In one dialect of SQL:

CREATE TEMP TABLE mapper (old_val CHAR(5) NOT NULL, new_val CHAR(5) NOT NULL);
...multiple inserts into mapper...
INSERT INTO mapper(old_val, new_val) VALUES('a.1', 'a1');
INSERT INTO mapper(old_val, new_val) VALUES('a-1', 'a1');
INSERT INTO mapper(old_val, new_val) VALUES('b.1', 'b1');
INSERT INTO mapper(old_val, new_val) VALUES('b-1', 'b1');
...etcetera...

UPDATE tbl
   SET title = (SELECT new_val FROM mapper WHERE old_val = tbl.title)
   WHERE title IN (SELECT old_val FROM mapper);

两个选择语句都至关重要.第一个是相关子查询(不一定快,但如果映射表有数千行,则比大多数替代查询快),它将新值从对应于旧值的映射表中拉出.第二个确保只有那些在映射表中有值的行被修改;这是至关重要的,否则,对于那些没有映射条目的行(以及那些在您开始之前正常的记录),标题将设置为 null.

Both select statements are crucial. The first is a correlated sub-query (not necessarily fast, but faster than most of the alternatives if the mapper table has thousands of rows) that pulls the new value out of the mapping table that corresponds to the old value. The second ensures that only those rows which have a value in the mapping table are modified; this is crucial as otherwise, the title will be set to null for those rows without a mapping entry (and those were the records that were OK before you started out).

对于一些替代方案,CASE 操作是可以的.但是,如果您有数百、数千或数百万个映射要执行,那么您很可能会超出 DBMS 中 SQL 语句长度的限制.

For a few alternatives, the CASE operations are OK. But if you have hundreds or thousands or millions of mappings to perform, then you are likely to exceed the limits of the SQL statement length in your DBMS.

相关文章