Strange ordering in INSERT INTO SELECT with UNION

2021-09-10 00:00:00 union tsql sql-server-2008 sql-server

I had a typical non-normalized table (tempTable) with multiple numbered columns (rep1, rep2, ...), so I wrote a script to insert the non-normalized data into a normalized table (myTable):

insert into myTable
select idRep,rep FROM
(
    select idRep, ISNULL(rep1,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep2,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep3,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep4,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep5,'') as rep FROM tempTable
) as t

Note: The table myTable also contains an auto-incremented IDENTITY column as its PRIMARY KEY.

The order rep1, rep2, rep3, rep4, rep5 is important in my scenario. Strangely, when I executed the script, the data wasn't inserted in the correct order. For example, the auto-generated id '1000' had the value from 'rep3' and the id '1001' had the value from 'rep1'.

Why is that? How was the script executed?

Recommended answer

The reason the rows do not come out in the order you expect when using UNION is that UNION imposes uniqueness: it has to process all of those rows together to remove duplicates, and it returns them in whatever order is most convenient for the engine.
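As a small illustration of that duplicate removal, here is a sketch using made-up literal values (the aliases x, y, and v are hypothetical and not part of the question's schema). The UNION query returns two rows, while the UNION ALL query returns all four; neither query guarantees the order of its output without an ORDER BY.

```sql
-- UNION removes duplicates across both inputs: the result has 2 rows ('a' and 'b').
-- The order of those rows is not guaranteed.
SELECT v FROM (VALUES ('b'), ('a'), ('b')) AS x(v)
UNION
SELECT v FROM (VALUES ('a')) AS y(v);

-- UNION ALL keeps every row: the result has 4 rows ('b', 'a', 'b', 'a').
-- The order is still not guaranteed without an ORDER BY.
SELECT v FROM (VALUES ('b'), ('a'), ('b')) AS x(v)
UNION ALL
SELECT v FROM (VALUES ('a')) AS y(v);
```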

If you switch to UNION ALL (which does not try to impose uniqueness), as Parado suggested, the engine skips that processing and the rows will go into the table in the order you wrote them almost all the time. This is not guaranteed, however, and certain very unusual circumstances in other processes (especially ones that somehow touch your tempTable) can affect it.

If you use an ORDER BY, as Kash suggests, that will guarantee the order of the ids (which can matter), but not, technically, the order in which the rows get inserted (which very rarely matters in practice).

MSDN has a good summary of some of this.

So, that takes care of the why. As for how to get what you actually want, I would use Kash's suggestion of adding a column to use with an ORDER BY clause, but I would use UNION ALL instead of UNION. Using UNION is like adding an implicit "distinct" requirement, which takes up processor cycles and makes the query plan more complicated.
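A minimal sketch of that combination might look like the following. It rests on assumptions not stated in the question: repNo is a hypothetical helper column introduced only for sorting, and the column list (idRep, rep) is a guess at the non-identity columns of myTable.

```sql
-- Sketch only. Assumptions: myTable has columns idRep and rep besides its
-- IDENTITY key; repNo is a hypothetical helper column used only for ordering.
INSERT INTO myTable (idRep, rep)
SELECT idRep, rep
FROM
(
    SELECT idRep, 1 AS repNo, ISNULL(rep1, '') AS rep FROM tempTable
    UNION ALL
    SELECT idRep, 2 AS repNo, ISNULL(rep2, '') AS rep FROM tempTable
    UNION ALL
    SELECT idRep, 3 AS repNo, ISNULL(rep3, '') AS rep FROM tempTable
    UNION ALL
    SELECT idRep, 4 AS repNo, ISNULL(rep4, '') AS rep FROM tempTable
    UNION ALL
    SELECT idRep, 5 AS repNo, ISNULL(rep5, '') AS rep FROM tempTable
) AS t
-- Identity values are assigned following this ORDER BY.
-- Use ORDER BY repNo, idRep instead if all rep1 rows should come first.
ORDER BY idRep, repNo;
```

Note that UNION ALL keeps rows that UNION would have dropped as duplicates, so the row count in myTable can be higher than with the original script.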
