SQL - 联合两个表,每个表都有几个唯一的列

2021-09-10 00:00:00 sql tsql sql-server-2008 sql-server

有两组数据(两张表)用于患者记录,一组是 1999-2003,另一组是 2004-2009.每个都有 >100 列;Table_A 有 ~8 个独特的列,Table_B ~ 25 个独特的列(相互比较).我的目标是:

Have two sets of data (two tables) for patient records, one 1999-2003, the other 2004-2009. Each has >100 columns; Table_A has ~8 unique columns, Table_B ~ 25 unique columns (Compared to each other). My goal is:

  1. 包含 1999-2009 年所有数据的单个表格
  2. 对于一个表中不在另一个表中的行,只需将该列的值设为 NULL.例如如果表A有Diagnostic_Category_12而Table_B没有,则该值将是表A中的原始值,而表B中为NULL

我见过一种手动执行此操作的方法:合并具有不同列数的两个表

I've seen a method for doing this manually: Unioning Two Tables With Different Number Of Columns

但是,此数据集中的列太多,无法逐一输入 - 我只想自动创建列并根据需要插入 NULL 值.

However there are far too many columns in this data set to type each one in - I'd like to just auto-create columns and insert NULL values as needed.

我使用的是 SQL Server 2008R2.

I am using SQL Server 2008R2.

推荐答案

更聪明地工作,而不是更努力.

Work smarter, not harder.

我建议您通过查询您的架构来构建一些 SQL...这样您就不会因为手工编写而遗漏任何东西.您可以像这样生成脚本(只需将 @tableName1@tableName2 值替换为适当的表名):

I'd recommend that you build up some SQL by querying your schema... this way you don't miss anything by writing things by hand. You can generate the script like so (just replace @tableName1 and @tableName2 values with the appropriate table names):

declare
 @tableName1 sysname = 'myfirsttablename'
,@tableName2 sysname = 'mysecondtablename'
,@select varchar(max) = 'select';

declare @columns table
(
     Id int identity(1,1)
    ,ColumName nvarchar(128)
    ,ExistsInTable1 bit
    ,ExistsInTable2 bit
);

-- Get a column listing with flags for their existence in each table
insert @columns
select distinct
 quotename(c.Column_Name)
,iif(c2.Table_Name is null, 0, 1)
,iif(c3.Table_Name is null, 0, 1)
from Information_Schema.Columns as c
    left join Information_Schema.Columns as c2
    on c2.Column_Name = c.Column_Name
    and c2.Table_Name = @tableName1
    left join Information_Schema.Columns as c3
    on c3.Column_Name = c.Column_Name
    and c3.Table_Name = @tableName2 
where c.Table_Name in (@tableName1, @tableName2);

-- Build the select statement for the 1sttable (using null where the column is absent)
select
 @select += char(10) + iif(c.Id = 1, ' ', ',') 
+ iif(c.ExistsInTable1 = 1, c.ColumName, 'null') + ' as ' + c.ColumName
from @columns as c
order by c.Id;

set @select += '
from ' + quotename(@tableName1) + '
union all
select';

-- Build the select statement for the 2ndtable (using null where the column is absent)
select
 @select += char(10) + iif(c.Id = 1, ' ', ',') 
+ iif(c.ExistsInTable2 = 1, c.ColumName, 'null') + ' as ' + c.ColumName
from @columns as c
order by c.Id;

set @select += '
from ' + quotename(@tableName2);

-- Print or execute your sql.
print(@select); -- or exec(@select);

生成 SQL 后,我建议您:

Once you've generated your SQL, I'd recommend that you:

  1. 验证您的结果并根据需要调整您的查询.
  2. 将最终的 SQL 放在存储过程中,而不是为每个请求即时生成它.

相关文章