T-SQL: using the replace function in a select with join
Background. I'm using SQL Server. I have two tables in the database:
Vendors(Id, Name, Description)
Products(Id, VendorId, Name, Description)
Values in the Id column of the Vendors table are formatted with the prefix 'ID_'.
Values in the VendorId column of the Products table are formatted with the prefix 'VE_'.
For example, 'VE_001245' in Products refers to 'ID_001245' in Vendors.
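To make the mapping concrete, here is a minimal sketch of the conversion both queries rely on. It uses only the built-in replace function and the sample value from above; note that replace substitutes every occurrence of 'VE_' in the string, not just a leading one, so it assumes the prefix appears only at the start.

```sql
-- Converts a Products.VendorId value into the matching Vendors.Id value.
-- Assumption: 'VE_' occurs only as the prefix of the value.
select replace('VE_001245', 'VE_', 'ID_') as VendorsId;
-- returns 'ID_001245'
```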
(Please do not propose changing this design, do not worry about the database schema, and do not suggest adding a foreign key. It is all just for illustration.)
Question: which of the following queries performs better, and why?
Using the replace function in the inner select:
select v.* from Vendors v
inner join
(
    select distinct replace(VendorId, 'VE_', 'ID_') as Id
    from Products
) list
    on v.Id = list.Id
Using the replace function in the on clause:
select v.* from Vendors v
inner join
(
    select distinct VendorId as Id
    from Products
) list
    on v.Id = replace(list.Id, 'VE_', 'ID_')
Edit. Each table has only a clustered index (on the Id column). Each table can contain millions of rows.
Recommended answer
Both queries are almost the same in terms of performance. In the first query, sorting is done twice: once when selecting the distinct records and again when performing the inner join, and a merge join then produces the final result set. In the second query, sorting is done only once, but a hash join is performed, which is more expensive than a merge join. So the two queries perform about the same in the scenario where you don't have any index on the table.
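Since the plan the optimizer picks depends on your data and indexes, it may be worth measuring rather than relying on the description above. A minimal sketch, assuming the Vendors and Products tables from the question: turn on SQL Server's I/O and timing statistics, run both variants in the same session, and compare the logical reads and elapsed times reported in the messages output.

```sql
-- Sketch only: run against your own data; the numbers will vary with table size and indexes.
set statistics io on;
set statistics time on;

-- Variant 1: replace inside the derived table
select v.* from Vendors v
inner join
(
    select distinct replace(VendorId, 'VE_', 'ID_') as Id
    from Products
) list
    on v.Id = list.Id;

-- Variant 2: replace in the on clause
select v.* from Vendors v
inner join
(
    select distinct VendorId as Id
    from Products
) list
    on v.Id = replace(list.Id, 'VE_', 'ID_');

set statistics io off;
set statistics time off;
```

Viewing the actual execution plan for each statement alongside these figures should show whether a merge join or a hash join was chosen and where the sorts occur.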