我什么时候应该使用 CROSS APPLY 而不是 INNER JOIN?
使用CROSS APPLY的主要目的是什么?>
我已经读到(模糊地,通过 Internet 上的帖子)cross apply
在选择大型数据集时会更有效,如果您进行分区.(想到分页)
我也知道 CROSS APPLY
不需要 UDF 作为右表.
在大多数 INNER JOIN
查询(一对多关系)中,我可以重写它们以使用 CROSS APPLY
,但它们总是给我等效的执行计划.
谁能给我一个很好的例子,说明 CROSS APPLY
在哪些情况下 INNER JOIN
也能起作用?
这是一个简单的例子,其中的执行计划完全相同.(告诉我它们的不同之处以及 cross apply
更快/更有效的地方)
创建表公司(companyId int 身份(1,1), 公司名称 varchar(100), 邮政编码 varchar(10), 约束 PK_Company 主键 (companyId))走创建表人(personId int 身份(1,1), personName varchar(100), companyId int, 约束 FK_Person_CompanyId 外键 (companyId) 引用 dbo.Company(companyId), 约束PK_Person主键(personId))走插入公司选择ABC 公司"、19808"工会选择XYZ 公司"、08534"工会选择123 公司"、10016"插入人选择艾伦",1 个联合选择鲍比",1 个工会选择克里斯",1 个工会选择 'Xavier', 2 union选择 'Yoshi', 2 union选择 'Zambrano', 2 union选择玩家 1"、3 联盟选择'玩家 2', 3 联盟选择'玩家 3', 3/* 使用交叉应用 */选择 *来自人 p交叉申请(选择 *来自 c 公司其中 p.companyid = c.companyId) 压缩包/* 使用 INNER JOIN 的等效查询 */选择 *来自人 pp.companyid = c.companyId 上的内部连接公司 c
解决方案 任何人都可以举一个很好的例子来说明当 CROSS APPLY 在那些 INNER JOIN 也可以工作的情况下有所作为时吗?
详细性能对比见我博客文章:
内连接
与CROSS APPLY
CROSS APPLY
在没有简单 JOIN
条件的情况下效果更好.
这个从t2
中为t1
中的每条记录选择3
最后一条记录:
SELECT t1.*, t2o.*从 t1交叉申请(选择前 3 个 *从 t2哪里 t2.t1_id = t1.id订购者t2.rank DESC) t2o
它不能用 INNER JOIN
条件轻松表述.
您可能可以使用 CTE
和窗口函数来做类似的事情:
with t2o AS(SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn从 t2)选择 t1.*, t2o.*从 t1内部联接t2oON t2o.t1_id = t1.idAND t2o.rn <= 3
,但这样可读性较差,效率也可能较低.
更新:
刚刚检查过.
master
是一个包含 20,000,000
条记录的表格,PRIMARY KEY
位于 id
.
这个查询:
WITH q AS(SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn从主),吨AS(选择 1 作为 ID联合所有选择 2)选择 *从T加入qON q.rn <= t.id
运行几乎 30
秒,而这个:
WITH t AS(选择 1 作为 ID联合所有选择 2)选择 *从T交叉申请(SELECT TOP (t.id) m.*发件人订购者ID) q
是即时的.
What is the main purpose of using CROSS APPLY?
I have read (vaguely, through posts on the Internet) that cross apply
can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind)
I also know that CROSS APPLY
doesn't require a UDF as the right-table.
In most INNER JOIN
queries (one-to-many relationships), I could rewrite them to use CROSS APPLY
, but they always give me equivalent execution plans.
Can anyone give me a good example of when CROSS APPLY
makes a difference in those cases where INNER JOIN
will work as well?
Edit:
Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where cross apply
is faster/more efficient)
create table Company (
companyId int identity(1,1)
, companyName varchar(100)
, zipcode varchar(10)
, constraint PK_Company primary key (companyId)
)
GO
create table Person (
personId int identity(1,1)
, personName varchar(100)
, companyId int
, constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
, constraint PK_Person primary key (personId)
)
GO
insert Company
select 'ABC Company', '19808' union
select 'XYZ Company', '08534' union
select '123 Company', '10016'
insert Person
select 'Alan', 1 union
select 'Bobby', 1 union
select 'Chris', 1 union
select 'Xavier', 2 union
select 'Yoshi', 2 union
select 'Zambrano', 2 union
select 'Player 1', 3 union
select 'Player 2', 3 union
select 'Player 3', 3
/* using CROSS APPLY */
select *
from Person p
cross apply (
select *
from Company c
where p.companyid = c.companyId
) Czip
/* the equivalent query using INNER JOIN */
select *
from Person p
inner join Company c on p.companyid = c.companyId
解决方案
Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?
See the article in my blog for detailed performance comparison:
INNER JOIN
vs.CROSS APPLY
CROSS APPLY
works better on things that have no simple JOIN
condition.
This one selects 3
last records from t2
for each record from t1
:
SELECT t1.*, t2o.*
FROM t1
CROSS APPLY
(
SELECT TOP 3 *
FROM t2
WHERE t2.t1_id = t1.id
ORDER BY
t2.rank DESC
) t2o
It cannot be easily formulated with an INNER JOIN
condition.
You could probably do something like that using CTE
's and window function:
WITH t2o AS
(
SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn
FROM t2
)
SELECT t1.*, t2o.*
FROM t1
INNER JOIN
t2o
ON t2o.t1_id = t1.id
AND t2o.rn <= 3
, but this is less readable and probably less efficient.
Update:
Just checked.
master
is a table of about 20,000,000
records with a PRIMARY KEY
on id
.
This query:
WITH q AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM master
),
t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
JOIN q
ON q.rn <= t.id
runs for almost 30
seconds, while this one:
WITH t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
CROSS APPLY
(
SELECT TOP (t.id) m.*
FROM master m
ORDER BY
id
) q
is instant.
相关文章