Why does STRAIGHT_JOIN improve this query so dramatically, and what does it mean when it is written after the SELECT keyword?
I have the following MySQL query:
select t1.*
from Table1 t1
inner join Table2 t2
on t1.CommonID = t2.CommonID
where t1.FilterID = 1
It takes about 30 seconds to run, which is strange, because if I comment out either the join or the where clause, it takes less than a second. That is, either
select t1.*
from Table1 t1
where t1.FilterID = 1
or
select t1.*
from Table1 t1
inner join Table2 t2
on t1.CommonID = t2.CommonID
Each of these takes less than a second.
Then there is the STRAIGHT_JOIN keyword, for which I can find only one reference, here: http://dev.mysql.com/doc/refman/5.0/en/join.html
STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.
What? I can write:
select t1.*
from Table1 t1
STRAIGHT_JOIN Table2 t2
on t1.CommonID = t2.CommonID
where t1.FilterID = 1
and the query executes in less than a second.
Even stranger, I can write:
select STRAIGHT_JOIN t1.*
from Table1 t1
inner join Table2 t2
on t1.CommonID = t2.CommonID
where t1.FilterID = 1
and it takes less than a second, even though this syntax does not even appear to be legal.
I would guess the second example means that a STRAIGHT_JOIN will be used whenever an INNER JOIN is written, but I can’t find any documentation about it.
What is going on here, and how can the "join optimizer" result in such relatively poor performance? Should I always use STRAIGHT_JOIN? How can I tell when to use it or not?
Table1 and Table2 both have integer primary keys; FilterID is a foreign key to another table; the CommonID columns are both foreign keys to a third table. They both have indexes on them. The database engine is InnoDB.
Thanks
Solution
What is going on here, and how can the "join optimizer" result in such relatively poor performance?
STRAIGHT_JOIN forces the join order of the tables, so Table1 is scanned in the outer loop and Table2 in the inner loop.
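One way to see the effect of the join order is to run both forms of the query under EXPLAIN, which lists the tables in the order MySQL reads them. The following is only a sketch using the tables from the question; the actual output depends on your data and statistics. The SELECT STRAIGHT_JOIN form from the question is a documented SELECT modifier that forces the tables to be joined in the order they are listed in the FROM clause, so it should give the same plan as the second statement here.

```sql
-- Plan chosen by the optimizer for the original query.
-- The first row of the output is the table read in the outer loop.
EXPLAIN
SELECT t1.*
FROM Table1 t1
INNER JOIN Table2 t2
    ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;

-- Plan with the join order forced: t1 is read first, then t2.
EXPLAIN
SELECT t1.*
FROM Table1 t1
STRAIGHT_JOIN Table2 t2
    ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;
```

If the first plan lists t2 before t1, the optimizer has reversed the join order, which would be consistent with the slowdown described in the question.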
The optimizer is not perfect (though still quite decent), and the most probable cause is outdated statistics.
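If stale statistics are the suspected cause, a reasonable first step is to refresh them and check the plan again before reaching for STRAIGHT_JOIN. A minimal sketch, assuming the table names from the question:

```sql
-- Recompute the index statistics the optimizer uses for its row estimates.
ANALYZE TABLE Table1, Table2;

-- Re-check the plan for the original query after the refresh.
EXPLAIN
SELECT t1.*
FROM Table1 t1
INNER JOIN Table2 t2
    ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;
```

If the plan now reads Table1 first and the query is fast, the hint is probably unnecessary.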
Should I always use STRAIGHT_JOIN?
No, only when the optimizer is wrong. This can happen if your data distribution is severely skewed or the statistics cannot be calculated properly (say, for spatial or fulltext indexes).
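To get a rough idea of whether the data is skewed, you can compare the real row counts per value with the index cardinality estimates that MySQL reports. This is only an illustrative check on the columns named in the question, not a complete diagnosis:

```sql
-- Actual number of rows per FilterID value.
-- A handful of values holding most of the rows indicates a skewed distribution.
SELECT FilterID, COUNT(*) AS row_count
FROM Table1
GROUP BY FilterID
ORDER BY row_count DESC
LIMIT 10;

-- Estimated number of distinct values per index (the Cardinality column).
SHOW INDEX FROM Table1;
SHOW INDEX FROM Table2;
```

A large gap between the actual counts and what the cardinality figures suggest is a hint that the optimizer's row estimates for this query may be off.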
How can I tell when to use it or not?
You should collect the statistics, build the plans for both variants, and understand what these plans mean.
If you see that:
The automatically generated plan is not optimal and cannot be improved by the standard ways,
The STRAIGHT_JOIN version is better, you understand that it always will be, and you understand why it always will be,
then use STRAIGHT_JOIN.