查询几乎需要两秒钟,但只匹配两行 - 为什么索引没有帮助?
表:
CREATE TABLE `Alarms` (
`AlarmId` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`DeviceId` BINARY(16) NOT NULL,
`Code` BIGINT(20) UNSIGNED NOT NULL,
`Ended` TINYINT(1) NOT NULL DEFAULT '0',
`NaturalEnd` TINYINT(1) NOT NULL DEFAULT '0',
`Pinned` TINYINT(1) NOT NULL DEFAULT '0',
`Acknowledged` TINYINT(1) NOT NULL DEFAULT '0',
`StartedAt` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
`EndedAt` TIMESTAMP NULL DEFAULT NULL,
`MarkedForDeletion` TINYINT(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`AlarmId`),
KEY `Key1` (`Ended`,`Acknowledged`),
KEY `Key2` (`Pinned`),
KEY `Key3` (`DeviceId`,`Pinned`),
KEY `Key4` (`DeviceId`,`StartedAt`,`EndedAt`),
KEY `Key5` (`DeviceId`,`Ended`,`EndedAt`),
KEY `Key6` (`MarkedForDeletion`),
KEY `KeyB` (`MarkedForDeletion`,`DeviceId`,`StartedAt`,`EndedAt`,`Acknowledged`,`Pinned`)
) ENGINE=INNODB;
它目前有大约 300 万行.
It currently has about three million rows in it.
查询:
SELECT
COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE `StartedAt` < FROM_UNIXTIME(1519101900)
AND (`EndedAt` IS NULL OR `EndedAt` > FROM_UNIXTIME(1519101900))
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
AND `MarkedForDeletion` = FALSE
AND (
(`Alarms`.`EndedAt` IS NULL AND `Alarms`.`Acknowledged` = FALSE)
OR ( `Alarms`.`EndedAt` IS NOT NULL AND `Alarms`.`Pinned` = TRUE)
)
查询计划:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Alarms range Key2,Key3,Key4,Key5,Key6,KeyB KeyB 21 1574778 Using where; Using index
经过时间:1,763,222μs
Elapsed time: 1,763,222μs
在这种特殊情况下,查询(正确)甚至不匹配很多行(结果是 n = 2
).
In this particular case the query (correctly) doesn't even match many rows (the result is n = 2
).
利用我从使用索引合并中学到的东西(尽管我仍然没有做到这一点),我尝试重新组织一下条件(原始条件是由一些 C++ 生成的,基于输入条件,因此奇怪的运算符分布):
Taking what I learnt from working with index merges (though I still haven't got that right), I tried reorganising the conditions a bit (the original was generated by some C++, based on input conditions, hence the strange operator distribution):
SELECT COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE
(
`EndedAt` IS NULL
AND `Acknowledged` = FALSE
AND `StartedAt` < FROM_UNIXTIME(1519101900)
AND `MarkedForDeletion` = FALSE
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
) OR (
`EndedAt` > FROM_UNIXTIME(1519101900)
AND `Pinned` = TRUE
AND `StartedAt` < FROM_UNIXTIME(1519101900)
AND `MarkedForDeletion` = FALSE
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
);
…但是结果是一样的.
…but the result is the same.
那么为什么需要这么长时间?如何修改它/索引以使其立即工作?
So why does it take so long? How can I modify it / the indexes to make it work instantly?
推荐答案
OR
是出了名的难以优化.- MySQL 几乎从不在一个查询中使用两个索引.
OR
is notoriously hard to optimize.- MySQL almost never uses two indexes in a single query.
要避免这两种情况,请将 OR
转换为 UNION
.每个 SELECT
可以使用不同的索引.因此,为每个构建一个最佳 INDEX
.
To avoid both of those, turn OR
into UNION
. Each SELECT
can use its a different index. So, build an optimal INDEX
for each.
其实,既然你只是在做COUNT
,你不妨分别计算两个单独的计数并相加.
Actually, since you are only doing COUNT
, you may as well evaluate two separate counts and add them.
SELECT ( SELECT COUNT(*)
FROM `Alarms`
WHERE `EndedAt` IS NULL
AND `Acknowledged` = FALSE
AND `StartedAt` < FROM_UNIXTIME(1519101900)
AND `MarkedForDeletion` = FALSE
AND `DeviceId` = UNHEX('00030000000000000000000000000000' )
) +
( SELECT COUNT(*)
FROM `Alarms`
WHERE `EndedAt` > FROM_UNIXTIME(1519101900)
AND `Pinned` = TRUE
AND `StartedAt` < FROM_UNIXTIME(1519101900)
AND `MarkedForDeletion` = FALSE
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
) AS `n`;
INDEX(DeviceId, Acknowledged, MarkedForDeletion, EndedAt, StartedAt) -- for first
INDEX(DeviceId, Pinned, MarkedForDeletion, EndedAt, StartedAt) -- for second
INDEX(DeviceId, Pinned, MarkedForDeletion, StartedAt, EndedAt) -- for second
好吧,如果有重叠,那就行不通了.那么,让我们回到 UNION
模式:
Well, that won't work if there is overlap. So, let's go back to the UNION
pattern:
SELECT COUNT(*) AS `n`
FROM
(
( SELECT AlarmId
FROM `Alarms`
WHERE `EndedAt` IS NULL
AND `Acknowledged` = FALSE
AND `StartedAt` < FROM_UNIXTIME(1519101900)
AND `MarkedForDeletion` = FALSE
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
)
UNION DISTINCT
( SELECT AlarmId
FROM `Alarms`
WHERE `EndedAt` > FROM_UNIXTIME(1519101900)
AND `Pinned` = TRUE
AND `StartedAt` < FROM_UNIXTIME(1519101900)
AND `MarkedForDeletion` = FALSE
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
)
);
再次添加这些索引.
每个 INDEX
中的前几列可以按任何顺序排列,因为它们是用 =
(或 IS NULL
)测试的.最后一两个是范围"测试.只有第一个范围将用于过滤,但我包含了另一列,以便索引将覆盖".
The first few columns in each INDEX
can be in any order, since they are tested with =
(or IS NULL
). The last one or two are "range" tests. Only the first range will be used for filtering, but I included the other column so that the index would be "covering".
我的公式可能比索引合并"更好.
My formulations may be better than "index merge".
相关文章