Query takes almost two seconds but matches only two rows - why isn't the index helping?

2022-01-15 00:00:00 indexing mariadb mysql

The table:

CREATE TABLE `Alarms` (
  `AlarmId` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `DeviceId` BINARY(16) NOT NULL,
  `Code` BIGINT(20) UNSIGNED NOT NULL,
  `Ended` TINYINT(1) NOT NULL DEFAULT '0',
  `NaturalEnd` TINYINT(1) NOT NULL DEFAULT '0',
  `Pinned` TINYINT(1) NOT NULL DEFAULT '0',
  `Acknowledged` TINYINT(1) NOT NULL DEFAULT '0',
  `StartedAt` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
  `EndedAt` TIMESTAMP NULL DEFAULT NULL,
  `MarkedForDeletion` TINYINT(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`AlarmId`),
  KEY `Key1` (`Ended`,`Acknowledged`),
  KEY `Key2` (`Pinned`),
  KEY `Key3` (`DeviceId`,`Pinned`),
  KEY `Key4` (`DeviceId`,`StartedAt`,`EndedAt`),
  KEY `Key5` (`DeviceId`,`Ended`,`EndedAt`),
  KEY `Key6` (`MarkedForDeletion`),

  KEY `KeyB` (`MarkedForDeletion`,`DeviceId`,`StartedAt`,`EndedAt`,`Acknowledged`,`Pinned`)
) ENGINE=INNODB;

It currently has about three million rows.

The query:

SELECT COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE `StartedAt` < FROM_UNIXTIME(1519101900)
  AND (`EndedAt` IS NULL OR `EndedAt` > FROM_UNIXTIME(1519101900))
  AND `DeviceId` = UNHEX('00030000000000000000000000000000')
  AND `MarkedForDeletion` = FALSE
  AND (
    (`Alarms`.`EndedAt` IS NULL AND `Alarms`.`Acknowledged` = FALSE)
    OR
    (`Alarms`.`EndedAt` IS NOT NULL AND `Alarms`.`Pinned` = TRUE)
  )

The query plan:

id:            1
select_type:   SIMPLE
table:         Alarms
type:          range
possible_keys: Key2,Key3,Key4,Key5,Key6,KeyB
key:           KeyB
key_len:       21
ref:           (not shown)
rows:          1574778
Extra:         Using where; Using index

Elapsed time: 1,763,222μs

In this particular case the query (correctly) doesn't even match many rows (the result is n = 2).

Taking what I learnt from working with index merges (though I still haven't got that right), I tried reorganising the conditions a bit (the original was generated by some C++, based on input conditions, hence the strange operator distribution):

SELECT COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE (
    `EndedAt` IS NULL
    AND `Acknowledged` = FALSE
    AND `StartedAt` < FROM_UNIXTIME(1519101900)
    AND `MarkedForDeletion` = FALSE
    AND `DeviceId` = UNHEX('00030000000000000000000000000000')
) OR (
    `EndedAt` > FROM_UNIXTIME(1519101900)
    AND `Pinned` = TRUE
    AND `StartedAt` < FROM_UNIXTIME(1519101900)
    AND `MarkedForDeletion` = FALSE
    AND `DeviceId` = UNHEX('00030000000000000000000000000000')
);

…but the result is the same.

So why does it take so long? How can I modify it / the indexes to make it work instantly?

Recommended answer

OR is notoriously hard to optimize.

MySQL almost never uses two indexes in a single query.

To avoid both of those, turn OR into UNION. Each SELECT can use a different index. So, build an optimal INDEX for each.

Actually, since you are only doing COUNT, you may as well evaluate two separate counts and add them.

SELECT
  ( SELECT COUNT(*)
    FROM `Alarms`
    WHERE `EndedAt` IS NULL
      AND `Acknowledged` = FALSE
      AND `StartedAt` < FROM_UNIXTIME(1519101900)
      AND `MarkedForDeletion` = FALSE
      AND `DeviceId` = UNHEX('00030000000000000000000000000000')
  ) +
  ( SELECT COUNT(*)
    FROM `Alarms`
    WHERE `EndedAt` > FROM_UNIXTIME(1519101900)
      AND `Pinned` = TRUE
      AND `StartedAt` < FROM_UNIXTIME(1519101900)
      AND `MarkedForDeletion` = FALSE
      AND `DeviceId` = UNHEX('00030000000000000000000000000000')
  ) AS `n`;

INDEX(DeviceId, Acknowledged, MarkedForDeletion, EndedAt, StartedAt)  -- for first
INDEX(DeviceId, Pinned, MarkedForDeletion, EndedAt, StartedAt)  -- for second
INDEX(DeviceId, Pinned, MarkedForDeletion, StartedAt, EndedAt)  -- for second

Well, adding two counts won't work if the two sets of rows overlap, since any shared row would be counted twice. So, let's go back to the UNION pattern:

SELECT COUNT(*) AS `n`
FROM (
    ( SELECT AlarmId
      FROM `Alarms`
      WHERE `EndedAt` IS NULL
        AND `Acknowledged` = FALSE
        AND `StartedAt` < FROM_UNIXTIME(1519101900)
        AND `MarkedForDeletion` = FALSE
        AND `DeviceId` = UNHEX('00030000000000000000000000000000')
    )
    UNION DISTINCT
    ( SELECT AlarmId
      FROM `Alarms`
      WHERE `EndedAt` > FROM_UNIXTIME(1519101900)
        AND `Pinned` = TRUE
        AND `StartedAt` < FROM_UNIXTIME(1519101900)
        AND `MarkedForDeletion` = FALSE
        AND `DeviceId` = UNHEX('00030000000000000000000000000000')
    )
) AS `u`;  -- MySQL requires an alias on a derived table

Again, add those indexes.
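The INDEX(...) shorthand given earlier is not a statement you can run as-is. As a rough sketch, it could be turned into DDL along these lines. The index names (`idx_open_unacked`, `idx_pinned_by_end`, `idx_pinned_by_start`) are made up for illustration and are not part of the original answer; the column lists are copied from the shorthand.

```sql
-- Sketch only: index names are hypothetical, column lists come from the answer.
-- Adding indexes to a three-million-row table takes time; try it on a copy first.
ALTER TABLE `Alarms`
  -- for the first SELECT (EndedAt IS NULL, Acknowledged = FALSE)
  ADD INDEX `idx_open_unacked`
    (`DeviceId`, `Acknowledged`, `MarkedForDeletion`, `EndedAt`, `StartedAt`),
  -- for the second SELECT, range on EndedAt first
  ADD INDEX `idx_pinned_by_end`
    (`DeviceId`, `Pinned`, `MarkedForDeletion`, `EndedAt`, `StartedAt`),
  -- for the second SELECT, range on StartedAt first
  ADD INDEX `idx_pinned_by_start`
    (`DeviceId`, `Pinned`, `MarkedForDeletion`, `StartedAt`, `EndedAt`);
```

The two "for second" indexes differ only in which range column comes first; the optimizer can pick whichever it estimates to be more selective, and the one it never uses can be dropped later.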

The first few columns in each INDEX can be in any order, since they are tested with = (or IS NULL). The last one or two are "range" tests. Only the first range will be used for filtering, but I included the other column so that the index would be "covering".
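One way to check whether a suitable index is picked up and is covering is to run EXPLAIN on each branch separately. The statement below is a sketch for the first branch only, assuming an index starting with the equality columns already exists; no expected output is shown because it depends on the data and the server version.

```sql
-- Sketch: inspect the plan for the first branch on its own.
-- Things to look at in the output:
--   key   : which index the optimizer chose
--   rows  : estimated rows examined (the original plan estimated 1574778)
--   Extra : "Using index" indicates the index covers the query
EXPLAIN
SELECT COUNT(*)
FROM `Alarms`
WHERE `DeviceId` = UNHEX('00030000000000000000000000000000')
  AND `Acknowledged` = FALSE
  AND `MarkedForDeletion` = FALSE
  AND `EndedAt` IS NULL
  AND `StartedAt` < FROM_UNIXTIME(1519101900);
```

In this branch `EndedAt IS NULL` behaves like an equality test, so `StartedAt` is the only range column and can be used for filtering when it directly follows the equality columns in the index.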

My formulations may be better than "index merge".
