聊一聊MySQL索引失效的问题

2022-01-20 00:00:00 索引查询组合导致失效

一、索引原理

索引是提高数据库查询性能的一个重要方法。

使用索引用可快速找出某个列中包含特定值的行。不使用索引，必须从条记录开始读，可能要读完整个表，才能找出相关的行。

使用索引就像查字典一样，我们可以根据拼音、笔画、偏旁部首等排序的目录(索引)，快速查找到需要的字。

之前介绍MySQL存储引擎的文章（聊一聊MySQL的存储引擎），测试对比了两种存储引擎(MyISAM或者InnoDB)，使用主键索引查询，效率快了几十倍。

虽然索引大大提高了查询(select)速度，但同时也会降低更新(insert,update,delete)表的速度，因为更新表时，数据库不仅要更新和保存数据，还要更新和保存索引文件。当然索引文件也占用存储空间。

二、索引失效的场景

索引虽然加快了查询效率，但使用方法不当，就会出现索引失效。

下面实际操作，列举一些索引失效的场景。

CREATETABLE customer(id INTAUTO_INCREMENT,companyVARCHAR(30),nameVARCHAR(30),sex enum('male','female'),age INT,phoneVARCHAR(30),addressVARCHAR(60),PRIMARYKEY (id));
CREATEINDEX index_company ON customer(company);CREATEINDEX index_age ON customer(age);CREATEINDEX index_phone ON customer(phone);CREATEINDEX index_phone_name ON customer(phone,name);

上面创建了一个客户表，以及三个单列索引和一个组合索引，我们来查看一下索引：

mysql>show index from customer; # (因为排版问题，去掉了一些信息)+------------+------------------+--------------+-------------+------------+|Non_unique | Key_name         |Seq_in_index | Column_name | Index_type |+------------+------------------+--------------+-------------+------------+|          0 | PRIMARY          |           1 | id          | BTREE      ||          1 | index_company    |           1 | company     | BTREE      ||          1 | index_age        |            1 | age         | BTREE      ||          1 | index_phone      |            1 | phone       |BTREE      ||          1 | index_phone_name |            1 | phone       | BTREE      ||          1 | index_phone_name |            2 | name        | BTREE      |+------------+------------------+--------------+-------------+------------+

6 rows inset (0.01 sec)

上面显示包括一个默认的主键索引，还有三个单列索引，两个组合索引项。其中主键索引为(unique)索引，索引类别(Index_type)显示为BTREE，实际为B+树。

2.1 模糊匹配LIKE以%开头，会导致索引失效。

explain select *from customer where company like '%abc'\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys: NULL          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where

应该把模糊匹配放到右边。

explain select * from customer where company like 'abc%'\G*************************** 1. row***************************           id: 1  select_type:SIMPLE        table:customer   partitions:NULL         type:rangepossible_keys: index_company          key:index_company      key_len: 123          ref:NULL         rows: 1     filtered:100.00        Extra:Using index condition

2.2 索引列进行计算，会导致索引失效。

explain select *from customer where age-1=20\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys: NULL          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where

应该把运算放在右边。

explain select *from customer where age=20+1\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: refpossible_keys:index_age          key: index_age      key_len: 5          ref: const         rows: 1     filtered: 100.00        Extra: NULL

2.3 索引列使用函数，会导致索引失效。

explain select *from customer where lcase(company)='abc'\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys: NULL          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where

2.4 索引列类型转换，会导致索引失效。

explain select *from customer where phone=13012345678\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys:index_phone,index_phone_name          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where

应该保持原有类型，避免类型转换。

explain select *from customer where phone='13012345678'\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: refpossible_keys:index_phone,index_phone_name          key: index_phone      key_len: 123          ref: const         rows: 1     filtered: 100.00        Extra: NULL

2.5 索引列比较字符集不一致时，会导致索引失效。

按照上面客户表的格式，创建两个供应商的表supplier、supplier_utf8，字符集分别为utf8mb4、utf8。关于字符集可以看看之前文章(聊一聊MySQL的字符集)。

CREATETABLE supplier (id INTAUTO_INCREMENT,companyVARCHAR(30),nameVARCHAR(30),sexenum('male','female'),age INT,phoneVARCHAR(30),addressVARCHAR(60),PRIMARYKEY (id));

CREATEINDEX index_company ON supplier(company);CREATEINDEX index_age ON supplier(age);CREATEINDEX index_phone ON supplier(phone);CREATEINDEX index_phone_name ON supplier(phone,name);

CREATETABLE supplier_utf8 (id INTAUTO_INCREMENT,companyVARCHAR(30),nameVARCHAR(30),sexenum('male','female'),age INT,phoneVARCHAR(30),addressVARCHAR(60),PRIMARYKEY (id)) CHARSET=utf8;

CREATEINDEX index_company ON supplier_utf8(company);CREATEINDEX index_age ON supplier_utf8(age);CREATEINDEX index_phone ON supplier_utf8(phone);CREATEINDEX index_phone_name ON supplier_utf8(phone,name);

explain select *from customer,supplier where customer.company = supplier.company\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys:index_company          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where***************************2. row ***************************           id: 1  select_type: SIMPLE        table: supplier   partitions: NULL         type: refpossible_keys:index_company          key: index_company      key_len: 123          ref: mydb.customer.company         rows: 1     filtered: 100.00        Extra: NULL

可以看到字符集一样的，一个表使用了索引，另一个表走了全表扫描。

explain select *from customer,supplier_utf8 where customer.company = supplier_utf8.company\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys:index_company          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: NULL***************************2. row ***************************           id: 1  select_type: SIMPLE        table: supplier_utf8   partitions: NULL         type: ALLpossible_keys:index_company          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where; Using join buffer(hash join)

字符集不一样的，两个表都走了全表扫描。

这种情况隐蔽性比较强，经常业务上线了才被发现。

2.6 使用or查询，不论另一项是否为索引列，都会导致索引失效。

explainselect * from customer where company='abc' or address='abc'\Gexplainselect * from customer where company='abc' or age=20\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys:index_company          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where

不过如果非要使用or进行查询，可以利用MRR功能，对回表查询进行排序优化。

2.7 组合索引，没有使用列索引，会导致索引失效。

explain select *from customer where name='abc'\G***************************1. row ***************************           id: 1  select_type: SIMPLE        table: customer   partitions: NULL         type: ALLpossible_keys: NULL          key: NULL      key_len: NULL          ref: NULL         rows: 1     filtered: 100.00        Extra: Using where

组合索引并不要求按照排列顺序，下面可以用到索引。

explain select * from customer where name ='abc' and phone='13012345678'\G*************************** 1. row***************************           id: 1  select_type:SIMPLE        table:customer   partitions:NULL         type: refpossible_keys: index_phone,index_phone_name          key:index_phone      key_len: 123          ref:const         rows: 1     filtered:100.00        Extra:Using where

如果数据库预计使用全表扫描比使用索引快，则不使用索引。MySQL新版本中，对索引判断有了很大改善，之前版本使用in,!= ，<>，is null，is not null会导致索引失效，新版测试，还是使用了索引查询。

MySQL5.6引入了MRR(Multi-Range Read Optimization)，专门来优化：二级索引的范围扫描并且需要回表的情况。

它的原理是，将多个需要回表的二级索引根据主键进行排序，然后一起回表，将原来的回表时进行的随机I/O，转变成顺序I/O，降低查询过程中的I/O开销，同时减少缓冲池中数据页被替换的频次。

MySQL5.7引入了Generated Column，如果确实需要对索引列进行计算等操作，可以采用虚拟列的方式来处理，比如创建客户表时增加对应的列，公司名称小写lcase_company，和实岁full_age, 并建立索引。

CREATE TABLE customer2(id INT AUTO_INCREMENT,company VARCHAR(30),lcase_company VARCHAR(30)as(lcase(company)),name VARCHAR(30),sex enum('male','female'),age INT,full_age INT as(age-1),phone VARCHAR(30),address VARCHAR(60),PRIMARY KEY (id));

CREATE INDEX index_lcase_company ON customer2(lcase_company);CREATE INDEX index_full_age ON customer2(full_age);

之后直接对Cenerated Column进行查询，就相当于计算列用到了索引。

三、索引的使用规范

在使用索引的时候，不仅要注意避免索引失效，也要遵循一定的规范，以便高效的使用索引。

下面总结了一些规范建议，可以用来参考，并非真理。

3.1 单表的索引数建议不超过5个，组合索引的字段原则上不超过3个。

3.2 尽量不要在较长字符串的字段上建立索引，可以设置索引字段前缀长度。

3.3 选择在查询过滤中使用率较高，如where，orderby，group by的列建立索引。

3.4 不要在区分度不高的列上建立索引，比如性别等，利用不了索引性能。

3.5 不要在经常更新的列上建立索引，数据更新也会更新索引，影响数据库性能。

3.6 建立组合索引时，区分度高，或者查询频率高的，放在左侧。

3.7 合理利用覆盖索引来满足查询要求，避免回表查询，减少I/O开销。

3.8 删除不再使用、少使用、或者重复的索引，减少数据更新的开销。

3.9 利用explain来判断查询语句，是使用了索引，还是走了全表扫描。

END

相关文章