MySQL optimization for REGEXP

This query (with different name instead of "jack") happens many times in my slow query log. Why?

The Users table has many fields (more than the three I've selected) and about 40,000 rows.

select name, username, id from Users where ( name REGEXP '[[:<:]]jack[[:>:]]' ) or ( username REGEXP '[[:<:]]jack[[:>:]]' ) order by name limit 0,5;

id is primary and autoincrement.
name has an index.
username has a unique index.

Sometimes it takes 3 seconds! If I run EXPLAIN on the select in MySQL, I get this:

select_type: SIMPLE
table: Users
type: index
possible_keys: NULL
key: name
key_len: 452
ref: NULL
rows: 5
Extra: Using where

Is this the best I can do? What can I fix?

Recommended answer

If you must use regexp-style WHERE clauses, you definitely will be plagued by slow-query problems. For regexp-style search to work, MySQL has to compare every value in your name column with the regexp. And, your query has doubled the trouble by also looking at your username column.

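This means MySQL can't use any index to narrow down the rows, which is how all DBMSs speed up queries of large tables. The "type: index" in your EXPLAIN output is a full scan of the name index, which only helps with the ORDER BY; every row it visits still has to be tested against the regexp.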

There are a few things you can try. All of them involve saying goodbye to REGEXP.

One is:

WHERE name LIKE CONCAT('jack', '%') OR username LIKE CONCAT('jack', '%')

With indexes on your name and username columns this should be decently fast. It will look for all names/usernames beginning with 'jack'. Notice that

WHERE name LIKE CONCAT('%','jack') /* SLOW!!! */

will look for names ending with 'jack' but will be slow like your regexp-style search.
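
As a rough sketch of the prefix search above applied to the full query: this assumes the Users table and the existing indexes described in the question. Note that it is not equivalent to the word-boundary regexp; it matches 'jackson' and misses 'Smith Jack'. The UNION form is an alternative I'm adding in case the optimizer does not use both indexes for the OR.

```sql
-- Assumes the Users table from the question, with an index on name
-- and a unique index on username already in place.

-- Prefix search: each LIKE 'jack%' condition can use an index range scan.
SELECT name, username, id
FROM Users
WHERE name LIKE 'jack%'
   OR username LIKE 'jack%'
ORDER BY name
LIMIT 0, 5;

-- Alternative: a UNION lets each branch use its own index.
-- UNION (without ALL) also removes rows that match both branches.
(SELECT name, username, id FROM Users WHERE name LIKE 'jack%')
UNION
(SELECT name, username, id FROM Users WHERE username LIKE 'jack%')
ORDER BY name
LIMIT 5;

-- Check the plan: "type: range" means the index is narrowing the rows,
-- unlike the "type: index" full scan shown in the question.
EXPLAIN SELECT name, username, id
FROM Users
WHERE name LIKE 'jack%';
```

Running EXPLAIN on both forms against your own data is the way to see which one the optimizer handles better.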

Another thing you can do is figure out why your application needs to be able to search for part of a name or username. You can either eliminate this feature from your application, or figure out some better way to handle it.

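Possible better ways:

  1. Have your users split their names into given-name and surname fields, and search those separately.
  2. Create a separate "search all users" feature that is used only when a user needs it, reducing how often the slow regexp-style query runs.
  3. Use some sort of preprocessing program of your own to break names up into a separate name-words table. Search the name-words table without regexp.
  4. Figure out how to use MySQL full-text search for this feature.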
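
The name-words table idea (option 3) could look roughly like this. The table name UserNameWords, its columns, the INT type for the user id, and the sample user 42 are all hypothetical; the splitting of names into words is assumed to happen in your application code whenever a user is created or renamed.

```sql
-- Hypothetical lookup table: one row per (word, user) pair.
-- user_id is assumed to be INT to match Users.id; adjust to the real type.
CREATE TABLE UserNameWords (
  word    VARCHAR(64) NOT NULL,
  user_id INT NOT NULL,
  PRIMARY KEY (word, user_id)
);

-- The application inserts the words of each name and username,
-- e.g. for a made-up user 42 named "Jack Smith":
INSERT INTO UserNameWords (word, user_id)
VALUES ('jack', 42), ('smith', 42);

-- Whole-word search becomes an indexed equality lookup plus a join
-- on the primary key of Users.
SELECT u.name, u.username, u.id
FROM UserNameWords AS w
JOIN Users AS u ON u.id = w.user_id
WHERE w.word = 'jack'
ORDER BY u.name
LIMIT 5;
```

Putting word first in the primary key is what makes the lookup by word an index search rather than a scan.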

All of these involve some programming work.
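
For the full-text option, a minimal sketch might be the following. The index name ft_users_name_username is made up, and this assumes a storage engine and MySQL version that support FULLTEXT indexes (MyISAM, or InnoDB from MySQL 5.6 on). Server settings such as the minimum indexed word length and the stopword list affect which terms are searchable, so test it with your real data.

```sql
-- One FULLTEXT index covering both searched columns.
-- The index name is a placeholder.
ALTER TABLE Users
  ADD FULLTEXT INDEX ft_users_name_username (name, username);

-- MATCH must list the same columns as the FULLTEXT index.
-- This finds rows where 'jack' appears as a whole word in either column.
SELECT name, username, id
FROM Users
WHERE MATCH(name, username) AGAINST('jack' IN NATURAL LANGUAGE MODE)
ORDER BY name
LIMIT 5;
```

This is the closest of the options to the original word-boundary regexp, since full-text search matches whole words rather than prefixes.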
