Best way to handle misspellings in MySQL full-text search

2022-01-15 00:00:00 lucene full-text-search php mysql sphinx

I have about 2000 rows in a MySQL database.

Each row is at most 300 characters long and contains a sentence or two.

I use MySQL's built-in full-text search to search these rows.

If possible, I would like to add a feature that corrects typos and accidental misspellings.

For example, if someone types "right shlder" into the search box, this would equate to "right shoulder" when performing the search.

What is the simplest way to add this kind of functionality? Is it worth adding an external search engine of some kind, like Lucene? (That seems like overkill for such a small dataset.) Or is there a simpler way?

Recommended answer

I think you should use SOUNDS LIKE or SOUNDEX().

As your data set is so small, one solution may be to create a new table that stores the individual words (or their soundex values) contained in each text field, and to use SOUNDS LIKE on that table.

For example:

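-- `table`, `column`, tableofwords and refid are placeholder names.
-- TABLE and COLUMN are reserved words in MySQL, so they are backtick-quoted.
SELECT * FROM `table` WHERE id IN
(
    SELECT refid FROM tableofwords
    WHERE `column` SOUNDS LIKE 'right' OR `column` SOUNDS LIKE 'shlder'
);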

See: http://dev.mysql.com/doc/refman/5.0/zh/string-functions.html
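
To see why this matches "shlder" to "shoulder", here is a minimal Python sketch of the same word-table idea. It assumes a simplified American Soundex (first letter plus three digits), which may differ from MySQL's SOUNDEX() in edge cases, since MySQL returns an arbitrarily long string rather than four characters. The helper names soundex, build_word_index and search, and the sample rows, are made up for illustration.

```python
import re
from collections import defaultdict

# Standard American Soundex digit groups (assumption: not MySQL's exact variant).
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """Simplified American Soundex: first letter plus three digits."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = code
    return (first + "".join(digits) + "000")[:4]


def build_word_index(rows):
    """Plays the role of the word table: Soundex code -> ids of rows containing it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in re.findall(r"[A-Za-z]+", text):
            index[soundex(word)].add(row_id)
    return index


def search(index, query):
    """Return ids of rows where any word sounds like any query word (OR semantics)."""
    hits = set()
    for word in re.findall(r"[A-Za-z]+", query):
        hits |= index.get(soundex(word), set())
    return hits


# Hypothetical sample rows, standing in for the 2000 database rows.
rows = {
    1: "Pain in the right shoulder after lifting.",
    2: "Left knee swelling.",
    3: "Right ankle sprain.",
}
index = build_word_index(rows)
print(soundex("shlder"), soundex("shoulder"))  # S436 S436
print(sorted(search(index, "right shlder")))   # [1, 3]
```

Row 3 matches only on "right" because the lookup, like the SQL above, combines the words with OR. Soundex always keeps the first letter, so a typo in the first letter of a word (for example "zhoulder") would not be matched by this approach.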

I believe it is not possible to wildcard-search the string :(
