需要帮助优化 mysql 的 lat/Lon 地理搜索

我有一个 mysql (5.0.22) myisam 表,其中大约有 300k 条记录,我想在 5 英里半径范围内进行纬度/经度距离搜索.

我有一个涵盖纬度/经度字段的索引,当我只选择纬度/经度时,它的速度很快(毫秒响应).但是当我选择表格中的其他字段时,速度会慢到 5-8 秒.

我正在使用 myisam 来利用全文搜索.其他索引表现良好(例如 select * from Listing where slug = 'xxxxx').

如何优化我的查询、表或索引以加快速度?

我的架构是:

CREATE TABLE `Listing` (`id` int(10) unsigned NOT NULL auto_increment,`name` varchar(125) 整理 utf8_unicode_ci 默认 NULL,`phone` varchar(18) 整理 utf8_unicode_ci 默认 NULL,`fax` varchar(18) 整理 utf8_unicode_ci 默认 NULL,`email` varchar(55) 整理 utf8_unicode_ci 默认 NULL,`photourl` varchar(55) 整理 utf8_unicode_ci 默认 NULL,`thumburl` varchar(5) 整理 utf8_unicode_ci 默认 NULL,`website` varchar(85) 整理 utf8_unicode_ci 默认 NULL,`categoryid` int(10) 无符号默认 NULL,`addressid` int(10) 无符号默认 NULL,`deleted` tinyint(1) 默认 NULL,`status` int(10) 无符号默认 '2',`parentid` int(10) 无符号默认 NULL,`organizationid` int(10) 无符号默认 NULL,`listinginfoid` int(10) 无符号默认 NULL,`createuserid` int(10) 无符号默认 NULL,`createdate` 日期时间默认 NULL,`lasteditdate` 时间戳 NOT NULL 默认 CURRENT_TIMESTAMP 更新 CURRENT_TIMESTAMP,`lastedituserid` int(10) 无符号默认 NULL,`slug` varchar(155) 整理 utf8_unicode_ci 默认 NULL,`aclid` int(10) 无符号默认 NULL,`alt_address` varchar(80) 整理 utf8_unicode_ci 默认 NULL,`alt_website` varchar(80) 整理 utf8_unicode_ci 默认 NULL,`lat` 十进制(10,7) 默认 NULL,`lon` 十进制(10,7) 默认 NULL,`city` varchar(80) 整理 utf8_unicode_ci 默认 NULL,`state` varchar(10) 整理 utf8_unicode_ci 默认 NULL,主键(`id`),KEY `idx_fetch` USING BTREE (`slug`,`deleted`),KEY `idx_loc` (`state`,`city`),KEY `idx_org` (`organizationid`,`status`,`deleted`),KEY `idx_geo_latlon` 使用 BTREE (`status`,`lat`,`lon`),FULLTEXT KEY `idx_name` (`name`)) 引擎=MyISAM 默认字符集=utf8 排序=utf8_unicode_ci ROW_FORMAT=DYNAMIC;

我的查询是:

选择 Listing.name、Listing.categoryid、Listing.lat、Listing.lon, 3956 * 2 * ASIN(SQRT( POWER(SIN((Listing.lat - 37.369195) * pi()/180/2), 2) + COS(Listing.lat * pi()/180) * COS(37.369195 *pi()/180) * POWER(SIN((Listing.lon --122.036849) * pi()/180/2), 2) )) rawgeosearchdistance从上市在哪里Listing.status = '2'AND(在 -122.10913433498 和 -121.96456366502 之间的 Listing.lon )AND(Listing.lat 介于 37.296909665016 和 37.441480334984 之间)HAVING rawgeosearchdistance <5ORDER BY rawgeosearchdistance ASC;

在没有地理搜索的情况下解释计划:

 +----+-------------+------------+--------+-----------------+-----------------+---------+------+------+--------------+|编号 |选择类型 |表|类型 |可能的键 |关键 |key_len |参考 |行 |额外 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+--------------+|1 |简单 |上市 |范围 |idx_geo_latlon |idx_geo_latlon |19 |空 |第453章使用位置 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+--------------+

用地理搜索解释计划:

<上一页>+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+------------------------------+|编号 |选择类型 |表|类型 |可能的键 |关键 |key_len |参考 |行 |额外 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+------------------------------+|1 |简单 |上市 |范围 |idx_geo_latlon |idx_geo_latlon |19 |空 |第453章使用哪里;使用文件排序 |+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+------------------------------+

这是带有覆盖索引的解释计划.以正确的顺序排列列会产生很大的不同:

<上一页>+----+-------------+--------+--------+---------------+----------------+----------+------+--------+---------------------------------------------------+|编号 |选择类型 |表|类型 |可能的键 |关键 |key_len |参考 |行 |额外 |+----+-------------+--------+--------+---------------+----------------+----------+------+--------+---------------------------------------------------+|1 |简单 |上市 |范围 |idx_geo_cover |idx_geo_cover |12 |空 |第453章使用哪里;使用索引;使用文件排序 |+----+-------------+--------+--------+---------------+----------------+----------+------+--------+---------------------------------------------------+

谢谢!

解决方案

您可能在仅 lat/lon 查询中使用了覆盖索引".当查询使用的索引包含您选择的数据时,就会出现覆盖索引.MySQL 只需要访问索引而不需要访问数据行.查看更多信息.这可以解释为什么纬度/经度查询如此之快.

我怀疑计算和返回的行数会减慢更长的查询速度.(加上必须为 having 子句创建的任何临时表).

I have a mysql (5.0.22) myisam table with roughly 300k records in it and I want to do a lat/lon distance search within a five mile radius.

I have an index that covers the lat/lon fields and is fast (milisecond response) when I just select for lat/lon. But when I select for additional fields in the table is slows down horribly to 5-8 seconds.

I'm using myisam to take advantage of fulltext search. The other indexes perform well (e.g. select * from Listing where slug = 'xxxxx').

How can I optimize my query, table or index to speed things up?

My schema is:

CREATE TABLE  `Listing` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `name` varchar(125) collate utf8_unicode_ci default NULL,
  `phone` varchar(18) collate utf8_unicode_ci default NULL,
  `fax` varchar(18) collate utf8_unicode_ci default NULL,
  `email` varchar(55) collate utf8_unicode_ci default NULL,
  `photourl` varchar(55) collate utf8_unicode_ci default NULL,
  `thumburl` varchar(5) collate utf8_unicode_ci default NULL,
  `website` varchar(85) collate utf8_unicode_ci default NULL,
  `categoryid` int(10) unsigned default NULL,
  `addressid` int(10) unsigned default NULL,
  `deleted` tinyint(1) default NULL,
  `status` int(10) unsigned default '2',
  `parentid` int(10) unsigned default NULL,
  `organizationid` int(10) unsigned default NULL,
  `listinginfoid` int(10) unsigned default NULL,
  `createuserid` int(10) unsigned default NULL,
  `createdate` datetime default NULL,
  `lasteditdate` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `lastedituserid` int(10) unsigned default NULL,
  `slug` varchar(155) collate utf8_unicode_ci default NULL,
  `aclid` int(10) unsigned default NULL,
  `alt_address` varchar(80) collate utf8_unicode_ci default NULL,
  `alt_website` varchar(80) collate utf8_unicode_ci default NULL,
  `lat` decimal(10,7) default NULL,
  `lon` decimal(10,7) default NULL,
  `city` varchar(80) collate utf8_unicode_ci default NULL,
  `state` varchar(10) collate utf8_unicode_ci default NULL,
  PRIMARY KEY  (`id`),
  KEY `idx_fetch` USING BTREE (`slug`,`deleted`),
  KEY `idx_loc` (`state`,`city`),
  KEY `idx_org` (`organizationid`,`status`,`deleted`),
  KEY `idx_geo_latlon` USING BTREE (`status`,`lat`,`lon`),
  FULLTEXT KEY `idx_name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=DYNAMIC;

My query is:

SELECT Listing.name, Listing.categoryid, Listing.lat, Listing.lon
, 3956 * 2 * ASIN(SQRT( POWER(SIN((Listing.lat - 37.369195) * pi()/180 / 2), 2) + COS(Listing.lat * pi()/180) * COS(37.369195 * pi()/180) * POWER(SIN((Listing.lon --122.036849) * pi()/180 / 2), 2) )) rawgeosearchdistance
FROM Listing
WHERE
    Listing.status = '2'
    AND ( Listing.lon between -122.10913433498 and -121.96456366502 )
    AND ( Listing.lat between 37.296909665016 and 37.441480334984)
HAVING rawgeosearchdistance < 5
ORDER BY rawgeosearchdistance ASC;

Explain plan without geosearch:

    +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-------------+
    | id | select_type | table      | type  | possible_keys   | key             | key_len |ref | rows | Extra       |
    +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-------------+
    |  1 | SIMPLE      | Listing    | range | idx_geo_latlon  | idx_geo_latlon  | 19      | NULL |  453 | Using where |
    +----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-------------+

Explain plan with geosearch:

+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-----------------------------+
| id | select_type | table      | type  | possible_keys   | key             | key_len | ref  | rows | Extra                       |
+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-----------------------------+
|  1 | SIMPLE      | Listing    | range | idx_geo_latlon  | idx_geo_latlon  | 19      | NULL |  453 | Using where; Using filesort |
+----+-------------+------------+-------+-----------------+-----------------+---------+------+------+-----------------------------+

Here's the explain plan with the covering index. Having the columns in the correct order made a big difference:

+----+-------------+--------+-------+---------------+---------------+---------+------+--------+------------------------------------------+
| id | select_type | table  | type  | possible_keys | key           | key_len | ref  | rows   | Extra                                    |
+----+-------------+--------+-------+---------------+---------------+---------+------+--------+------------------------------------------+
|  1 | SIMPLE      | Listing | range | idx_geo_cover | idx_geo_cover | 12      | NULL | 453     | Using where; Using index; Using filesort |
+----+-------------+--------+-------+---------------+---------------+---------+------+--------+------------------------------------------+

Thank you!

解决方案

You are probably using a 'covering index' in your lat/lon only query. A covering index occurs when the index used by the query contains the data that you are selecting for. MySQL only needs to visit the index and never the data rows. See this for more info. That would explain why the lat/lon query is so fast.

I suspect that the calculations and the sheer number of rows returned, slows down the longer query. (plus any temp table that has to be created for the having clause).

相关文章