Geolocation MySQL Query
I run a geolocation-based social network. Members can see other members based on how close they are.
Right now my MySQL query looks like this:
$lat_min = $geo['user_lat'] - 5;
$lat_max = $geo['user_lat'] + 5;
$long_min = $geo['user_long'] - 5;
$long_max = $geo['user_long'] + 5;
$members_query = "SELECT " . $fields . " FROM members WHERE (user_lat BETWEEN " . $lat_min . " AND " . $lat_max . " AND user_long BETWEEN " . $long_min . " AND " . $long_max . ") OR (gc_lat BETWEEN " . $lat_min . " AND " . $lat_max . " AND gc_long BETWEEN " . $long_min . " AND " . $long_max . ")";
user_lat and user_long are coordinates based on geolocation, if the member has it turned on in their browser. gc_lat and gc_long are coordinates based on their IP address. These columns are all indexed in the database. I am pulling all members within 10 degrees.
The problem is that this query takes about 2 seconds to run for our 250,000+ members, and we want the site to scale.
ATTEMPT 2: I tried assigning a quadrant to each member, e.g. "36x-99". I round the latitude and longitude to the nearest multiple of 3 to label the quadrant, and then I only pull quadrants within 12 degrees of the quadrant the member is in.
$members_query = "SELECT " . $fields . " FROM members WHERE quadrant IN ('36x-99', '33x-99', '30x-99', ...)";
This made no noticeable difference in query speed.
Does anyone have any ideas about what I should do? I need to find a solution that will allow the site to scale better.
Recommended answer
The problem is that the way you are storing data in the database is not suited to the type of task you are performing. Storing the coordinates as POINT values in a GEOMETRY-type column is the way to go. I actually coded something for this purpose 4+ years back, but I am having trouble finding it. This post seems to cover it well, though.
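For the members table in the question, the change could look roughly like the sketch below. It is only an illustration under stated assumptions: the member_locations table and the members.id column are hypothetical names, MyISAM is chosen because it supports SPATIAL indexes on older MySQL versions, and each member gets a single point (the browser coordinates when both are present, otherwise the IP-based ones), which differs from the original query's OR across both pairs. A SPATIAL index requires a NOT NULL column, so members with no usable coordinates are skipped.

```sql
/* Hypothetical side table holding one indexed point per member. */
CREATE TABLE member_locations (
  member_id INT NOT NULL,
  coordinates POINT NOT NULL,
  PRIMARY KEY (member_id),
  SPATIAL KEY idx_coordinates (coordinates)
) ENGINE=MyISAM;

/* Copy coordinates over, preferring the browser pair and falling back to the IP-based pair. */
/* x holds latitude and y holds longitude. `members`.`id` is an assumed column name. */
INSERT INTO member_locations (member_id, coordinates)
SELECT id,
       POINT(IF(user_lat IS NOT NULL AND user_long IS NOT NULL, user_lat, gc_lat),
             IF(user_lat IS NOT NULL AND user_long IS NOT NULL, user_long, gc_long))
FROM members
WHERE (user_lat IS NOT NULL AND user_long IS NOT NULL)
   OR (gc_lat IS NOT NULL AND gc_long IS NOT NULL);
```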
EDIT: Okay, I found my old code, but it refers to old client data that I obviously cannot share. The key to speed with coordinates in databases is storing them as POINT data in a table column of a GEOMETRY type. More details are here on the official MySQL site. Since I have needed a reason to revisit this type of code, and the concepts, for a while, here is a quick MySQL script I whipped up to create a sample table with sample data to convey the basic concepts. Once you understand what is happening, it opens up lots of cool options.
I also found this great, simple explanation of the concept.
And I found another great assessment of spatial data in MySQL 5.6, with lots of useful information on indexes and performance. Specifically, regarding MySQL spatial index performance:
MyISAM tables support Spatial indexes, so the above queries will use those indexes.
On the other hand:
The InnoDB engine does not support spatial indexes, so those queries will be slow.
(That assessment covers MySQL 5.6. InnoDB gained SPATIAL index support in MySQL 5.7.5.)
And here are my basic MySQL test scripts to help illustrate the concept:
/* Create the database `spatial_test` */
CREATE DATABASE `spatial_test` CHARACTER SET utf8 COLLATE utf8_general_ci;
/* Create the table `locations` in `spatial_test` */
CREATE TABLE `spatial_test`.`locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`coordinates` point NOT NULL,
UNIQUE KEY `id` (`id`),
SPATIAL KEY `idx_coordinates` (`coordinates`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
/* Insert some test data into it. */
INSERT INTO `spatial_test`.`locations` (`id`, `coordinates`) VALUES (NULL, GeomFromText('POINT(27.174961 78.041822)'));
INSERT INTO `spatial_test`.`locations` (`id`, `coordinates`) VALUES (NULL, GeomFromText('POINT(27.985818 86.923596)'));
INSERT INTO `spatial_test`.`locations` (`id`, `coordinates`) VALUES (NULL, GeomFromText('POINT(44.427963 -110.588455)'));
INSERT INTO `spatial_test`.`locations` (`id`, `coordinates`) VALUES (NULL, GeomFromText('POINT(19.896766 -155.582782)'));
INSERT INTO `spatial_test`.`locations` (`id`, `coordinates`) VALUES (NULL, GeomFromText('POINT(40.748328 -73.985560)'));
INSERT INTO `spatial_test`.`locations` (`id`, `coordinates`) VALUES (NULL, GeomFromText('POINT(40.782710 -73.965310)'));
/* A sample SELECT query that extracts the 'latitude' & 'longitude' */
SELECT x(`spatial_test`.`locations`.`coordinates`) AS latitude, y(`spatial_test`.`locations`.`coordinates`) AS longitude FROM `spatial_test`.`locations`;
/* Another sample SELECT query that uses GLength to calculate the distance from every item in the table to another set of coordinates. The result is a planar length in coordinate units (degrees here), not miles or kilometers. */
SELECT GLength(LineStringFromWKB(LineString(GeomFromText(astext(PointFromWKB(`spatial_test`.`locations`.`coordinates`))), GeomFromText(astext(PointFromWKB(POINT(40.782710,-73.965310))))))) AS distance
FROM `spatial_test`.`locations`
;
/* Yet another sample SELECT query that selects items by great-circle distance using the Earth’s radius. The 'HAVING distance < 100' equates to a distance of less than 100 miles or kilometers, depending on which radius you use in the query. */
/* Earth’s mean radius in kilometers: 6371 */
/* Earth’s mean radius in miles: 3959 */
/* LEAST(1, ...) keeps rounding error from pushing the acos() argument above 1, which would return NULL for a row located at the reference point itself. */
SELECT id, (3959 * acos(LEAST(1, cos(radians(40.782710)) * cos(radians(x(`spatial_test`.`locations`.`coordinates`))) * cos(radians(y(`spatial_test`.`locations`.`coordinates`)) - radians(-73.965310)) + sin(radians(40.782710)) * sin(radians(x(`spatial_test`.`locations`.`coordinates`)))))) AS distance
FROM `spatial_test`.`locations`
HAVING distance < 100
ORDER BY id
;
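None of the sample queries above restricts rows through the spatial index, so each one still reads the whole table. Here is a minimal sketch of a two-step filter against the same `spatial_test`.`locations` table. MBRContains first narrows the candidates to a bounding rectangle, which is the kind of test a SPATIAL index can serve, and the great-circle formula then runs only on those rows. The 5-degree half-width and the 100-mile cutoff are illustrative values, not recommendations. The function names match the older script. On MySQL 5.7 and later, the ST_-prefixed names (ST_GeomFromText, ST_X, ST_Y) are the current ones.

```sql
/* Rectangle of +/- 5 degrees around (40.782710, -73.965310). */
/* x holds latitude and y holds longitude, as in the script above. */
SET @box = GeomFromText('POLYGON((35.782710 -78.965310,45.782710 -78.965310,45.782710 -68.965310,35.782710 -68.965310,35.782710 -78.965310))');

/* Step 1 (WHERE): keep only points whose bounding rectangle lies inside @box. */
/* Step 2 (HAVING): compute the distance in miles for the survivors and apply the cutoff. */
SELECT id,
       (3959 * acos(LEAST(1,
              cos(radians(40.782710)) * cos(radians(x(coordinates)))
            * cos(radians(y(coordinates)) - radians(-73.965310))
            + sin(radians(40.782710)) * sin(radians(x(coordinates)))))) AS distance
FROM `spatial_test`.`locations`
WHERE MBRContains(@box, coordinates)
HAVING distance < 100
ORDER BY distance;
```

With the six sample rows, only the two New York points fall inside the rectangle, so the distance expression is evaluated for two rows instead of six. Running the SELECT under EXPLAIN is a quick way to check whether idx_coordinates is actually chosen on a given server version and engine.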