ST_Distance does not use the index for spatial queries


10

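I cannot get PostGIS 2.1 on PostgreSQL 9.3.5 to use a spatial index, even for the simplest query. The whole dataset is 8 million points (a population count grid from here). The table is created as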

CREATE TABLE points (
    population DOUBLE PRECISION NOT NULL,
    location GEOGRAPHY(POINT, 4326) NOT NULL
);
CREATE INDEX points_gix ON points USING GIST(location);

The query is simple:

SELECT SUM(population)
FROM points
WHERE ST_Distance(
    location,
    ST_GeographyFromText('SRID=4326; POINT(0 0)')
) < 1000

PostgreSQL always uses a Seq Scan. I tried a subset of 10,000 points and it was still a Seq Scan. Any ideas?


3
You are not using any function that can use the index. Use ST_DWithin instead; that function does an index scan first.
Nicklas Avén, 2014

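Think about what your query is doing: it computes the distance from every point in the table to one fixed point. Then you will see why no index can be used. Instead, use an operator that can use an index, such as ST_DWithin.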
Vince

Answers:


19

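ST_Distance actually computes the distance between every pair of points, so it cannot use an index. Your query therefore does a sequential scan and then selects the geometries that are closer than the distance you specified. You are looking for ST_DWithin, which does use an index: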

SELECT SUM(population) FROM points 
WHERE ST_DWithin(location, ST_GeographyFromText('SRID=4326; POINT(0 0)'), 1000);

ST_Distance is more useful for ordering results (usually together with ORDER BY and/or LIMIT) that were obtained by a query that uses the index.
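As a minimal sketch of that last point, the query below lets ST_DWithin narrow the candidates through the index and then uses ST_Distance only to rank the rows that survive. It reuses the points table from the question; the distance_m alias and the LIMIT of 10 are illustrative assumptions, not part of the original query.

```sql
-- Sketch: ST_DWithin filters via the GiST index first,
-- then ST_Distance is evaluated only for the matching rows.
SELECT population,
       ST_Distance(
           location,
           ST_GeographyFromText('SRID=4326;POINT(0 0)')
       ) AS distance_m              -- geography distances are in meters
FROM points
WHERE ST_DWithin(
          location,
          ST_GeographyFromText('SRID=4326;POINT(0 0)'),
          1000
      )
ORDER BY distance_m                 -- nearest first
LIMIT 10;                           -- illustrative cutoff (assumption)
```

Prefixing the statement with EXPLAIN should show whether the planner chose an index scan or a Seq Scan on your data.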


1
Thanks. I really should have read the documentation before asking.
Synapse

1
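Wow! Thanks! By changing st_distance to st_dwithin you just "sped up" my slow query by 100x or more. (I say "sped up" because it would never have been slow in the first place had I been more careful.)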
Hendy Irawan

1
@HendyIrawan. You're welcome. It is an easy mistake to make.
John Powell

@JohnPowellakaBarça I added another optimization (a very lossy one; I posted an answer for my case), but you definitely pointed me in the right direction, thanks.
Hendy Irawan

4

As @JohnPowellakaBarça said, ST_DWithin() is the way to go when you want correct results.

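However, in my case I only want a rough estimate, so even ST_DWithin() is too expensive (in query cost) for my needs. I use && with ST_Expand(box2d) (not to be confused with the geometry version). Example: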

SELECT * FROM profile
  WHERE
    address_point IS NOT NULL AND
    address_point && CAST(ST_Expand(CAST(ST_GeomFromText(:point) AS box2d), 0.5) AS geometry);

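Obviously, we are now dealing with degrees rather than meters, and with a bounding box rather than a circle on the sphere. For my use case this went from 24 ms down to just 2 ms (locally, on an SSD). However, on my production database in AWS RDS PostgreSQL, with concurrent connections and a barely sufficient IOPS quota (100 IOPS), the original ST_DWithin() query consumed too many IOPS and could take more than 2000 ms, and it got much worse once the IOPS quota ran out.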
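To make the degrees-versus-meters point concrete, here is a rough sketch of how a radius in meters could be turned into a box half-width in degrees. The figure of about 111320 meters per degree and the latitude of -6.9 are assumptions chosen for illustration; this is a spherical approximation, not what ST_DWithin computes.

```sql
-- Rough conversion of a 100000 m radius into degrees (assumed constants).
-- One degree of latitude is taken as roughly 111320 m; one degree of
-- longitude shrinks by cos(latitude) as you move away from the equator.
SELECT
    100000 / 111320.0                          AS lat_half_width_deg,
    100000 / (111320.0 * cos(radians(-6.9)))   AS lon_half_width_deg;
```

Because ST_Expand on a box2d grows the box by the same amount in both directions, one would typically pass the larger of the two values, which makes the box slightly generous in latitude.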

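This is not for everyone, but if you can trade some accuracy for speed (or for fewer IOPS), this approach may suit you. As the query plans below show, ST_DWithin needs a spatial Filter in the Bitmap Heap Scan in addition to the Recheck Cond, whereas && against a box geometry needs no Filter and uses only the Recheck Cond.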

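I also noticed that the IS NOT NULL matters: without it you get a worse query plan. It seems the GIST index is not "smart" enough for this. (Of course, if your column is declared NOT NULL, you do not need the IS NOT NULL check.)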

20000-row table, ST_DWithin(geography, geography, 100000, FALSE), on AWS RDS with 512 MB RAM and 300 IOPS:

Aggregate  (cost=4.61..4.62 rows=1 width=8) (actual time=2011.358..2011.358 rows=1 loops=1)
  ->  Bitmap Heap Scan on matchprofile  (cost=2.83..4.61 rows=1 width=0) (actual time=1735.025..2010.635 rows=1974 loops=1)
        Recheck Cond: (((address_point IS NOT NULL) AND (address_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography)) OR ((hometown_point IS NOT NULL) AND (hometown_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography)))
        Filter: (((status)::text = 'ACTIVE'::text) AND ((gender)::text = 'MALE'::text) AND (((address_point IS NOT NULL) AND (address_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography) AND ('0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography && _st_expand(address_point, '100000'::double precision)) AND _st_dwithin(address_point, '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography, '100000'::double precision, false)) OR ((hometown_point IS NOT NULL) AND (hometown_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography) AND ('0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography && _st_expand(hometown_point, '100000'::double precision)) AND _st_dwithin(hometown_point, '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography, '100000'::double precision, false))))
        Rows Removed by Filter: 3323
        Heap Blocks: exact=7014
        ->  BitmapOr  (cost=2.83..2.83 rows=1 width=0) (actual time=1716.425..1716.425 rows=0 loops=1)
              ->  Bitmap Index Scan on ik_matchprofile_address_point  (cost=0.00..1.42 rows=1 width=0) (actual time=1167.698..1167.698 rows=16086 loops=1)
                    Index Cond: ((address_point IS NOT NULL) AND (address_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography))
              ->  Bitmap Index Scan on ik_matchprofile_hometown_point  (cost=0.00..1.42 rows=1 width=0) (actual time=548.723..548.723 rows=7846 loops=1)
                    Index Cond: ((hometown_point IS NOT NULL) AND (hometown_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography))
Planning time: 47.366 ms
Execution time: 2011.429 ms

20000-row table, && with ST_Expand(box2d), on AWS RDS with 512 MB RAM and 300 IOPS:

Aggregate  (cost=3.85..3.86 rows=1 width=8) (actual time=584.346..584.346 rows=1 loops=1)
  ->  Bitmap Heap Scan on matchprofile  (cost=2.83..3.85 rows=1 width=0) (actual time=555.048..584.083 rows=1154 loops=1)
        Recheck Cond: (((address_point IS NOT NULL) AND (address_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography)) OR ((hometown_point IS NOT NULL) AND (hometown_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography)))
        Filter: (((status)::text = 'ACTIVE'::text) AND ((gender)::text = 'MALE'::text))
        Rows Removed by Filter: 555
        Heap Blocks: exact=3812
        ->  BitmapOr  (cost=2.83..2.83 rows=1 width=0) (actual time=553.091..553.091 rows=0 loops=1)
              ->  Bitmap Index Scan on ik_matchprofile_address_point  (cost=0.00..1.42 rows=1 width=0) (actual time=413.074..413.074 rows=4850 loops=1)
                    Index Cond: ((address_point IS NOT NULL) AND (address_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography))
              ->  Bitmap Index Scan on ik_matchprofile_hometown_point  (cost=0.00..1.42 rows=1 width=0) (actual time=140.014..140.014 rows=3100 loops=1)
                    Index Cond: ((hometown_point IS NOT NULL) AND (hometown_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography))
Planning time: 0.673 ms
Execution time: 584.386 ms

Again, with a simpler query:

20000-row table, ST_DWithin(geography, geography, 100000, FALSE), on AWS RDS with 512 MB RAM and 300 IOPS:

Aggregate  (cost=4.60..4.61 rows=1 width=8) (actual time=36.448..36.448 rows=1 loops=1)
  ->  Bitmap Heap Scan on matchprofile  (cost=2.83..4.60 rows=1 width=0) (actual time=7.694..35.545 rows=2982 loops=1)
        Recheck Cond: (((address_point IS NOT NULL) AND (address_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography)) OR ((hometown_point IS NOT NULL) AND (hometown_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography)))
        Filter: (((address_point IS NOT NULL) AND (address_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography) AND ('0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography && _st_expand(address_point, '100000'::double precision)) AND _st_dwithin(address_point, '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography, '100000'::double precision, true)) OR ((hometown_point IS NOT NULL) AND (hometown_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography) AND ('0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography && _st_expand(hometown_point, '100000'::double precision)) AND _st_dwithin(hometown_point, '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography, '100000'::double precision, true)))
        Rows Removed by Filter: 2322
        Heap Blocks: exact=2947
        ->  BitmapOr  (cost=2.83..2.83 rows=1 width=0) (actual time=7.197..7.197 rows=0 loops=1)
              ->  Bitmap Index Scan on ik_matchprofile_address_point  (cost=0.00..1.41 rows=1 width=0) (actual time=5.265..5.265 rows=5680 loops=1)
                    Index Cond: ((address_point IS NOT NULL) AND (address_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography))
              ->  Bitmap Index Scan on ik_matchprofile_hometown_point  (cost=0.00..1.41 rows=1 width=0) (actual time=1.930..1.930 rows=2743 loops=1)
                    Index Cond: ((hometown_point IS NOT NULL) AND (hometown_point && '0101000020E6100000744694F606E75A40D49AE61DA7A81BC0'::geography))
Planning time: 0.479 ms
Execution time: 36.512 ms

20000-row table, && with ST_Expand(box2d), on AWS RDS with 512 MB RAM and 300 IOPS:

Aggregate  (cost=3.84..3.85 rows=1 width=8) (actual time=6.263..6.264 rows=1 loops=1)
  ->  Bitmap Heap Scan on matchprofile  (cost=2.83..3.84 rows=1 width=0) (actual time=4.295..5.864 rows=1711 loops=1)
        Recheck Cond: (((address_point IS NOT NULL) AND (address_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography)) OR ((hometown_point IS NOT NULL) AND (hometown_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography)))
        Heap Blocks: exact=1419
        ->  BitmapOr  (cost=2.83..2.83 rows=1 width=0) (actual time=4.122..4.122 rows=0 loops=1)
              ->  Bitmap Index Scan on ik_matchprofile_address_point  (cost=0.00..1.41 rows=1 width=0) (actual time=3.018..3.018 rows=1693 loops=1)
                    Index Cond: ((address_point IS NOT NULL) AND (address_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography))
              ->  Bitmap Index Scan on ik_matchprofile_hometown_point  (cost=0.00..1.41 rows=1 width=0) (actual time=1.102..1.102 rows=980 loops=1)
                    Index Cond: ((hometown_point IS NOT NULL) AND (hometown_point && '0103000020E61000000100000005000000744694F606C75A40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A819C0744694F606075B40D49AE61DA7A819C0744694F606075B40D49AE61DA7A81DC0744694F606C75A40D49AE61DA7A81DC0'::geography))
Planning time: 0.399 ms
Execution time: 6.306 ms

1
Well written, and interesting.
John Powell
Licensed under cc by-sa 3.0 with attribution required.