How to efficiently access the features returned by QgsSpatialIndex?


9

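The PyQGIS Cookbook explains how to set up a spatial index, but it covers only half of its usage:

Create the spatial index: the following code creates an empty index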

index = QgsSpatialIndex()

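Add features to the index: the index takes a QgsFeature object and adds it to its internal data structure. You can create the object manually or use one from a previous call to the provider's nextFeature().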

index.insertFeature(feat)

Once the spatial index is filled with some values, you can run queries

# returns array of feature IDs of five nearest features
nearest = index.nearestNeighbor(QgsPoint(25.4, 12.7), 5)

What is the most efficient way to get the actual features that belong to the returned feature IDs?

Answers:


12
    # assume a list of feature ids returned from index and a QgsVectorLayer 'lyr'
    fids = [1, 2, 4]
    request = QgsFeatureRequest()
    request.setFilterFids(fids)

    features = lyr.getFeatures(request)
    # can now iterate and do fun stuff:
    for feature in features:
        print feature.id(), feature

    1 <qgis._core.QgsFeature object at 0x000000000E987510>
    2 <qgis._core.QgsFeature object at 0x000000000E987400>
    4 <qgis._core.QgsFeature object at 0x000000000E987510>
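
One detail worth checking with this approach: I would not rely on the features of a filtered request arriving in the same order as the ids the index returned, which matters for nearestNeighbor results. The following is a minimal sketch of re-keying one batched result by id. The helper names `index_features_by_id` and `in_id_order` are hypothetical, and `_StubFeature` is only a stand-in so the snippet runs outside QGIS; inside QGIS you would pass `lyr.getFeatures(request)` instead of the stub list.

```python
def index_features_by_id(features):
    """Map feature id -> feature for any iterable of objects exposing .id()."""
    return {feature.id(): feature for feature in features}


def in_id_order(fids, features):
    """Return the features in the order given by fids, skipping missing ids."""
    by_id = index_features_by_id(features)
    return [by_id[fid] for fid in fids if fid in by_id]


class _StubFeature(object):
    """Stand-in for QgsFeature (assumption: only .id() is needed here)."""

    def __init__(self, fid):
        self._fid = fid

    def id(self):
        return self._fid


# Pretend the provider returned the features in storage order ...
returned = [_StubFeature(1), _StubFeature(2), _StubFeature(4)]
# ... while the index ranked them nearest-first as 4, 1, 2.
ordered = in_id_order([4, 1, 2], returned)
print([f.id() for f in ordered])  # [4, 1, 2]
```

The dictionary is built from a single request, so the cost of the request is paid once regardless of how many ids are looked up afterwards.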

Thanks! Snorfalorpagus mentioned that setFilterFids will be considerably slower than the solution he posted. Can you confirm that?
Underdark

I haven't used it on a large result set, so I can't confirm.
gsherman

1
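I confirm it, and in my case rtree is even faster than QgsSpatialIndex() (for building planar graphs from very large polyline layers, a port of the PlanarGraph module with Shapely to PyQGIS; a solution with Fiona, Shapely and rtree is still the fastest).
gene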

1
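I think the question is about getting the actual features from the returned feature IDs, not about the speed of the various indexing methods.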
gsherman

7

In a blog post on this subject, Nathan Woodrow provides the following code:

layer = qgis.utils.iface.activeLayer()

# Select all features along with their attributes
allAttrs = layer.pendingAllAttributesList()
layer.select(allAttrs)
# Get all the features to start
allfeatures = {feature.id(): feature for (feature) in layer}

def noindex():
    for feature in allfeatures.values():
        for f in allfeatures.values():
            touches = f.geometry().touches(feature.geometry())
            # It doesn't matter if we don't return anything it's just an example

def withindex():
    # Build the spatial index for faster lookup.
    index = QgsSpatialIndex()
    map(index.insertFeature, allfeatures.values())

    # Loop each feature in the layer again and get only the features that are going to touch.
    for feature in allfeatures.values():
        ids = index.intersects(feature.geometry().boundingBox())
        for id in ids:
            f = allfeatures[id]
            touches = f.geometry().touches(feature.geometry())
            # It doesn't matter if we don't return anything it's just an example

import timeit
print "With Index: %s seconds " % timeit.timeit(withindex,number=1)
print "Without Index: %s seconds " % timeit.timeit(noindex,number=1)

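This creates a dictionary that lets you quickly look up a QgsFeature by its FID.

I have found that for very large layers this is not particularly practical, as it requires a lot of memory. However, the alternative (random access to the required features) using layer.getFeatures(QgsFeatureRequest().setFilterFid(fid)) seems comparatively slow. I am not sure why, as the equivalent call using the SWIG OGR bindings, layer.GetFeature(fid), seems much faster than this.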


1
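Using a dictionary is much, much faster than layer.getFeatures(QgsFeatureRequest().setFilterFid(fid)). I was working on a layer with 140k features, and the total time for the 140k lookups went from many minutes to seconds.
Håvard Tveite 2015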

5

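For comparison, look at "More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc". The solution presented there uses the Python modules Fiona, Shapely and rtree (spatial index).

With PyQGIS and the same example, two layers, point and polygon: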

1) Without a spatial index:

polygons = [feature for feature in polygon.getFeatures()]
points = [feature for feature in point.getFeatures()]
for pt in points: 
    point = pt.geometry()
    for pl  in polygons:
        poly = pl.geometry()
        if poly.contains(point):
            print point.asPoint(), poly.asPolygon()
(184127,122472) [[(183372,123361), (184078,123130), (184516,122631),   (184516,122265), (183676,122144), (183067,122570), (183128,123105), (183372,123361)]]
(183457,122850) [[(183372,123361), (184078,123130), (184516,122631), (184516,122265), (183676,122144), (183067,122570), (183128,123105), (183372,123361)]]
(184723,124043) [[(184200,124737), (185368,124372), (185466,124055), (185515,123714), (184955,123580), (184675,123471), (184139,123787), (184200,124737)]]
(182179,124067) [[(182520,125175), (183348,124286), (182605,123714), (182252,123544), (181753,123799), (181740,124627), (182520,125175)]]

2) With the R-Tree PyQGIS spatial index:

# build the spatial index with all the polygons and not only a bounding box
index = QgsSpatialIndex()
for poly in polygons:
     index.insertFeature(poly)

# intersections with the index 
# indices of the index for the intersections
for pt in points:
    point = pt.geometry()
    for id in index.intersects(point.boundingBox()):
        print id
0
0
1
2

What do these indices mean?

for i, pt in enumerate(points):
     point = pt.geometry()
     for id in index.intersects(point.boundingBox()):
        print "Point ", i, points[i].geometry().asPoint(), "is in Polygon ", id, polygons[id].geometry().asPolygon()
Point  1 (184127,122472) is in Polygon  0 [[(182520,125175), (183348,124286), (182605,123714), (182252,123544), (181753,123799), (181740,124627), (182520,125175)]]
Point  2 (183457,122850) is in Polygon  0 [[(182520,125175), (183348,124286), (182605,123714), (182252,123544), (181753,123799), (181740,124627), (182520,125175)]]
Point  4 (184723,124043) is in Polygon  1 [[(182520,125175), (183348,124286), (182605,123714), (182252,123544), (181753,123799), (181740,124627), (182520,125175)]]
Point  6 (182179,124067) is in Polygon  2 [[(182520,125175), (183348,124286), (182605,123714), (182252,123544), (181753,123799), (181740,124627), (182520,125175)]]

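The conclusions are the same as in "More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc":

  • Without an index, you must iterate through all the geometries (polygons and points).
  • With a bounding-box spatial index (QgsSpatialIndex()), you iterate only through the geometries that have a chance of intersecting the current geometry (a "filter" that can save a considerable amount of computation and time).
  • You can also use other spatial index Python modules (rtree, Pyrtree or Quadtree) with PyQGIS, as in "Using a QGIS spatial index to speed up your code" (with QgsSpatialIndex() and rtree).
  • But a spatial index is not a magic wand. When a very large part of the dataset has to be retrieved, a spatial index cannot give any speed benefit.

Another example on GIS SE: How to find the nearest line to a point in QGIS? [duplicate]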
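
The filter-then-refine idea in the list above can be sketched without QGIS at all. This is only an illustration under stated assumptions: a linear scan over bounding boxes stands in for the R-tree lookup (so it shows the two-step logic, not the speed-up), the exact test is a plain ray-casting point-in-polygon check rather than QgsGeometry.contains(), and all function names are made up for the example. The two rings are the first two distinct polygons printed in part 1; the point (184500, 123300) is an invented test point that lies inside the first bounding box but outside the polygon.

```python
def bounding_box(ring):
    """Return (xmin, ymin, xmax, ymax) of a ring of (x, y) tuples."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def box_contains(box, x, y):
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def point_in_ring(x, y, ring):
    """Ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / float(y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_then_refine(point, rings, boxes):
    """Step 1: cheap bounding-box filter. Step 2: exact test on candidates."""
    x, y = point
    candidates = [i for i, box in enumerate(boxes) if box_contains(box, x, y)]
    matches = [i for i in candidates if point_in_ring(x, y, rings[i])]
    return candidates, matches


rings = [
    [(183372, 123361), (184078, 123130), (184516, 122631), (184516, 122265),
     (183676, 122144), (183067, 122570), (183128, 123105), (183372, 123361)],
    [(184200, 124737), (185368, 124372), (185466, 124055), (185515, 123714),
     (184955, 123580), (184675, 123471), (184139, 123787), (184200, 124737)],
]
boxes = [bounding_box(ring) for ring in rings]

print(filter_then_refine((184127, 122472), rings, boxes))  # ([0], [0])
print(filter_then_refine((184723, 124043), rings, boxes))  # ([1], [1])
print(filter_then_refine((184500, 123300), rings, boxes))  # ([0], [])
```

The last call shows why the ids returned by a bounding-box index are only candidates: the exact geometry test still has to run, but only on the few features that passed the filter.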


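Thanks for all the additional explanations. Basically, your solution uses a list instead of a dict as Snorfalorpagus did. So there really seems to be no layer.getFeatures([ids]) function ...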
underdark

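The purpose of this explanation is purely geometric, and it is easy to add a layer.getFeatures([ids]) function, as in "More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc".
gene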

0

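Apparently the only way to get good performance is to avoid or bundle calls to layer.getFeatures(), even if the filter is as simple as a fid.

Now, here's the trap: calling getFeatures is expensive. If you call it on a vector layer, QGIS has to set up a new connection to the data store (the layer provider), create a query to return the data, and parse each result as it is returned from the provider. This can be slow, especially if you are working with some type of remote layer, such as a PostGIS table over a VPN connection.

Source: http://nyalldawson.net/2016/10/speeding-up-your-pyqgis-scripts/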
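
To make the "bundle the calls" advice concrete, here is a small simulation. It is a sketch under assumptions, not PyQGIS code: `CountingLayer` and its `get_features` method are hypothetical stand-ins that only count how often a request is issued, and `fetch_per_id` / `fetch_bundled` are made-up helper names. In PyQGIS the bundled variant would roughly correspond to a single request built with QgsFeatureRequest().setFilterFids(...), as in the first answer.

```python
class CountingLayer(object):
    """Hypothetical stand-in for a vector layer; counts feature requests."""

    def __init__(self, records):
        self._records = records  # dict: fid -> feature payload
        self.calls = 0

    def get_features(self, fids):
        """Return (fid, payload) pairs for the requested ids."""
        self.calls += 1
        wanted = set(fids)
        return [(fid, rec) for fid, rec in self._records.items() if fid in wanted]


def fetch_per_id(layer, id_lists):
    """One request per id: the pattern the quoted text warns about."""
    return [[layer.get_features([fid])[0] for fid in ids] for ids in id_lists]


def fetch_bundled(layer, id_lists):
    """One request for the union of all ids, then dictionary lookups."""
    needed = set(fid for ids in id_lists for fid in ids)
    cache = dict(layer.get_features(needed))
    return [[(fid, cache[fid]) for fid in ids] for ids in id_lists]


records = dict((fid, "feature %d" % fid) for fid in range(10))
# Three index queries whose results overlap.
id_lists = [[1, 2, 4], [2, 4, 7], [4, 7, 9]]

slow_layer = CountingLayer(records)
fast_layer = CountingLayer(records)
per_id = fetch_per_id(slow_layer, id_lists)
bundled = fetch_bundled(fast_layer, id_lists)

print(per_id == bundled)                   # True
print(slow_layer.calls, fast_layer.calls)  # 9 1
```

Both variants return the same features; the difference is nine simulated requests against one. For layers too large to cache entirely, the same idea applies to whatever subset of ids a batch of index queries produces.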

Licensed under cc by-sa 3.0 with attribution required.