More efficient spatial join in Python without QGIS, ArcGIS, PostGIS, etc.

31

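I am attempting to do a spatial join much like the example here: Is there a python option to "join attributes by location"?. However, that approach seems really inefficient and slow. Even running it with a modest 250 points takes almost 2 minutes, and it fails entirely on shapefiles with more than 1,000 points. Is there a better approach? I would like to do this entirely in Python, without using ArcGIS, QGIS, etc.

I would also like to know whether it is possible to SUM the attributes (e.g. population) of all the points that fall within a polygon and join that total to the polygon shapefile.

Here is the code I am trying to convert. I get an error on line 9: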

poly['properties']['score'] += point['properties']['score']

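which says:

TypeError: unsupported operand type(s) for +=: 'NoneType' and 'float'

If I replace the "+=" with "=" it runs fine, but that does not sum the fields. I have also tried casting the values to integers, but that fails too.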

with fiona.open(poly_shp, 'r') as n: 
  with fiona.open(point_shp,'r') as s:
    outSchema = {'geometry': 'Polygon','properties':{'region':'str','score':'float'}}
    with fiona.open (out_shp, 'w', 'ESRI Shapefile', outSchema, crs) as output:
        for point in s:
            for poly in n:
                if shape(point['geometry']).within(shape(poly['geometry'])):  
                    poly['properties']['score'] += point['properties']['score']
                    output.write({
                        'properties':{
                            'region':poly['properties']['NAME'],
                            'score':poly['properties']['score']},
                        'geometry':poly['geometry']})

spatial-join fiona typeerror

— jburrfischer

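I think you should edit the second question out of here so that this question stays focused on what I assume is the more important one for you. The other can be researched and asked separately.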

— PolyGeo

37

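Fiona returns Python dictionaries, and you cannot use poly['properties']['score'] += point['properties']['score'] with a dictionary.

Here is an example of summing attributes, using the references given by Mike T: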

# read the shapefiles
import fiona
from shapely.geometry import shape
polygons = [pol for pol in fiona.open('poly.shp')]
points = [pt for pt in fiona.open('point.shp')]
# attributes of the polygons
for poly in polygons:
    print(poly['properties'])
# OrderedDict([(u'score', 0)])
# OrderedDict([(u'score', 0)])
# OrderedDict([(u'score', 0)])

# attributes of the points
for pt in points:
    print(pt['properties'])
# OrderedDict([(u'score', 1)])
# .... (same for the 8 points)

Now we can use two methods, with or without a spatial index:

1) Without a spatial index

# iterate through points
for i, pt in enumerate(points):
    point = shape(pt['geometry'])
    # iterate through polygons
    for j, poly in enumerate(polygons):
        if point.within(shape(poly['geometry'])):
            # sum of attributes values
            polygons[j]['properties']['score'] = polygons[j]['properties']['score'] + points[i]['properties']['score']

2) With an R-tree index (you can use pyrtree or rtree)

# Create the R-tree index and store the features in it (bounding box)
from rtree import index
idx = index.Index()
for pos, poly in enumerate(polygons):
    idx.insert(pos, shape(poly['geometry']).bounds)

# iterate through points
for i, pt in enumerate(points):
    point = shape(pt['geometry'])
    # iterate only through the polygons whose bounding box contains the point
    for j in idx.intersection(point.coords[0]):
        if point.within(shape(polygons[j]['geometry'])):
            polygons[j]['properties']['score'] = polygons[j]['properties']['score'] + points[i]['properties']['score']

Result with both solutions:

for poly in polygons:
    print(poly['properties'])
# OrderedDict([(u'score', 2)])  # 2 points in the polygon
# OrderedDict([(u'score', 1)])  # 1 point in the polygon
# OrderedDict([(u'score', 1)])  # 1 point in the polygon

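What is the difference?

Without the index, you must iterate through all the geometries (polygons and points).
With a bounding spatial index (Spatial Index RTree), you iterate only through the geometries that have a chance of intersecting the current geometry (a "filter" that can save a considerable amount of computation and time).
But a spatial index is not a magic wand. When a very large part of the dataset has to be retrieved, a spatial index cannot give any speed benefit.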
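
To make the filtering effect concrete, here is a minimal pure-Python sketch that counts how many exact tests each approach performs. It is only an illustration under loud assumptions: a uniform grid stands in for the R-tree, axis-aligned boxes stand in for polygons (so the box test is also the exact test), and the data, `CELL` size, and function names are all made up for this example.

```python
from collections import defaultdict

# Hypothetical data: each "polygon" is a box (minx, miny, maxx, maxy).
boxes = {0: (0, 0, 10, 10), 1: (20, 0, 30, 10), 2: (0, 20, 10, 30)}
points = [(5, 5), (25, 5), (5, 25), (6, 6), (50, 50)]

CELL = 10  # grid cell size, an arbitrary choice for this sketch


def cells_for_box(box):
    """Yield every grid cell that a box overlaps."""
    minx, miny, maxx, maxy = box
    for cx in range(int(minx // CELL), int(maxx // CELL) + 1):
        for cy in range(int(miny // CELL), int(maxy // CELL) + 1):
            yield (cx, cy)


def contains(box, pt):
    """Exact test (trivial here because the polygons are boxes)."""
    minx, miny, maxx, maxy = box
    return minx <= pt[0] <= maxx and miny <= pt[1] <= maxy


# Build the index once: grid cell -> ids of the boxes touching that cell.
grid = defaultdict(set)
for box_id, box in boxes.items():
    for cell in cells_for_box(box):
        grid[cell].add(box_id)


def join(pts, use_index):
    """Return (matches, number_of_exact_tests)."""
    matches, tests = [], 0
    for pt in pts:
        if use_index:
            cell = (int(pt[0] // CELL), int(pt[1] // CELL))
            candidates = grid.get(cell, set())  # filter step
        else:
            candidates = boxes.keys()  # every polygon is a candidate
        for box_id in sorted(candidates):
            tests += 1  # refine step: one exact test per candidate
            if contains(boxes[box_id], pt):
                matches.append((pt, box_id))
    return matches, tests


brute_matches, brute_tests = join(points, use_index=False)
index_matches, index_tests = join(points, use_index=True)
print(brute_tests, index_tests)
```

Both calls find the same matches, but the indexed version runs far fewer exact tests. If every point fell in a cell shared by every polygon, the two counts would converge, which is the "not a magic wand" case.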

Afterwards, write the result:

# reuse the schema of the input polygons for the output file
with fiona.open('poly.shp') as src:
    schema = src.schema
with fiona.open('output.shp', 'w', 'ESRI Shapefile', schema) as output:
    for poly in polygons:
        output.write(poly)

To go further, see Using Rtree Spatial Indexing With OGR, Shapely, Fiona.

— gene

15

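Additionally, geopandas now optionally includes rtree as a dependency; see the GitHub repo.

So, instead of following all the (very nice) code above, you could simply do something like: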

import geopandas
from geopandas.tools import sjoin
point = geopandas.GeoDataFrame.from_file('point.shp') # or geojson etc
poly = geopandas.GeoDataFrame.from_file('poly.shp')
pointInPolys = sjoin(point, poly, how='left')
# the column names below are placeholders; select several columns with a list
pointSumByPoly = pointInPolys.groupby('PolyGroupByField')[['fields', 'in', 'grouped', 'output']].agg(['sum'])

To get this funky functionality, be sure to install the C library libspatialindex first.

EDIT: corrected package imports

— 粘土人

I was under the impression that rtree was optional. Doesn't that mean you need to install rtree as well as the libspatialindex C library?

— kuanb '17

It has been a while, but I think that when I did this, installing geopandas from GitHub automatically added rtree once I had installed libspatialindex first... they have done

— 相当重要的

9

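Use Rtree as an index to perform much faster joins, then use Shapely for the spatial predicates to determine whether a point is actually within a polygon. If done properly, this can be faster than most other GISes.

See examples here or here.

As for the second part of your question, about "SUM", use a dict object to accumulate populations, with a polygon id as the key. That said, this kind of thing is done much more nicely with PostGIS.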
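
The dict-based accumulation can be sketched as follows. This is an editorial illustration rather than the answerer's code: the `matches` list is a hypothetical stand-in for the output of a spatial join, and the polygon ids, names, and scores are invented.

```python
from collections import defaultdict

# Hypothetical join output: (polygon_id, point_properties) pairs, as a
# point-in-polygon join might produce them. All values are made up.
matches = [
    (0, {"score": 1.0}),
    (0, {"score": 2.5}),
    (1, {"score": 4.0}),
]

# Polygon records shaped like Fiona features (plain dicts in this sketch).
polygons = {
    0: {"properties": {"NAME": "A", "score": 0.0}},
    1: {"properties": {"NAME": "B", "score": 0.0}},
    2: {"properties": {"NAME": "C", "score": 0.0}},
}

# Accumulate one running total per polygon id; unseen ids start at 0.0.
totals = defaultdict(float)
for poly_id, props in matches:
    totals[poly_id] += props["score"]

# Copy the totals back into the polygon records before writing them out.
for poly_id, poly in polygons.items():
    poly["properties"]["score"] = totals.get(poly_id, 0.0)

print({pid: p["properties"]["score"] for pid, p in polygons.items()})
```

Keeping the totals in a separate dict means each polygon is written once, after all points have been processed, instead of once per matching point.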

— Mike T

Thank you @Mike T... using a dict object or PostGIS are great suggestions. However, I am still a bit confused about where I can incorporate Rtree into my code (included above).

— jburrfischer 2014

1

This web page shows how to use a bounding-box point-in-polygon search before Shapely's more expensive within spatial query.

http://rexdouglass.com/fast-spatial-joins-in-python-with-a-spatial-index/
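
The idea of a cheap bounding-box check before the expensive test can be sketched in pure Python. This is not the linked page's code: an even-odd ray-casting function stands in for Shapely's `within`, and the helper names and the triangle are hypothetical.

```python
def bounds(ring):
    """Return (minx, miny, maxx, maxy) for a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_ring(pt, ring):
    """Even-odd ray casting; boundary points are not handled specially."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def within(pt, ring, ring_bounds):
    """Reject with the bounding box first, then run the exact test."""
    minx, miny, maxx, maxy = ring_bounds
    if not (minx <= pt[0] <= maxx and miny <= pt[1] <= maxy):
        return False  # cheap rejection, no exact test needed
    return point_in_ring(pt, ring)


triangle = [(0, 0), (10, 0), (0, 10)]
tri_bounds = bounds(triangle)  # computed once, reused for every point

print(within((2, 2), triangle, tri_bounds))    # inside the triangle
print(within((8, 8), triangle, tri_bounds))    # inside the box only
print(within((20, 20), triangle, tri_bounds))  # outside the box
```

Computing the bounds once per polygon is what makes the prefilter pay off: most points are rejected with four comparisons.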

— klewis

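Thanks @klewis... any chance you can help with the second part? To sum the point attributes (e.g. population) that fall within the polygons, I tried something similar to the code below, but it threw an error. if shape(school['geometry']).within(shape(neighborhood['geometry'])):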

— inside

If you open neighborhood in 'r' mode, it might be read-only. Do both shapefiles have the population field? Which line is throwing the error? Good luck.

— klewis

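Thank you again @klewis... I have added my code above and explained the error. Also, I have been playing around with rtree, and I am still a bit confused about where I would add it to the code above. Sorry to be such a bother.

— jburrfischer 2014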

Try this; it seems that adding None to an int causes the error. poly_score = poly['properties']['score'] point_score = point['properties']['score'] if point_score: if poly_score: poly['properties']['score'] += point_score else: poly['properties']['score'] = point_score

— klewis 2014
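
The TypeError from the question can be reproduced with plain dicts, which may help show why the comment's guard works. This is a sketch under an assumption: the polygon's score is taken to be None (presumably an empty field in the shapefile), which is what the 'NoneType' in the error message suggests. The records are made up.

```python
# Records shaped like Fiona features; the values are invented.
poly = {"properties": {"NAME": "A", "score": None}}  # assumed empty field
point = {"properties": {"score": 2.5}}

# Reproduce the failure: None += float is not defined.
try:
    poly["properties"]["score"] += point["properties"]["score"]
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Guard: treat a missing value as 0 before adding.
poly_score = poly["properties"]["score"] or 0
point_score = point["properties"]["score"] or 0
poly["properties"]["score"] = poly_score + point_score
print(poly["properties"]["score"])
```

The `or 0` form is a compact variant of the nested if/else in the comment; it gives the same result here because a missing value and a zero contribute the same amount to a sum.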