From the sp::over help:
 x = "SpatialPoints", y = "SpatialPolygons" returns a numeric
      vector of length equal to the number of points; the number is
      the index (number) of the polygon of ‘y’ in which a point
      falls; NA denotes the point does not fall in a polygon; if a
      point falls in multiple polygons, the last polygon is
      recorded.
So if you convert your SpatialPolygonsDataFrame to SpatialPolygons you get back a vector of indexes, and you can subset your points on NA:
> over(pts,as(ply,"SpatialPolygons"))
  [1] NA  1  1 NA  1  1 NA NA  1  1  1 NA NA  1  1  1  1  1 NA NA NA  1 NA  1 NA
 [26]  1  1  1 NA NA NA NA NA  1  1 NA NA NA  1  1  1 NA  1  1  1 NA NA NA  1  1
 [51]  1 NA NA NA  1 NA  1 NA  1 NA NA  1 NA  1  1 NA  1  1 NA  1 NA  1  1  1  1
 [76]  1  1  1  1  1 NA NA NA  1 NA  1 NA NA NA NA  1  1 NA  1 NA NA  1  1  1 NA
> nrow(pts)
[1] 100
> pts = pts[!is.na(over(pts,as(ply,"SpatialPolygons"))),]
> nrow(pts)
[1] 54
> head(pts@data)
         var1 var2
2  0.04001092    v
3  0.58108350    v
5  0.85682609    q
6  0.13683264    y
9  0.13968804    m
10 0.97144627    o
> 
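For anyone who wants to try the index-vector behaviour without the data above, here is a minimal sketch on made-up data. The unit square, the three points and the names sq, toy_ply, toy_pts and inside are all hypothetical; only over() and the sp constructors come from the package.

```r
library(sp)

# Hypothetical toy data: one unit square and three points
sq <- Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))
toy_ply <- SpatialPolygons(list(Polygons(list(sq), ID = "a")))
toy_pts <- SpatialPointsDataFrame(
  cbind(x = c(0.5, 2, 0.25), y = c(0.5, 2, 0.75)),
  data = data.frame(z = c("in", "out", "in"))
)

# With SpatialPolygons as y, over() gives one polygon index per point
# and NA for points that fall in no polygon
idx <- over(toy_pts, toy_ply)

# Keep only the points that landed in a polygon
inside <- toy_pts[!is.na(idx), ]
```

Here the second point lies outside the square, so it should be the only one dropped.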
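For the doubters, here's evidence that the conversion overhead isn't a problem:
Four functions - first Jeffrey Evans' method, then my original, then a hacked conversion, then a gIntersects version based on Josh O'Brien's answer: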
evans <- function(pts,ply){
  prid <- over(pts,ply)
  ptid <- na.omit(prid) 
  pt.poly <- pts[as.numeric(as.character(row.names(ptid))),]
  return(pt.poly)
}
rowlings <- function(pts,ply){
  return(pts[!is.na(over(pts,as(ply,"SpatialPolygons"))),])
}
rowlings2 <- function(pts,ply){
  class(ply) <- "SpatialPolygons"
  return(pts[!is.na(over(pts,ply)),])
}
obrien <- function(pts,ply){
  # use the ply argument, not the global columbus object
  pts[apply(gIntersects(ply,pts,byid=TRUE),1,sum)==1,]
}
Now for a real-world example, I've scattered some random points over the columbus data set:
require(spdep)
require(rgeos)  # provides gIntersects, used by obrien()
example(columbus)
pts=data.frame(
    x=runif(100,5,12),
    y=runif(100,10,15),
    z=sample(letters,100,TRUE))
coordinates(pts)=~x+y
Looks good:
plot(columbus)
points(pts)
Check the functions are doing the same thing:
> identical(evans(pts,columbus),rowlings(pts,columbus))
[1] TRUE
And run each 500 times for benchmarking:
> system.time({for(i in 1:500){evans(pts,columbus)}})
   user  system elapsed 
  7.661   0.600   8.474 
> system.time({for(i in 1:500){rowlings(pts,columbus)}})
   user  system elapsed 
  6.528   0.284   6.933 
> system.time({for(i in 1:500){rowlings2(pts,columbus)}})
   user  system elapsed 
  5.952   0.600   7.222 
> system.time({for(i in 1:500){obrien(pts,columbus)}})
   user  system elapsed 
  4.752   0.004   4.781 
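As my intuition suggested, it's not a big overhead. In fact it may be less overhead than converting all the row indexes to character and back, or running na.omit to find the missing values. Which incidentally leads to another failure mode of the evans function...
If a row of the polygons data frame is all NA (which is perfectly valid), then the overlay with a SpatialPolygonsDataFrame produces an all-NA row in the output data frame for every point in that polygon, and evans() then drops those points: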
> columbus@data[1,]=rep(NA,20)
> columbus@data[5,]=rep(NA,20)
> columbus@data[17,]=rep(NA,20)
> columbus@data[15,]=rep(NA,20)
> set.seed(123)
> pts=data.frame(x=runif(100,5,12),y=runif(100,10,15),z=sample(letters,100,TRUE))
> coordinates(pts)=~x+y
> identical(evans(pts,columbus),rowlings(pts,columbus))
[1] FALSE
> dim(evans(pts,columbus))
[1] 27  1
> dim(rowlings(pts,columbus))
[1] 28  1
> 
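The same failure mode can be reproduced on a tiny made-up case. This is only a sketch: the two squares, the attribute column v and the names spdf, toy_pts, attrs and idx are hypothetical, chosen so that one polygon has an all-NA attribute row.

```r
library(sp)

# Two adjacent unit squares; the first has an all-NA attribute row
sq1 <- Polygons(list(Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))), ID = "a")
sq2 <- Polygons(list(Polygon(cbind(c(1, 2, 2, 1, 1), c(0, 0, 1, 1, 0)))), ID = "b")
spdf <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(sq1, sq2)),
  data.frame(v = c(NA, 10), row.names = c("a", "b"))
)

# One point in each square, plus one outside both
toy_pts <- SpatialPoints(cbind(x = c(0.5, 1.5, 5), y = c(0.5, 0.5, 5)))

attrs <- over(toy_pts, spdf)                       # data frame of polygon attributes
idx   <- over(toy_pts, as(spdf, "SpatialPolygons")) # vector of polygon indexes

# NA in attrs cannot tell "outside" from "inside a polygon with NA data";
# NA in idx only ever means "outside"
is.na(attrs$v)
is.na(idx)
```

Filtering on na.omit(attrs) would keep one point here, while filtering on !is.na(idx) keeps the two that really are inside a polygon.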
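BUT gIntersects is faster, even though it has to sweep the matrix to check for intersections in R rather than in C code. I suspect it's down to the prepared geometry skills of GEOS, which build spatial indexes - yes, with prepared=FALSE it takes a bit longer, about 5.5 seconds.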
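To make the matrix sweep concrete, here is a small base-R sketch. The matrix hits is hypothetical, standing in for the points-by-polygons logical matrix that gIntersects(..., byid=TRUE) returns inside obrien(). The rowSums variant is my assumption about what you might want if polygons can overlap; it is not part of the benchmark above.

```r
# Hypothetical logical matrix: one row per point, one column per polygon
hits <- matrix(c(TRUE,  FALSE,   # point 1: in polygon 1 only
                 FALSE, FALSE,   # point 2: in no polygon
                 TRUE,  TRUE),   # point 3: in both (overlapping) polygons
               ncol = 2, byrow = TRUE)

# The test used in obrien(): keeps points that hit exactly one polygon
exactly_one <- apply(hits, 1, sum) == 1

# Alternative: keeps points that hit at least one polygon
any_polygon <- rowSums(hits) > 0
```

For non-overlapping polygons such as columbus the two tests should select the same points; they differ only where a point falls in more than one polygon.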
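I'm surprised there isn't a function that directly returns either the indexes or the points. When I wrote splancs 20 years ago, the point-in-polygon functions had both...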