确定点是否位于重叠多边形内的简单方法。

huangapple go评论73阅读模式
英文:

simple approach to determine if points intersect overlapping polygons

问题

以下是您提供的内容的翻译:

我有一个包含纬度和经度坐标的简单数据框(我提供了一个子集)。我想确定这些点是否落在一个或两个多边形内(第二个多边形完全覆盖第一个)。

点的子样本

pts <- structure(list(Longitude = c(-26.775, -27.3167, -30.2717, -30.7267, -37.8683, -38.2733,
-44.3033, -44.8233, -57.42, -57.8717, -52.235, -52.54,
-54.2317, -54.6883, -63.6033, -65.415, -66.31, -66.7533),
Latitude = c(62.5183, 62.3, 60.8367, 60.59, 56.6383, 56.3917, 52.8, 52.45,
45.56, 45.46, 47.3917, 47.1317, 46.26, 46.16, 44.29, 43.2633,
43.0983, 43.0267)),
row.names = c(NA, -18L), class = c("tbl_df", "tbl", "data.frame"))

将点表示为空间对象

sp <- SpatialPoints(pts, proj4string = CRS("+proj=longlat +datum=WGS84"))

多边形 A 和 B

pA <- st_polygon(list(as.matrix(data.frame(lon = c(-40, -40, -64, -64, -54, -54, -40),
lat = c(45, 61, 61, 51, 51, 45, 45))))) %>%
st_sfc(crs = st_crs("+proj=longlat +datum=WGS84"))
pB <- st_polygon(list(as.matrix(data.frame(lon = c(-6, -6, -50, -50, -64, -64, -54, -54, -6),
lat = c(45, 66, 66, 70, 70, 51, 51, 45, 45))))) %>%
st_sfc(crs = st_crs("+proj=longlat +datum=WGS84"))

我知道可以使用 st_intersects 来确定点是否在单个多边形内重叠,但我不知道如何扩展到多个重叠的多边形。理想情况下,最终结果将是一个变量,如果点在 A 中,则返回 A,如果点属于不与 A 重叠的较大的 B 多边形,则返回 B,如果点既不在 A 也不在 B 中,则返回 NA。

我发现使用 tidyverse(而不是 datatable)的代码更容易理解,所以我希望有一个使用这些函数的解决方案。

英文:

I have a simple dataframe with latitude and longitude coordinates (I've provided a subset below). I'd like to determine if any of the points fall within one or both polygons (the second polygon completely overlaps the first).

# subsample of points
pts &lt;- structure(list(Longitude = c(-26.775, -27.3167, -30.2717, -30.7267, -37.8683, -38.2733, 
                                    -44.3033, -44.8233, -57.42, -57.8717, -52.235, -52.54, 
                                    -54.2317, -54.6883, -63.6033, -65.415, -66.31, -66.7533), 
                      Latitude = c(62.5183, 62.3, 60.8367, 60.59, 56.6383, 56.3917, 52.8, 52.45, 
                                   45.56, 45.46, 47.3917, 47.1317, 46.26, 46.16, 44.29, 43.2633, 
                                   43.0983, 43.0267)), 
                 row.names = c(NA, -18L), class = c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;))

# points as a spatial object
sp &lt;- SpatialPoints(pts, proj4string = CRS(&quot;+proj=longlat +datum=WGS84&quot;))

# polygon A &amp; B
pA &lt;- st_polygon(list(as.matrix(data.frame(lon = c(-40, -40, -64, -64, -54, -54, -40),
                                              lat = c(45, 61, 61, 51, 51, 45, 45))))) %&gt;%
  st_sfc(crs = st_crs(&quot;+proj=longlat +datum=WGS84&quot;))
pB&lt;- st_polygon(list(as.matrix(data.frame(lon = c(-6, -6, -50, -50, -64, -64, -54, -54, -6),
                                                 lat = c(45, 66, 66, 70, 70, 51, 51, 45, 45))))) %&gt;%
  st_sfc(crs = st_crs(&quot;+proj=longlat +datum=WGS84&quot;))

I know I can use st_intersects to determine if points overlap in a single polygon, but I don't know how to expand this if when you have multiple overlapping polygons. Ideally, the end result would be a variable that returns A if the point is just in A, B if the point is part of the larger B polygon that does not overlap A, and NA if the point is not in either polygon.

I find it much easier to follow code that uses tidyverse (rather than datatable), so I'd prefer a solution with those functions.

答案1

得分: 2

If you have collected your polygons into sf object and find point intersects with that, st_intersects() will return a "sparse geometry binary predicate list",它包含每个点的交叉几何索引的列表:

library(sf)
library(sp)
library(dplyr)
library(ggplot2)

### 移动到 sf 对象,假设没有部分重叠,按大小排列多边形
ab_sf <- st_sf(geometry = c(pA, pB), row.names = c("pA", "pB")) %>%
  tibble::rownames_to_column("poly") %>%
  mutate(size_rank = rank(desc(st_area(geometry)))) %>%
  arrange(size_rank)
ab_sf
#> Simple feature collection with 2 features and 2 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -64 ymin: 45 xmax: -6 ymax: 70
#> Geodetic CRS:  +proj=longlat +datum=WGS84
#>   poly size_rank                       geometry
#> 1   pB         1 POLYGON ((-6 45, -6 66, -50...
#> 2   pA         2 POLYGON ((-40 45, -40 61, -...

points_sf <- st_as_sf(sp_)

### 查找交叉点,返回的列表将包括交叉索引的有序列表
### 例如仅为 pA 的 (1);为 pA 和 pB 的 (1,2);(空) 表示无交叉,
### 从每个中提取最后一项,以获取最小交叉多边形的索引
isect_last_idx <- st_intersects(points_sf, ab_sf) %>% sapply(last)
isect_last_idx
#>  [1]  1  1  1  1  1  1  2  2 NA NA  2  2 NA NA NA NA NA NA

## 将索引映射到多边形名称
points_sf$poly_name <- ab_sf[["poly"]][isect_last_idx]
points_sf
#> Simple feature collection with 18 features and 1 field
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -66.7533 ymin: 43.0267 xmax: -26.775 ymax: 62.5183
#> Geodetic CRS:  +proj=longlat +datum=WGS84 +no_defs
#> First 10 features:
#>                    geometry poly_name
#> 1   POINT (-26.775 62.5183)        pB
#> 2     POINT (-27.3167 62.3)        pB
#> 3  POINT (-30.2717 60.8367)        pB
#> 4    POINT (-30.7267 60.59)        pB
#> 5  POINT (-37.8683 56.6383)        pB
#> 6  POINT (-38.2733 56.3917)        pB
#> 7     POINT (-44.3033 52.8)        pA
#> 8    POINT (-44.8233 52.45)        pA
#> 9      POINT (-57.42 45.56)      <NA>
#> 10   POINT (-57.8717 45.46)      <NA>

ab_sf %>%
  arrange(size_rank) %>%
  ggplot() +
  geom_sf(aes(fill = poly, linetype = poly, linewidth = poly), alpha = .8) +
  scale_linewidth_manual(values = c(0,2)) +
  geom_sf(data = points_sf,  aes(shape = poly_name), size =  5) +
  scale_shape_manual(na.value = 1, values = c(9,13)) +
  theme_minimal()

确定点是否位于重叠多边形内的简单方法。

使用的数据:


pts <- structure(list(Longitude = c(-26.775, -27.3167, -30.2717, -30.7267, -37.8683, -38.2733, 
-44.3033, -44.8233, -57.42, -57.8717, -52.235, -52.54, 
-54.2317, -54.6883, -63.6033, -65.415, -66.31, -66.7533), 
Latitude = c(62.5183, 62.3, 60.8367, 60.59, 56.6383, 56.3917, 52.8, 52.45, 
45.56, 45.46, 47.3917, 47.1317, 46.26, 46.16, 44.29, 43.2633, 
43.0983, 43.0267)), 
row.names = c(NA, -18L), class = c("tbl_df", "tbl", "data.frame"))
# points as a spatial object
sp_ <- SpatialPoints(pts, proj4string = CRS("+proj=longlat +datum=WGS84"))
# polygon A & B
pA <- st_polygon(list(as.matrix(data.frame(lon = c(-40, -40, -64, -64, -54, -54, -40),
lat = c(45, 61, 61, 51, 51, 45, 45))))) %>%
st_sfc(crs = st_crs("+proj=longlat +datum=WGS84"))
pB<- st_polygon(list(as.matrix(data.frame(lon = c(-6, -6, -50, -50, -64, -64, -54, -54, -6),
lat = c(45, 66, 66, 70, 70, 51, 51, 45, 45))))) %>%
st_sfc(crs = st_crs("+proj=longlat +datum=WGS84"))

创建于 2023-05-25,使用 reprex v2.0.2

英文:

If you have collected your polygons into sf object and find point intersects with that, st_intersects() will return a "sparse geometry binary predicate list", it includes a list of intersecting geometry indices for each point:

Sparse geometry binary predicate list of length 18, where the predicate was `intersects&#39;
first 10 elements:
1: 1
2: 1
3: 1
4: 1
5: 1
6: 1
7: 1, 2
8: 1, 2
9: (empty)
10: (empty)

If your polygons are ordered correctly in sf, last item for every point will be the polygon with highest row number. Well, if you just want know if point intersects with 1, 2, (,n) or no polygons, all you really need is to get the lengths() of that list.

library(sf)
library(sp)
library(dplyr)
library(ggplot2)

### move to sf objects, assume that there are no partial overlaps,
### arrange polygons by their size
ab_sf &lt;- st_sf(geometry = c(pA, pB), row.names = c(&quot;pA&quot;, &quot;pB&quot;)) %&gt;% 
  tibble::rownames_to_column(&quot;poly&quot;) %&gt;% 
  mutate(size_rank = rank(desc(st_area(geometry)))) %&gt;%  
  arrange(size_rank)
ab_sf
#&gt; Simple feature collection with 2 features and 2 fields
#&gt; Geometry type: POLYGON
#&gt; Dimension:     XY
#&gt; Bounding box:  xmin: -64 ymin: 45 xmax: -6 ymax: 70
#&gt; Geodetic CRS:  +proj=longlat +datum=WGS84
#&gt;   poly size_rank                       geometry
#&gt; 1   pB         1 POLYGON ((-6 45, -6 66, -50...
#&gt; 2   pA         2 POLYGON ((-40 45, -40 61, -...

points_sf &lt;- st_as_sf(sp_)

### find intersects, returned list will include ordered list of intersecting indices
### i.e (1) for only pA; (1,2) for pA and pB; (empty) for no intersect,
### extract last item from each to get index of smallest intersecting polygon
isect_last_idx &lt;- st_intersects(points_sf, ab_sf) %&gt;% sapply(last)
isect_last_idx
#&gt;  [1]  1  1  1  1  1  1  2  2 NA NA  2  2 NA NA NA NA NA NA

## map indeces to polygon names
points_sf$poly_name &lt;- ab_sf[[&quot;poly&quot;]][isect_last_idx]
points_sf
#&gt; Simple feature collection with 18 features and 1 field
#&gt; Geometry type: POINT
#&gt; Dimension:     XY
#&gt; Bounding box:  xmin: -66.7533 ymin: 43.0267 xmax: -26.775 ymax: 62.5183
#&gt; Geodetic CRS:  +proj=longlat +datum=WGS84 +no_defs
#&gt; First 10 features:
#&gt;                    geometry poly_name
#&gt; 1   POINT (-26.775 62.5183)        pB
#&gt; 2     POINT (-27.3167 62.3)        pB
#&gt; 3  POINT (-30.2717 60.8367)        pB
#&gt; 4    POINT (-30.7267 60.59)        pB
#&gt; 5  POINT (-37.8683 56.6383)        pB
#&gt; 6  POINT (-38.2733 56.3917)        pB
#&gt; 7     POINT (-44.3033 52.8)        pA
#&gt; 8    POINT (-44.8233 52.45)        pA
#&gt; 9      POINT (-57.42 45.56)      &lt;NA&gt;
#&gt; 10   POINT (-57.8717 45.46)      &lt;NA&gt;

ab_sf %&gt;% 
  arrange(size_rank) %&gt;% 
  ggplot() +
  geom_sf(aes(fill = poly, linetype = poly, linewidth = poly), alpha = .8) +
  scale_linewidth_manual(values = c(0,2)) +
  geom_sf(data = points_sf,  aes(shape = poly_name), size =  5) +
  scale_shape_manual(na.value = 1, values = c(9,13)) +
  theme_minimal()

确定点是否位于重叠多边形内的简单方法。<!-- -->

Used data:


pts &lt;- structure(list(Longitude = c(-26.775, -27.3167, -30.2717, -30.7267, -37.8683, -38.2733, 
-44.3033, -44.8233, -57.42, -57.8717, -52.235, -52.54, 
-54.2317, -54.6883, -63.6033, -65.415, -66.31, -66.7533), 
Latitude = c(62.5183, 62.3, 60.8367, 60.59, 56.6383, 56.3917, 52.8, 52.45, 
45.56, 45.46, 47.3917, 47.1317, 46.26, 46.16, 44.29, 43.2633, 
43.0983, 43.0267)), 
row.names = c(NA, -18L), class = c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;))
# points as a spatial object
sp_ &lt;- SpatialPoints(pts, proj4string = CRS(&quot;+proj=longlat +datum=WGS84&quot;))
# polygon A &amp; B
pA &lt;- st_polygon(list(as.matrix(data.frame(lon = c(-40, -40, -64, -64, -54, -54, -40),
lat = c(45, 61, 61, 51, 51, 45, 45))))) %&gt;%
st_sfc(crs = st_crs(&quot;+proj=longlat +datum=WGS84&quot;))
pB&lt;- st_polygon(list(as.matrix(data.frame(lon = c(-6, -6, -50, -50, -64, -64, -54, -54, -6),
lat = c(45, 66, 66, 70, 70, 51, 51, 45, 45))))) %&gt;%
st_sfc(crs = st_crs(&quot;+proj=longlat +datum=WGS84&quot;))

<sup>Created on 2023-05-25 with reprex v2.0.2</sup>

答案2

得分: 2

# 将两个多边形组合成一个 sf 对象
combined_poly <- st_sf(geometry = c(pA, pB), row.names = c("pA", "pB")) 
# 将点转换为 sf 对象
points_as_sf <- st_as_sf(sp)
st_intersects(points_as_sf, combined_poly, sparse = F)
#       [,1]  [,2]
#  [1,] FALSE  TRUE
#  [2,] FALSE  TRUE
#  [3,] FALSE  TRUE
#  [4,] FALSE  TRUE
#  [5,] FALSE  TRUE
#  [6,] FALSE  TRUE
#  [7,]  TRUE  TRUE
#  [8,]  TRUE  TRUE
#  [9,] FALSE FALSE
# [10,] FALSE FALSE
# [11,]  TRUE  TRUE
# [12,]  TRUE  TRUE
# [13,] FALSE FALSE
# [14,] FALSE FALSE
# [15,] FALSE FALSE
# [16,] FALSE FALSE
# [17,] FALSE FALSE
# [18,] FALSE FALSE
case_when(st_intersects(points_as_sf, combined_poly, sparse = F)[, 1] ~ "A",
st_intersects(points_as_sf, combined_poly, sparse = F)[, 2] ~ "B",
T ~ NA)
# [1] "B" "B" "B" "B" "B" "B" "A" "A" NA  NA  "A" "A" NA  NA  NA  NA  NA  NA 
英文:

In addition to the answer by @margusl building on the (default) sparse matrix form of sf::st_intersects() you can consider the alternative non-sparse output = a dense logical matrix.

While it is not as computationally efficient as sparse output I have found it easier to work with in many use cases.

Consider this piece of code; what it does is:

  • combines your polygons to a single sf object
  • casts your sp points to sf format
  • performs intersection of your two objects; the output is a logical matrix / points as rows, polygons as columns

The logical output produced by sf::st_intersects() is then passed to dplyr::case_when() which creates the variable - A if intersection with small polygon is true, B if intersection with big polygon is true and NA as a fallback default in all other cases.

This output can be easily passed to a vector variable or a new column in the points data frame.

# two polygons to one sf object
combined_poly &lt;- st_sf(geometry = c(pA, pB), row.names = c(&quot;pA&quot;, &quot;pB&quot;)) 
# points as sf object
points_as_sf &lt;- st_as_sf(sp)
st_intersects(points_as_sf, combined_poly, sparse = F)
#       [,1]  [,2]
#  [1,] FALSE  TRUE
#  [2,] FALSE  TRUE
#  [3,] FALSE  TRUE
#  [4,] FALSE  TRUE
#  [5,] FALSE  TRUE
#  [6,] FALSE  TRUE
#  [7,]  TRUE  TRUE
#  [8,]  TRUE  TRUE
#  [9,] FALSE FALSE
# [10,] FALSE FALSE
# [11,]  TRUE  TRUE
# [12,]  TRUE  TRUE
# [13,] FALSE FALSE
# [14,] FALSE FALSE
# [15,] FALSE FALSE
# [16,] FALSE FALSE
# [17,] FALSE FALSE
# [18,] FALSE FALSE
case_when(st_intersects(points_as_sf, combined_poly, sparse = F)[, 1] ~ &quot;A&quot;,
st_intersects(points_as_sf, combined_poly, sparse = F)[, 2] ~ &quot;B&quot;,
T ~ NA)
# [1] &quot;B&quot; &quot;B&quot; &quot;B&quot; &quot;B&quot; &quot;B&quot; &quot;B&quot; &quot;A&quot; &quot;A&quot; NA  NA  &quot;A&quot; &quot;A&quot; NA  NA  NA  NA  NA  NA 

huangapple
  • 本文由 发表于 2023年5月26日 01:12:32
  • 转载请务必保留本文链接:https://go.coder-hub.com/76334776.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定