英文:
How to Use "tbl_graphs" in R?
问题
我正在使用R编程语言工作。
基于一个shapefile,我试图可视化/找出两个坐标之间的驾驶距离(例如,CN塔和多伦多皮尔逊机场)。
首先我加载了shapefile:
library(sf)
library(rgdal)
library(sfnetworks)
library(igraph)
library(dplyr)
library(tidygraph)
# 设置shapefile的URL
url <- "https://www12.statcan.gc.ca/census-recensement/2011/geo/RNF-FRR/files-fichiers/lrnf000r22a_e.zip"
# 创建一个临时文件夹来下载和解压缩shapefile
temp_dir <- tempdir()
temp_file <- file.path(temp_dir, "lrnf000r22a_e.zip")
# 下载shapefile到临时文件夹
download.file(url, temp_file)
# 从下载的zip文件中解压shapefile
unzip(temp_file, exdir = temp_dir)
# 使用rgdal包读取shapefile
# 来源:https://gis.stackexchange.com/questions/456748/strategies-for-working-with-large-shapefiles/456798#456798
a = st_read(file.path(temp_dir, "lrnf000r22a_e.shp"), query="select * from lrnf000r22a_e where PRUID_R ='35'")
shapefile看起来像这样:
包含570,706个要素和21个字段的简单要素集
几何类型:LINESTRING
维度:XY
边界框:xmin:5963148 ymin:665490.8 xmax:7581671 ymax:2212179
投影的CRS:NAD83 / 加拿大统计局兰伯特
前10个要素:
OBJECTID NGD_UID NAME TYPE DIR AFL_VAL ATL_VAL AFR_VAL ATR_VAL CSDUID_L CSDNAME_L CSDTYPE_L CSDUID_R CSDNAME_R CSDTYPE_R PRUID_L PRNAME_L PRUID_R PRNAME_R RANK
1 4 3434819 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3526003 Fort Erie T 3526003 Fort Erie T 35 Ontario 35 Ontario 5
2 5 1358252 South LINE <NA> <NA> <NA> <NA> <NA> 3551027 Gordon/Barrie Island MU 3551027 Gordon/Barrie Island MU 35 Ontario 35 Ontario 5
3 8 1778927 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3512054 Wollaston TP 3512054 Wollaston TP 35 Ontario 35 Ontario 5
4 11 731932 Albion RD <NA> <NA> <NA> 2010 2010 3520005 Toronto C 3520005 Toronto C 35 Ontario 35 Ontario 3
5 18 3123617 County 41 RD <NA> 640 708 635 709 3511015 Greater Napanee T 3511015 Greater Napanee T 35 Ontario 35 Ontario 3
6 20 4814160 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3553005 Greater Sudbury / Grand Sudbury CV 3553005 Greater Sudbury / Grand Sudbury CV 35 Ontario 35 Ontario 5
7 21 1817031 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3537028 Amherstburg T 3537028 Amherstburg T 35 Ontario 35 Ontario 5
8 24 4825761 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3554094 Timiskaming, Unorganized, West Part NO 3554094 Timiskaming, Unorganized, West Part NO 35 Ontario 35 Ontario 5
9 25 544891 Dunelm DR <NA> 1 9 2 10 3526053 St. Catharines CY 3526053 St. Catharines CY 35 Ontario 35 Ontario 5
10 28 1835384 Seven Oaks DR <NA> 730 974 731 975 3515005 Otonabee-South Monaghan TP 3515005 Otonabee-South Monaghan TP 35 Ontario 35 Ontario 5
CLASS _ogr_geometry_
1 23 LINESTRING (7269806 859183,...
2 23 LINESTRING (6921247 1133452...
3 23 LINESTRING (7320857 1089403..
然后我尝试通过将道路网络视为图来计算距离:
# 将shapefile转换为sfnetwork对象
net <- as_sfnetwork(a)
# 在WGS84(EPSG:4326)中定义两个兴趣点
pt1 <- st_point(c(-79.61203, 43.68312))
pt2 <- st_point(c(-79.61203, 43.64256))
# 将点的CRS设置为WGS84(EPSG:4326)
pt1 <- st_sfc(pt1, crs = 4326)
pt2 <- st_sfc(pt2, crs = 4326)
# 将点转换为网络的CRS
pt1_transformed <- st_transform(pt1, st_crs(net))
pt2_transformed <- st_transform(pt2, st_crs(net))
# 找到变换后的兴趣点在网络上的最近点
n1 <- st_nearest_feature(pt1_transformed, net)
n2 <- st_nearest_feature(pt2_transformed, net)
从这里,我创建了一个“tbl_graph”:
path <- net %>%
activate(edges) %>%
mutate(weight = edge_length()) %>%
as_tbl_graph()
> path
<details>
<summary>英文:</summary>
I am working with the R programming language.
**Based on a shapefile, I am trying to visualize/find out the driving distance between two coordinates (e.g. CN Tower and Toronto Pearson Airport).**
First I loaded the shapefile:
library(sf)
library(rgdal)
library(sfnetworks)
library(igraph)
library(dplyr)
library(tidygraph)
# Set the URL for the shapefile
url <- "https://www12.statcan.gc.ca/census-recensement/2011/geo/RNF-FRR/files-fichiers/lrnf000r22a_e.zip"
# Create a temporary folder to download and extract the shapefile
temp_dir <- tempdir()
temp_file <- file.path(temp_dir, "lrnf000r22a_e.zip")
# Download the shapefile to the temporary folder
download.file(url, temp_file)
# Extract the shapefile from the downloaded zip file
unzip(temp_file, exdir = temp_dir)
# Read the shapefile using the rgdal package
# source: https://gis.stackexchange.com/questions/456748/strategies-for-working-with-large-shapefiles/456798#456798
a = st_read(file.path(temp_dir, "lrnf000r22a_e.shp"), query="select * from lrnf000r22a_e where PRUID_R ='35'")
The shapefile looks something like this:
Simple feature collection with 570706 features and 21 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 5963148 ymin: 665490.8 xmax: 7581671 ymax: 2212179
Projected CRS: NAD83 / Statistics Canada Lambert
First 10 features:
OBJECTID NGD_UID NAME TYPE DIR AFL_VAL ATL_VAL AFR_VAL ATR_VAL CSDUID_L CSDNAME_L CSDTYPE_L CSDUID_R CSDNAME_R CSDTYPE_R PRUID_L PRNAME_L PRUID_R PRNAME_R RANK
1 4 3434819 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3526003 Fort Erie T 3526003 Fort Erie T 35 Ontario 35 Ontario 5
2 5 1358252 South LINE <NA> <NA> <NA> <NA> <NA> 3551027 Gordon/Barrie Island MU 3551027 Gordon/Barrie Island MU 35 Ontario 35 Ontario 5
3 8 1778927 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3512054 Wollaston TP 3512054 Wollaston TP 35 Ontario 35 Ontario 5
4 11 731932 Albion RD <NA> <NA> <NA> 2010 2010 3520005 Toronto C 3520005 Toronto C 35 Ontario 35 Ontario 3
5 18 3123617 County 41 RD <NA> 640 708 635 709 3511015 Greater Napanee T 3511015 Greater Napanee T 35 Ontario 35 Ontario 3
6 20 4814160 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3553005 Greater Sudbury / Grand Sudbury CV 3553005 Greater Sudbury / Grand Sudbury CV 35 Ontario 35 Ontario 5
7 21 1817031 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3537028 Amherstburg T 3537028 Amherstburg T 35 Ontario 35 Ontario 5
8 24 4825761 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 3554094 Timiskaming, Unorganized, West Part NO 3554094 Timiskaming, Unorganized, West Part NO 35 Ontario 35 Ontario 5
9 25 544891 Dunelm DR <NA> 1 9 2 10 3526053 St. Catharines CY 3526053 St. Catharines CY 35 Ontario 35 Ontario 5
10 28 1835384 Seven Oaks DR <NA> 730 974 731 975 3515005 Otonabee-South Monaghan TP 3515005 Otonabee-South Monaghan TP 35 Ontario 35 Ontario 5
CLASS _ogr_geometry_
1 23 LINESTRING (7269806 859183,...
2 23 LINESTRING (6921247 1133452...
3 23 LINESTRING (7320857 1089403..
Then I tried to calculate the distance by treating the road network as a graph:
# Convert the shapefile to an sfnetwork object
net <- as_sfnetwork(a)
# Define the two points of interest in WGS84 (EPSG:4326)
pt1 <- st_point(c(-79.61203, 43.68312))
pt2 <- st_point(c(-79.61203, 43.64256))
# Set the CRS of the points to WGS84 (EPSG:4326)
pt1 <- st_sfc(pt1, crs = 4326)
pt2 <- st_sfc(pt2, crs = 4326)
# Transform the points to the CRS of the network
pt1_transformed <- st_transform(pt1, st_crs(net))
pt2_transformed <- st_transform(pt2, st_crs(net))
# Find the nearest points on the network to the transformed points of interest
n1 <- st_nearest_feature(pt1_transformed, net)
n2 <- st_nearest_feature(pt2_transformed, net)
From here, I create a "tbl_graph":
path <- net %>%
activate(edges) %>%
mutate(weight = edge_length()) %>%
as_tbl_graph()
> path
# A tbl_graph: 430824 nodes and 570706 edges
#
# A directed multigraph with 442 components
#
# Edge Data: 570,706 x 25 (active)
from to OBJECTID NGD_UID NAME TYPE DIR AFL_VAL ATL_VAL AFR_VAL ATR_VAL CSDUID_L CSDNAME_L CSDTYPE~ CSDUID~ CSDNAME_R CSDTYP~ PRUID_L PRNAME~ PRUID_R PRNAME~ RANK CLASS `_ogr_geometry_` weight
<int> <int> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <LINESTRING [m]> [m]
1 1 2 4 3434819 NA NA NA NA NA NA NA 3526003 Fort Erie T 3526003 Fort Erie T 35 Ontario 35 Ontario 5 23 (7269806 859183, 7269795 859166.~ 25.9
2 3 4 5 1358252 South LINE NA NA NA NA NA 3551027 Gordon/Bar~ MU 3551027 Gordon/Ba~ MU 35 Ontario 35 Ontario 5 23 (6921247 1133452, 6921258 113345~ 1245.
3 5 6 8 1778927 NA NA NA NA NA NA NA 3512054 Wollaston TP 3512054 Wollaston TP 35 Ontario 35 Ontario 5 23 (7320857 1089403, 7320903 108942~ 254.
4 7 8 11 731932 Albion RD NA NA NA 2010 2010 3520005 Toronto C 3520005 Toronto C 35 Ontario 35 Ontario 3 21 (7202555 935281.3, 7202653 93533~ 111.
5 9 10 18 3123617 County 41 RD NA 640 708 635 709 3511015 Greater Na~ T 3511015 Greater N~ T 35 Ontario 35 Ontario 3 21 (7403627 1039328, 7403431 103963~ 367.
6 11 12 20 4814160 NA NA NA NA NA NA NA 3553005 Greater Su~ CV 3553005 Greater S~ CV 35 Ontario 35 Ontario 5 21 (7042806 1222708, 7042838 122273~ 191.
# ... with 570,700 more rows
#
# Node Data: 430,824 x 1
`_ogr_geometry_`
<POINT [m]>
1 (7269806 859183)
2 (7269790 859162.7)
3 (6921247 1133452)
# ... with 430,821 more rows
**My Question:** From here, I am not sure how to use the tbl_graph to find the distance between the two points n1 and n2:
> n1
[1] 110393
> n2
[1] 319271
I think n1 and n2 correspond to the "to" and "from" columns in the "path" tbl_graph - but I am not sure how to use the tbl_graph to find out the distance between n1 and n2.
Can someone please show me how to do this?
Thanks!
</details>
# 答案1
**得分**: 4
`n1` 和 `n2` 是一个名为 `net` 的 `tibble` 的行号索引。这些值可以用于子集化图列表,或者作为 `igraph::shortest_paths()` 函数的 `from` 和 `to` 参数。在这种情况下,如果你使用 `n1` 和 `n2` 来查找最短路径,你可能会看到一些相当奇怪的结果,或者可能只是一个空列表。
此外,`sfnetworks` 对象 `net` 继承自 `tbl_graph` 和 `igraph`,因此 `tiynetworks` 和 `igraph` 的方法应该在不需要转换的情况下正常工作。将 `net` 转换为 `path`,并添加权重,不会更接近所需的路径。它仍然是几乎与之前相同的 `tbl_graph`,只是没有了空间图层。`sfnetworks` 还实现了满足你的主要目标所需的所有内容:`st_network_paths()` 和/或 `to_spatial_shortest_paths()` 转换器。
`sfnetwork` 也被配置为有向(这是默认设置),这很可能会破坏所有的路由尝试,至少在这种情况下。
`st_network_paths()` 方法示例:
``` r
library(sf)
library(sfnetworks)
library(mapview)
library(dplyr)
library(ggplot2)
# ...(代码的其余部分,已经在上面提供了)
igraph
方法示例:
library(igraph)
# ...(代码的其余部分,已经在上面提供了)
tidygraph
和 sfnetworks
的 morphers 示例:
library(tidygraph)
# ...(代码的其余部分,已经在上面提供了)
请注意,以上示例代码中包含了其他内容,只提供了与翻译请求相关的部分。
英文:
n1
& n2
are indeces, row numbers of one of the net
tibbles. Those values can be used to subset graph lists or used as from
and to
parameters for igraph::shortest_paths()
. In this case, n1
& n2
probably targeted edges tibble as nodes were not activated first. Means that if you used those n1
& n2
for finding shortest path, you probably witnessed some rather funky results. Or perhaps just an empty list.
Also, the sfnetworks
object, net
, inherits from tbl_graph
and also igraph
, so tiynetworks
and igraph
methods should work just fine on net
without converting it first. net
-> path
conversion, while adding weights, leads no closer to desired route. It's still (almost) the same tbl_graph
it was before, now without a spatial layer. sfnetworks
also implements all what's needed for your main objective: st_network_paths()
and/or to_spatial_shortest_paths()
morpher.
sfnetwork
was also configured to be directed (well, it's a default setting), this most likely will ruin all routing attempts, at least in this case.
sfnetworks::st_network_paths()
approach
library(sf)
library(sfnetworks)
library(mapview)
library(dplyr)
library(ggplot2)
shp_url <- "https://www12.statcan.gc.ca/census-recensement/2011/geo/RNF-FRR/files-fichiers/lrnf000r22a_e.zip"
if (!file.exists(basename(shp_url))) curl::curl_download(shp_url, basename(shp_url), quiet = FALSE)
# keep archive zipped;
# predefined bbox, small patch in Toronto,
# covers cover both points and leaves plenty of routing options
a <- read_sf(paste0("/vsizip/", basename(shp_url)),
wkt_filter = "POLYGON ((7200615 918857, 7211499 918857, 7211499 933387, 7200615 933387, 7200615 918857))")
a
#> Simple feature collection with 7584 features and 21 fields
#> Geometry type: LINESTRING
#> Dimension: XY
#> Bounding box: xmin: 7199959 ymin: 918285.7 xmax: 7211930 ymax: 934030.6
#> Projected CRS: NAD83 / Statistics Canada Lambert
#> # A tibble: 7,584 × 22
#> OBJECTID NGD_UID NAME TYPE DIR AFL_VAL ATL_VAL AFR_VAL ATR_VAL CSDUID_L
#> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 668 4668376 Dixie RD <NA> 2520 2520 2531 2543 3521005
#> 2 874 4617171 401 HWY <NA> <NA> <NA> <NA> <NA> 3521005
#> 3 2252 4092570 Breton AVE <NA> 178 300 175 295 3521005
#> 4 3320 485941 Kipling AVE <NA> <NA> <NA> <NA> <NA> 3520005
#> 5 3698 4775771 Eringa… DR <NA> 99 105 108 120 3520005
#> 6 3700 507598 Bristol RD W <NA> <NA> <NA> <NA> 3521005
#> 7 4472 4683515 Airport RD <NA> <NA> <NA> <NA> <NA> 3521005
#> 8 4759 5779033 401 HWY <NA> <NA> <NA> <NA> <NA> 3520005
#> 9 5051 505570 Grand … RD <NA> 3496 3530 3511 3525 3521005
#> 10 5533 5876974 Yorkda… CRES <NA> 31 67 30 58 3520005
#> # ℹ 7,574 more rows
#> # ℹ 12 more variables: CSDNAME_L <chr>, CSDTYPE_L <chr>, CSDUID_R <chr>,
#> # CSDNAME_R <chr>, CSDTYPE_R <chr>, PRUID_L <chr>, PRNAME_L <chr>,
#> # PRUID_R <chr>, PRNAME_R <chr>, RANK <chr>, CLASS <chr>,
#> # geometry <LINESTRING [m]>
# from / to points for routing:
pts <- tribble(~point, ~lon, ~lat,
"pt1", -79.61203, 43.68312,
"pt2", -79.61203, 43.64256) %>%
st_as_sf(coords =c("lon", "lat"), crs = st_crs(4326)) %>%
st_transform(st_crs(a))
# create un-directed network (default is directed),
# store edge lenghts as weights
net <- as_sfnetwork(a, directed = FALSE) %>%
activate("edges") %>%
mutate(weight = edge_length())
# get paths between pt1 & pt2 with sfnetwork
paths_sfn <- st_network_paths(net,
from = pts[pts$point == "pt1", ],
to = pts[pts$point == "pt2", ],
weights = "weight")
paths_sfn
#> # A tibble: 1 × 2
#> node_paths edge_paths
#> <list> <list>
#> 1 <int [54]> <int [53]>
# includes list of edge indeces that form the path:
paths_sfn$edge_paths[[1]]
#> [1] 1027 1032 1295 697 121 2704 2702 2778 1125 1280 4364 1360 235 1686 2048
#> [16] 6115 2266 865 2508 7511 7179 2956 1823 5139 1055 5929 3684 3417 5960 1196
#> [31] 3383 1603 1975 1159 6250 5814 58 1964 3593 4423 3477 4940 383 5784 2362
#> [46] 4120 7251 1970 4834 5155 624 1935 5134
# get edges with matching indeces:
route <- net %>%
activate("edges") %>%
slice(paths_sfn$edge_paths[[1]]) %>%
st_as_sf()
# distance:
sum(route$weight)
#> 7505.43 [m]
# visualize
mapview(list(pts, route))
<!-- -->
Path-finding and distance with igraph
, using the same net
object
library(igraph)
# to find node indeces, keep net nodes activated
net <- net %>% activate("nodes")
n1 <- st_nearest_feature(pts[1,], net)
n2 <- st_nearest_feature(pts[2,], net)
node1; node2
#> [1] 1560
#> [1] 3938
# locate those on a map:
net %>%
st_as_sf() %>%
slice(c(n1, n2)) %>%
mapview(layer.name = "net nodes") +
mapview(pts)
<!-- -->
# as sfnetwork inherits from tbl_graph and igraph:
class(net)
#> [1] "sfnetwork" "tbl_graph" "igraph"
# we can use igraph::shortest_paths() with our sfnetwork object:
paths_ig <- shortest_paths(net, n1, n2, output = "epath")
# paths_ig$epath is a list of edges:
str(paths_ig)
#> List of 4
#> $ vpath : NULL
#> $ epath :List of 1
#> ..$ : 'igraph.es' int [1:53] 1027 1032 1295 697 121 2704 2702 2778 1125 1280 ...
#> .. ..- attr(*, "env")=<weakref>
#> .. ..- attr(*, "graph")= chr "aac7eae8-1b4f-11ee-a8b6-53eec7eb8cd4"
#> $ predecessors : NULL
#> $ inbound_edges: NULL
# when converted to integer, we'll have the same index list as
# from sfnetwork::st_network_paths() :
e <- paths_ig$epath[[1]] %>% as.integer()
# check if those are indeed identical:
identical(e, paths_sfn$edge_paths[[1]])
#> [1] TRUE
# sum edge wights to get path distance:
sum(E(net)[e]$weight)
#> 7505.43 [m]
# we have not altered our original sf object (a) nor
# sfnetwork object (net), so indeces should still match and we can use
# e for subsetting: a[e,]
ggplot() +
geom_sf(data = a, color = "grey80") +
geom_sf(data = a[e,], linewidth = 1.5, alpha = .6) +
geom_sf(data = pts, aes(color = point), size = 3) +
coord_sf(datum = st_crs(a), ylim = c(922000, 930000)) +
theme_light()
<!-- -->
Morphers by tidygraph
and sfnetworks
There are also morphers, i.e. tidygraph::to_shortest_path
(would work with previous n1
and n2
objects for from
/ to
) and sfnetworks::to_spatial_shortest_paths
(takes sf
points for from
/ to
, just like st_network_paths()
) , so there's no need to worry about picking relevant nodes / edges from the network, convert()
output is already filtered:
library(tidygraph)
converted_net <- net %>% convert(to_spatial_shortest_paths, pts[1,], pts[2,])
route_edges <- st_as_sf(converted_net, "edges")
route_nodes <- st_as_sf(converted_net, "nodes")
# route length:
sum(route_edges$weight)
#> 7505.43 [m]
ggplot() +
geom_sf(data = a, color = "grey80") +
geom_sf(data = route_edges) +
geom_sf(data = route_nodes, alpha = .2) +
coord_sf(datum = st_crs(a), ylim = c(922000, 930000)) +
theme_light()
<!-- -->
<sup>Created on 2023-07-05 with reprex v2.0.2</sup>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论