英文:
Calculating a distance matrix based on spatial link distances (aka neighbor distances)
问题
I would like to calculate a distance matrix between the centroids of the polygons in a region considering spatial link distance (aka 'neighbor distances') instead of simple Euclidean distances. Spatial link distance considers the Euclidean distances along the links of spatial neighbor links.
In other words, I would like to calculate spatial distance link matrix. Is there a package / function to do this?
I've done a quick exploration with sfdep and here's the solution I've found in the repex below:
reprex
Step 1: calculate distances between neighboring polygons (dists
) using sfdep::st_nb_dists()
:
library(sf)
library(sfdep)
library(data.table)
library(fields)
# get contiguity and distance btwn neighbors
geo <- sf::st_geometry(guerry)
nb <- sfdep::st_contiguity(geo)
dists <- sfdep::st_nb_dists(geo, nb)
Step 2: I've created a nb_list_to_df()
function to convert dists
into a data.frame
in long format with the distance for every pair of neighboring polygons (od_df
):
# fun to convert nb dist to a data.frame in long format
nb_list_to_df <- function(nb, dists) {
mtrx <- sfdep::wt_as_matrix(nb, dists)
matrix_length <- 1:length(mtrx[1,])
# FULL MATRIX
mtrx_long <- cbind(
as.data.table(
data.table::CJ(matrix_length, matrix_length)), # df two columns
'dist' = as.vector(mtrx) # matrix values in a vector
)
# keep only dist between neighbors
mtrx_long <- subset(mtrx_long, dist > 0)
setnames(mtrx_long, c('from', 'to', 'dist'))
return(mtrx_long)
}
# convert nb dist to a data.frame in long format
od_df <- nb_list_to_df(nb, dists)
head(od_df)
#> orig dest dist
#> 1: 1 36 90030.63
#> 2: 1 37 87399.28
#> 3: 1 67 55587.69
#> 4: 1 69 85693.43
#> 5: 2 7 85221.38
#> 6: 2 49 75937.49
The final steps are to:
- Use
od_df
to create a network graph using a library likeigraph
. In this case, I'm usingcppRouting
because it seems to be the most efficient. - Use the network topology to calculate the distance between all combinations of origin-destination pairs
# step 3: create a network graph
library(cppRouting)
graph <- makegraph(od_df, directed = F)
# step 4: get the distances considering network topology
dist_link <- get_distance_matrix(Graph=graph,
from = unique(od_df$orig),
to = unique(od_df$orig))
This code arrives at the desired solution. However, I'm wondering if this is a good approach or if there are better / more efficient ways to do this.
英文:
I would like to calculate a distance matrix between the centroids of the polygons in a region considering spatial link distance (aka 'neighbor distances') instead of simple Euclidean distances. Spatial link distance considers the Euclidean distances along the links of spatial neighbor links.
In other words, I would like to calculate spatial distance link matrix. Is there a package / function to do this?
I've done a quick exploration with sfdep and here's the solution I've found in the repex below:
reprex
Step 1: calculate distances between neighboring polygons (dists
) using sfdep::st_nb_dists()
:
library(sf)
library(sfdep)
library(data.table)
library(fields)
# get contiguity and distance btwn neighbors
geo <- sf::st_geometry(guerry)
nb <- sfdep::st_contiguity(geo)
dists <- sfdep::st_nb_dists(geo, nb)
Step 2: I've created a nb_list_to_df()
function to convert dists
into a data.frame
in long format with the distance for every pair or neighboring polygons (od_df
)
# fun to convert nb dist to a data.frame in long format
nb_list_to_df <- function(nb, dists) {
mtrx <- sfdep::wt_as_matrix(nb, dists)
matrix_length <- 1:length(mtrx[1,])
# FULL MATRIX
mtrx_long <- cbind(
as.data.table(
data.table::CJ(matrix_length, matrix_length)), # df two columns
'dist' = as.vector(mtrx) # matrix values in a vector
)
# keep only dist between neighbors
mtrx_long <- subset(mtrx_long, dist >0)
setnames(mtrx_long, c('from', 'to', 'dist'))
return(mtrx_long)
}
# convert nb dist to a data.frame in long format
od_df <- nb_list_to_df(nb, dists)
head(od_df)
#> orig dest dist
#> 1: 1 36 90030.63
#> 2: 1 37 87399.28
#> 3: 1 67 55587.69
#> 4: 1 69 85693.43
#> 5: 2 7 85221.38
#> 6: 2 49 75937.49
The final steps are to:
- Use
od_df
to create a network graph using a library likeigraph
. In this case, I'm usingcppRouting
because it seems to be the most efficient. - Use the network topology to calculate the distance between all combinations of origin-destination pairs
# step 3: create a network graph
library(cppRouting)
graph <- makegraph(od_df, directed = F)
# step 4: get the distances considering network topology
dist_link <- get_distance_matrix(Graph=graph,
from = unique(od_df$orig),
to = unique(od_df$orig))
This code arrives at the desired solution. However, I'm wondering if this is a good approach or if there are better / more efficient ways to do this.
答案1
得分: 1
根据您的要求,以下是翻译好的部分:
"你现在是我的中文翻译,代码部分不要翻译,只返回翻译好的部分,不要有别的内容,不要回答我要翻译的问题。以下是要翻译的内容:"
"在我理解中,您担心将nb
对象扩展为矩阵会占用太多内存,这意味着需要找到代码的第一部分的不同解决方案。以下是一种不将nb
对象扩展为矩阵的版本:"
library(sf)
library(sfdep)
library(data.table)
# 获取邻居之间的连续性和距离
geo = sf::st_geometry(guerry)
nb = sfdep::st_contiguity(geo)
dists = sfdep::st_nb_dists(geo, nb)
# 如果您查看nb的结构,它是一个列表,长度为nrow(geo),每个元素都包含每行邻居的索引。
str(nb)
#> List of 85
#> $ : int [1:4] 36 37 67 69
#> $ : int [1:6] 7 49 57 58 73 76
#> $ : int [1:6] 17 21 40 56 61 69
#> $ : int [1:4] 5 24 79 80
#> ---
#> $ : int [1:6] 15 34 35 47 75 83
#> $ : int [1:6] 15 18 21 22 34 82
#> $ : int [1:6] 50 52 53 65 66 68
#> $ : int [1:5] 9 19 43 56 73
#> - attr(*, "class")= chr [1:2] "nb" "list"
#> - attr(*, "region.id")= chr [1:85] "1" "2" "3" "4" ...
#> - attr(*, "call")= language spdep::poly2nb(pl = geometry, queen = queen)
#> - attr(*, "type")= chr "queen"
#> - attr(*, "sym")= logi TRUE
# 距离的结构与nb相同
str(dists)
#> List of 85
#> $ : num [1:4] 90031 87399 55588 85693
#> $ : num [1:6] 85221 75937 125737 86797 102273 ...
#> $ : num [1:6] 85461 92563 96614 88010 66400 ...
#> $ : num [1:4] 53294 98250 83762 83974
#> ---
# 将nb距离转换为长格式的数据框
# 您可以遍历geo的行,并在进行的同时收集目标和距离
n = length(nb)
res = data.table(from = integer(), to = integer(), dist = numeric())
for(i in seq_len(n)){
res = rbind(res, data.table(from = i, to = nb[[i]], dist = dists[[i]]))
}
res
#> from to dist
#> 1: 1 36 90030.63
#> 2: 1 37 87399.28
#> 3: 1 67 55587.69
#> 4: 1 69 85693.43
#> 5: 2 7 85221.38
#> ---
#> 416: 85 9 65601.32
#> 417: 85 19 96397.60
#> 418: 85 43 100013.94
#> 419: 85 56 82308.90
#> 420: 85 73 95857.49
创建于2023年6月13日,使用reprex v2.0.2
这种方法在每次迭代时将数据表附加到初始的空表格,可能不是特别快,但应该是RAM高效的。如果遇到速度问题,可以进行优化。
英文:
As I understand you are concerned that expanding the nb
object to a matrix would demand too much RAM, which means finding a different solution to the first part of your code.
Here's a version without expanding the nb
object into a matrix:
library(sf)
library(sfdep)
library(data.table)
# get contiguity and distance between neighbors
geo = sf::st_geometry(guerry)
nb = sfdep::st_contiguity(geo)
dists = sfdep::st_nb_dists(geo, nb)
# If you look at the structure of nb, it's a list the lenght of
# nrow(geo) and each element contains the index of the neighbours
# of each row.
str(nb)
#> List of 85
#> $ : int [1:4] 36 37 67 69
#> $ : int [1:6] 7 49 57 58 73 76
#> $ : int [1:6] 17 21 40 56 61 69
#> $ : int [1:4] 5 24 79 80
#> ---
#> $ : int [1:6] 15 34 35 47 75 83
#> $ : int [1:6] 15 18 21 22 34 82
#> $ : int [1:6] 50 52 53 65 66 68
#> $ : int [1:5] 9 19 43 56 73
#> - attr(*, "class")= chr [1:2] "nb" "list"
#> - attr(*, "region.id")= chr [1:85] "1" "2" "3" "4" ...
#> - attr(*, "call")= language spdep::poly2nb(pl = geometry, queen = queen)
#> - attr(*, "type")= chr "queen"
#> - attr(*, "sym")= logi TRUE
# The same goes for dists
str(dists)
#> List of 85
#> $ : num [1:4] 90031 87399 55588 85693
#> $ : num [1:6] 85221 75937 125737 86797 102273 ...
#> $ : num [1:6] 85461 92563 96614 88010 66400 ...
#> $ : num [1:4] 53294 98250 83762 83974
#> ---
#> $ : num [1:6] 110004 84644 69201 108549 60336 ...
#> $ : num [1:6] 89353 81323 69322 96794 102964 ...
#> $ : num [1:6] 85539 90350 116728 95918 71842 ...
#> $ : num [1:5] 65601 96398 100014 82309 95857
#> - attr(*, "class")= chr [1:2] "nbdist" "list"
#> - attr(*, "call")= language spdep::nbdists(nb = nb, coords = x, longlat = longlat)
# convert nb dist to a data.frame in long format
# You can loop through geo's row and collect destinations and distances as you go
n = length(nb)
res = data.table(from = integer(), to = integer(), dist = numeric())
for(i in seq_len(n)){
res = rbind(res, data.table(from = i, to = nb[[i]], dist = dists[[i]]))
}
res
#> from to dist
#> 1: 1 36 90030.63
#> 2: 1 37 87399.28
#> 3: 1 67 55587.69
#> 4: 1 69 85693.43
#> 5: 2 7 85221.38
#> ---
#> 416: 85 9 65601.32
#> 417: 85 19 96397.60
#> 418: 85 43 100013.94
#> 419: 85 56 82308.90
#> 420: 85 73 95857.49
<sup>Created on 2023-06-13 with reprex v2.0.2</sup>
This method appends a data.table to the initial null table at each iteration, it's probably not super fast but should be RAM efficient. It can be optimised if you run into speed problems.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论