Calculating a distance matrix based on spatial link distances (aka neighbor distances)
Question
I would like to calculate a distance matrix between the centroids of the polygons in a region using spatial link distance (aka 'neighbor distance') instead of simple Euclidean distance. The spatial link distance between two polygons is the length of the shortest path between their centroids when travel is restricted to the links between neighboring polygons, with each link measured by its Euclidean length.
In other words, I would like to calculate a spatial link distance matrix. Is there a package / function to do this?
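As a toy illustration of the definition above, here is a minimal base R sketch. The three centroids and the two links are hypothetical, and the Floyd-Warshall loop is only a stand-in to make the idea concrete; it is not how any particular routing package computes its result.

```r
# Hypothetical toy data: three centroids, with neighbor links 1-2 and 2-3 only
coords <- matrix(c(0, 0,
                   3, 0,
                   3, 4), ncol = 2, byrow = TRUE)
links <- rbind(c(1, 2), c(2, 3))

n <- nrow(coords)
euclid <- as.matrix(dist(coords))  # plain Euclidean distances

# Start with Inf everywhere, 0 on the diagonal, and the Euclidean
# length on each neighbor link (in both directions)
link_dist <- matrix(Inf, n, n)
diag(link_dist) <- 0
link_dist[links] <- euclid[links]
link_dist[links[, 2:1]] <- euclid[links[, 2:1]]

# Floyd-Warshall: allow each node k in turn as an intermediate stop
for (k in seq_len(n)) {
  via_k <- outer(link_dist[, k], link_dist[k, ], `+`)
  link_dist <- pmin(link_dist, via_k)
}

euclid[1, 3]     # 5: straight line between centroids 1 and 3
link_dist[1, 3]  # 7: path 1 -> 2 -> 3 along the neighbor links
```

Centroids 1 and 3 are not neighbors, so their link distance (3 + 4) is longer than their Euclidean distance.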
I've done a quick exploration with sfdep and here's the solution I've found in the reprex below:
reprex
Step 1: calculate distances between neighboring polygons (dists) using sfdep::st_nb_dists():
library(sf)
library(sfdep)
library(data.table)
library(fields)
# get contiguity and distance btwn neighbors
geo <- sf::st_geometry(guerry)
nb <- sfdep::st_contiguity(geo)
dists <- sfdep::st_nb_dists(geo, nb)
Step 2: I've created a nb_list_to_df() function to convert dists into a data.frame in long format with the distance for every pair of neighboring polygons (od_df):
# fun to convert nb dist to a data.frame in long format
nb_list_to_df <- function(nb, dists) {
  
  mtrx <- sfdep::wt_as_matrix(nb, dists)
  
  matrix_length <- seq_len(ncol(mtrx))
  
  # FULL MATRIX
  mtrx_long <- cbind(
    as.data.table(
      data.table::CJ(matrix_length, matrix_length)), # df two columns
    'dist' = as.vector(mtrx)  # matrix values in a vector
  )
  
  # keep only dist between neighbors
  mtrx_long <- subset(mtrx_long, dist > 0)
  setnames(mtrx_long, c('orig', 'dest', 'dist'))
  
  return(mtrx_long)
}
# convert nb dist to a data.frame in long format
od_df <- nb_list_to_df(nb, dists)
head(od_df)
#>    orig dest     dist
#> 1:    1   36 90030.63
#> 2:    1   37 87399.28
#> 3:    1   67 55587.69
#> 4:    1   69 85693.43
#> 5:    2    7 85221.38
#> 6:    2   49 75937.49
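One detail worth checking in nb_list_to_df(): as.vector() reads the matrix column by column, while CJ() varies its last column fastest, so the pairing of values and indices is effectively transposed. That is harmless for a symmetric neighbor matrix, but it would swap origins and destinations for an asymmetric one. A minimal base R sketch that reads the indices directly from the matrix, using a hypothetical 3 x 3 matrix in place of the real one:

```r
# Hypothetical 3 x 3 weights matrix: 0 means "not neighbors"
mtrx <- matrix(c(0, 3, 0,
                 3, 0, 4,
                 0, 4, 0), nrow = 3, byrow = TRUE)

# Row/column positions of the non-zero cells
idx <- which(mtrx > 0, arr.ind = TRUE)

# One row per neighbor pair; values are looked up with the same index matrix,
# so each distance stays attached to its own (row, col) pair
od_toy <- data.frame(
  orig = idx[, "row"],
  dest = idx[, "col"],
  dist = mtrx[idx]
)
od_toy <- od_toy[order(od_toy$orig, od_toy$dest), ]
od_toy
```

This still builds the full matrix first, so it does not save memory; it only removes the reliance on symmetry.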
The final steps are to:
- Use od_df to create a network graph using a library like igraph. In this case, I'm using cppRouting because it seems to be the most efficient.
- Use the network topology to calculate the distance between all combinations of origin-destination pairs.
 
# step 3: create a network graph
library(cppRouting)
graph <- makegraph(od_df, directed = FALSE)
# step 4: get the distances considering network topology
dist_link <- get_distance_matrix(Graph=graph, 
                    from = unique(od_df$orig), 
                    to = unique(od_df$orig))
This code arrives at the desired solution. However, I'm wondering if this is a good approach or if there are better / more efficient ways to do this.
Answer 1
Score: 1
As I understand it, you are concerned that expanding the nb object into a matrix would demand too much RAM, which means finding a different solution for the first part of your code.
Here's a version that does not expand the nb object into a matrix:
library(sf)
library(sfdep)
library(data.table)
# get contiguity and distance between neighbors
geo = sf::st_geometry(guerry)
nb = sfdep::st_contiguity(geo)
dists = sfdep::st_nb_dists(geo, nb)
# If you look at the structure of nb, it's a list whose length is
# nrow(geo), and each element contains the indices of the neighbours
# of the corresponding row.
str(nb)
#> List of 85
#>  $ : int [1:4] 36 37 67 69
#>  $ : int [1:6] 7 49 57 58 73 76
#>  $ : int [1:6] 17 21 40 56 61 69
#>  $ : int [1:4] 5 24 79 80
#>  ---
#>  $ : int [1:6] 15 34 35 47 75 83
#>  $ : int [1:6] 15 18 21 22 34 82
#>  $ : int [1:6] 50 52 53 65 66 68
#>  $ : int [1:5] 9 19 43 56 73
#>  - attr(*, "class")= chr [1:2] "nb" "list"
#>  - attr(*, "region.id")= chr [1:85] "1" "2" "3" "4" ...
#>  - attr(*, "call")= language spdep::poly2nb(pl = geometry, queen = queen)
#>  - attr(*, "type")= chr "queen"
#>  - attr(*, "sym")= logi TRUE
# The same goes for dists
str(dists)
#> List of 85
#>  $ : num [1:4] 90031 87399 55588 85693
#>  $ : num [1:6] 85221 75937 125737 86797 102273 ...
#>  $ : num [1:6] 85461 92563 96614 88010 66400 ...
#>  $ : num [1:4] 53294 98250 83762 83974
#> ---
#>  $ : num [1:6] 110004 84644 69201 108549 60336 ...
#>  $ : num [1:6] 89353 81323 69322 96794 102964 ...
#>  $ : num [1:6] 85539 90350 116728 95918 71842 ...
#>  $ : num [1:5] 65601 96398 100014 82309 95857
#>  - attr(*, "class")= chr [1:2] "nbdist" "list"
#>  - attr(*, "call")= language spdep::nbdists(nb = nb, coords = x, longlat = longlat)
# convert nb dist to a data.frame in long format
# You can loop over the elements of nb and collect destinations and distances as you go
n = length(nb)
res = data.table(from = integer(), to = integer(), dist = numeric())
for(i in seq_len(n)){
  res = rbind(res, data.table(from = i, to = nb[[i]], dist = dists[[i]]))
}
res
#>      from to      dist
#>   1:    1 36  90030.63
#>   2:    1 37  87399.28
#>   3:    1 67  55587.69
#>   4:    1 69  85693.43
#>   5:    2  7  85221.38
#>  ---                  
#> 416:   85  9  65601.32
#> 417:   85 19  96397.60
#> 418:   85 43 100013.94
#> 419:   85 56  82308.90
#> 420:   85 73  95857.49
Created on 2023-06-13 with reprex v2.0.2
This method appends a data.table to the initially empty table at each iteration. It's probably not super fast, because the growing table is copied on every rbind(), but it should be RAM efficient. It can be optimised if you run into speed problems.
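If speed does become a problem, one possible optimisation is to build the three columns in a single vectorised step instead of growing the table in a loop. The sketch below uses small hypothetical stand-ins (nb_toy, dists_toy) for nb and dists, and it assumes every region has at least one neighbour, so that the two lists line up element by element.

```r
# Hypothetical stand-ins for `nb` and `dists` (three regions)
nb_toy    <- list(c(2L, 3L), 1L, 1L)
dists_toy <- list(c(10, 20), 10, 20)

# Repeat each origin index once per neighbour, then flatten both lists
res_toy <- data.frame(
  from = rep(seq_along(nb_toy), lengths(nb_toy)),
  to   = unlist(nb_toy),
  dist = unlist(dists_toy)
)
res_toy
```

The same three expressions should work with the real nb and dists objects under the assumption above, and the result can be converted with data.table::setDT() if a data.table is needed.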