Calculating a distance matrix based on spatial link distances (aka neighbor distances)

Question

I would like to calculate a distance matrix between the centroids of the polygons in a region using spatial link distances (aka 'neighbor distances') instead of simple Euclidean distances. The spatial link distance between two polygons is the length of the shortest path between their centroids when travel is restricted to the links between spatial neighbors, with each link weighted by the Euclidean distance between the two neighboring centroids.

In other words, I would like to calculate a spatial link distance matrix. Is there a package or function that does this?
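To make the definition concrete, here is a minimal base R sketch on a made-up chain of four regions. The coordinates and the `nb_toy` list are illustrative assumptions, not part of the guerry data, and the Floyd-Warshall loop is only one of several ways to get shortest paths.

```r
# Hypothetical centroids for four regions arranged as a chain 1 - 2 - 3 - 4
coords <- rbind(c(0, 0), c(3, 4), c(6, 8), c(6, 18))

# Neighbor list in the same shape as an spdep/sfdep nb object
nb_toy <- list(2L, c(1L, 3L), c(2L, 4L), 3L)

# Plain Euclidean distances between all pairs of centroids
eucl <- as.matrix(dist(coords))

# Link matrix: Euclidean distance between neighbors, Inf for non-neighbors
n <- length(nb_toy)
link <- matrix(Inf, n, n)
diag(link) <- 0
for (i in seq_len(n)) {
  link[i, nb_toy[[i]]] <- eucl[i, nb_toy[[i]]]
}

# Floyd-Warshall: allow paths through intermediate region k, one k at a time
for (k in seq_len(n)) {
  link <- pmin(link, outer(link[, k], link[k, ], "+"))
}

link[1, 4]  # 20: the path 1 -> 2 -> 3 -> 4 (5 + 5 + 10)
eucl[1, 4]  # about 18.97: the straight line between centroids 1 and 4
```

Regions 1 and 4 are not neighbors, so their link distance is the sum of the three neighbor links along the chain and is longer than the straight-line distance.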

I've done a quick exploration with sfdep, and the solution I came up with is in the reprex below:

reprex

Step 1: calculate distances between neighboring polygons (dists) using sfdep::st_nb_dists():

library(sf)
library(sfdep)
library(data.table)

# get contiguity and distances between neighbors
# (guerry is an sf object that ships with sfdep)
geo <- sf::st_geometry(guerry)
nb <- sfdep::st_contiguity(geo)
dists <- sfdep::st_nb_dists(geo, nb)

Step 2: I've created a nb_list_to_df() function to convert dists into a data.frame in long format with the distance for every pair of neighboring polygons (od_df):

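# helper to convert neighbor distances to a long-format data.table
nb_list_to_df <- function(nb, dists) {

  # dense n x n matrix: distance for neighbors, 0 elsewhere
  mtrx <- sfdep::wt_as_matrix(nb, dists)

  matrix_length <- seq_len(ncol(mtrx))

  # FULL MATRIX
  # CJ() varies its second column fastest, while as.vector() reads the matrix
  # column by column, so the first index is the column and the second the row.
  # That is harmless here because the neighbor distance matrix is symmetric.
  mtrx_long <- cbind(
    as.data.table(
      data.table::CJ(matrix_length, matrix_length)), # all index pairs
    'dist' = as.vector(mtrx)  # matrix values in a vector
  )

  # keep only distances between neighbors
  mtrx_long <- subset(mtrx_long, dist > 0)
  # names must match the od_df$orig / od_df$dest columns used further down
  setnames(mtrx_long, c('orig', 'dest', 'dist'))

  return(mtrx_long)
}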

# convert nb dist to a data.frame in long format
od_df <- nb_list_to_df(nb, dists)
head(od_df)
#>    orig dest     dist
#> 1:    1   36 90030.63
#> 2:    1   37 87399.28
#> 3:    1   67 55587.69
#> 4:    1   69 85693.43
#> 5:    2    7 85221.38
#> 6:    2   49 75937.49

The final steps are to:

  1. Use od_df to create a network graph using a library like igraph. In this case, I'm using cppRouting because it seems to be the most efficient.
  2. Use the network topology to calculate the distance between all combinations of origin-destination pairs.

# step 3: create a network graph
library(cppRouting)
graph <- makegraph(od_df, directed = FALSE)

# step 4: get the distances considering network topology
nodes <- unique(od_df$orig)
dist_link <- get_distance_matrix(Graph = graph,
                                 from = nodes,
                                 to = nodes)

This code arrives at the desired solution. However, I'm wondering if this is a good approach or if there are better / more efficient ways to do this.
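For comparison, the igraph route mentioned in the list above could look roughly like the sketch below. It assumes the igraph package is installed, and `od_toy` is a made-up edge list standing in for od_df, so the numbers are illustrative only.

```r
library(igraph)

# Hypothetical edge list with the same three columns as od_df
od_toy <- data.frame(
  orig = c(1, 2, 3),
  dest = c(2, 3, 4),
  dist = c(5, 5, 10)
)

# The first two columns define the edges; 'dist' becomes an edge attribute
g <- graph_from_data_frame(od_toy, directed = FALSE)

# All-pairs shortest path lengths, using 'dist' as the edge weight
dist_link_ig <- distances(g, weights = E(g)$dist)

# Rows and columns are labelled with the vertex names, so index by name
dist_link_ig["1", "4"]  # 20
```

If the edge list contains every link in both directions, as a table built from a symmetric matrix does, the undirected graph ends up with duplicate edges. That should not change the shortest path lengths.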


Answer 1

Score: 1

As I understand it, you are concerned that expanding the nb object into a matrix would demand too much RAM, which means finding a different solution for the first part of your code.
Here's a version that does not expand the nb object into a matrix:

library(sf)
library(sfdep)
library(data.table)

# get contiguity and distance between neighbors
geo = sf::st_geometry(guerry)
nb = sfdep::st_contiguity(geo)
dists = sfdep::st_nb_dists(geo, nb)


# If you look at the structure of nb, it's a list with one element per
# geometry in geo (85 here), and each element contains the indices of the
# neighbours of that geometry.
str(nb)
#> List of 85
#>  $ : int [1:4] 36 37 67 69
#>  $ : int [1:6] 7 49 57 58 73 76
#>  $ : int [1:6] 17 21 40 56 61 69
#>  $ : int [1:4] 5 24 79 80
#>  ---
#>  $ : int [1:6] 15 34 35 47 75 83
#>  $ : int [1:6] 15 18 21 22 34 82
#>  $ : int [1:6] 50 52 53 65 66 68
#>  $ : int [1:5] 9 19 43 56 73
#>  - attr(*, "class")= chr [1:2] "nb" "list"
#>  - attr(*, "region.id")= chr [1:85] "1" "2" "3" "4" ...
#>  - attr(*, "call")= language spdep::poly2nb(pl = geometry, queen = queen)
#>  - attr(*, "type")= chr "queen"
#>  - attr(*, "sym")= logi TRUE

# The same goes for dists
str(dists)
#> List of 85
#>  $ : num [1:4] 90031 87399 55588 85693
#>  $ : num [1:6] 85221 75937 125737 86797 102273 ...
#>  $ : num [1:6] 85461 92563 96614 88010 66400 ...
#>  $ : num [1:4] 53294 98250 83762 83974
#> ---
#>  $ : num [1:6] 110004 84644 69201 108549 60336 ...
#>  $ : num [1:6] 89353 81323 69322 96794 102964 ...
#>  $ : num [1:6] 85539 90350 116728 95918 71842 ...
#>  $ : num [1:5] 65601 96398 100014 82309 95857
#>  - attr(*, "class")= chr [1:2] "nbdist" "list"
#>  - attr(*, "call")= language spdep::nbdists(nb = nb, coords = x, longlat = longlat)

# convert nb dist to a data.frame in long format
# You can loop over the regions and collect destinations and distances as you go
n = length(nb)
res = data.table(from = integer(), to = integer(), dist = numeric())
for(i in seq_len(n)){
  res = rbind(res, data.table(from = i, to = nb[[i]], dist = dists[[i]]))
}
res
#>      from to      dist
#>   1:    1 36  90030.63
#>   2:    1 37  87399.28
#>   3:    1 67  55587.69
#>   4:    1 69  85693.43
#>   5:    2  7  85221.38
#>  ---                  
#> 416:   85  9  65601.32
#> 417:   85 19  96397.60
#> 418:   85 43 100013.94
#> 419:   85 56  82308.90
#> 420:   85 73  95857.49

Created on 2023-06-13 with reprex v2.0.2

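This method appends a data.table to the initially empty table at each iteration. It is probably not very fast, because rbind() copies the accumulated table every time, but it should be RAM efficient since the full n x n matrix is never built. It can be optimised if you run into speed problems.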
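If speed does become a problem, one possible optimisation is to avoid growing the table inside the loop. The sketch below shows two options on made-up stand-ins for nb and dists (`nb_toy` and `dists_toy` are hypothetical). It assumes every region has at least one neighbour; spdep encodes a region without neighbours as 0L, and such entries would need to be filtered out first.

```r
library(data.table)

# Hypothetical stand-ins with the same list-of-vectors shape as nb and dists
nb_toy    <- list(2L, c(1L, 3L), c(2L, 4L), 3L)
dists_toy <- list(5, c(5, 5), c(5, 10), 10)

# Option A: build one small table per region, then bind them all at once
res_a <- rbindlist(lapply(seq_along(nb_toy), function(i) {
  data.table(from = i, to = nb_toy[[i]], dist = dists_toy[[i]])
}))

# Option B: fully vectorised, repeating each origin once per neighbour
res_b <- data.table(
  from = rep(seq_along(nb_toy), lengths(nb_toy)),
  to   = unlist(nb_toy),
  dist = unlist(dists_toy)
)

# Both options give the same long-format table, one row per directed link
stopifnot(isTRUE(all.equal(res_a, res_b)))
```

Like the loop, both options keep memory use proportional to the number of neighbour links rather than to n x n, and the result can be passed to the graph-building step in the same way as res.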

huangapple
  • Posted on 2023-06-12 09:20:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76453159.html