How to group data points together based on distance between coordinates in R?

Question

I have a set of data where each sample taken has coordinates associated with it. I also have regulatory language that states that any samples taken within 200 meters of each other are representative of the same site.

My data set contains 159 unique coordinates that I am hoping will translate to fewer sites, which I can then analyze together.

The coordinate matrix looks like this:

    > as.matrix(unique(raw_data[, c("lat","long")]))
                lat      long
      [1,] 25.75038 -80.96646
      [2,] 25.78788 -81.09702
      [3,] 25.75816 -80.99451
      [4,] 25.85593 -80.89979
      [5,] 25.93371 -80.81229
      [6,] 25.95037 -80.83312
      [7,] 25.86704 -81.09979
      [8,] 25.84482 -80.93035
      [9,] 25.83371 -80.88312
     [10,] 25.87538 -81.22480
     [11,] 25.88729 -81.26125
     [12,] 25.87676 -81.22787
     [13,] 26.16820 -81.08820
     [14,] 25.74760 -80.94979
     [15,] 25.79030 -80.89110
     [16,] 25.77390 -80.93390
     [17,] 25.88780 -81.26170
     [18,] 25.87664 -81.22823
     [19,] 26.22222 -81.17222
     [20,] 25.88764 -81.26188
     [21,] 25.89092 -81.26972
     [22,] 25.99452 -81.26270
     [23,] 26.19736 -81.26716
     [24,] 25.90036 -81.26199
     [25,] 26.17218 -81.26681
     [26,] 26.09577 -81.26506
     [27,] 26.16925 -81.08729
     [28,] 25.77806 -80.84444
     [29,] 25.88778 -81.26167
     [30,] 25.87639 -81.21778
     [31,] 25.85190 -80.98100
     [32,] 25.85192 -80.98103
     [33,] 25.85222 -80.98083
     [34,] 25.87222 -81.01861
     [35,] 25.86385 -81.10096
     [36,] 25.86361 -81.10111
     [37,] 25.84341 -80.91720
     [38,] 25.84306 -80.91778
     [39,] 25.89030 -81.27030
     [40,] 25.89028 -81.27025
     [41,] 25.89056 -81.27000
     [42,] 25.88650 -81.26210
     [43,] 25.88653 -81.26208
     [44,] 25.88694 -81.26194
     [45,] 25.76190 -80.85330
     [46,] 25.78544 -80.85119
     [47,] 26.04472 -81.29972
     [48,] 26.04430 -81.29990
     [49,] 26.04431 -81.29992
     [50,] 26.09280 -81.05390
     [51,] 26.09278 -81.05392
     [52,] 26.16000 -81.22639
     [53,] 26.19600 -81.28870
     [54,] 26.19597 -81.28869
     [55,] 26.19639 -81.28861
     [56,] 26.16250 -81.24170
     [57,] 26.16390 -81.17360
     [58,] 26.16530 -81.09310
     [59,] 26.09440 -81.26670
     [60,] 25.78890 -80.85690
     [61,] 25.85280 -81.02920
     [62,] 25.84440 -80.97080
     [63,] 25.87360 -81.22920
     [64,] 25.94720 -81.26250
     [65,] 25.91917 -80.83639
     [66,] 25.84310 -80.91770
     [67,] 25.78611 -81.20056
     [68,] 25.71350 -81.02190
     [69,] 25.71347 -81.02192
     [70,] 25.71389 -81.02167
     [71,] 25.72080 -80.87220
     [72,] 25.76390 -81.07500
     [73,] 26.19130 -81.08680
     [74,] 26.19128 -81.08675
     [75,] 26.19167 -81.08667
     [76,] 25.68722 -80.91972
     [77,] 26.20500 -81.16833
     [78,] 25.78861 -81.09991
     [79,] 25.76027 -81.04831
     [80,] 25.76050 -80.99626
     [81,] 25.75065 -80.96644
     [82,] 25.76126 -80.90782
     [83,] 25.74659 -80.95390
     [84,] 25.81790 -81.10038
     [85,] 25.86360 -81.10120
     [86,] 25.95730 -81.10380
     [87,] 25.95725 -81.10383
     [88,] 25.95750 -81.10361
     [89,] 26.15640 -81.22190
     [90,] 26.15644 -81.22192
     [91,] 26.15694 -81.22167
     [92,] 26.15560 -81.26650
     [93,] 25.78371 -81.19146
     [94,] 25.77861 -80.91194
     [95,] 25.78389 -80.92528
     [96,] 25.77820 -80.91220
     [97,] 25.77822 -80.91222
     [98,] 25.96840 -80.92640
     [99,] 25.96839 -80.92636
    [100,] 25.96889 -80.92611
    [101,] 25.78944 -81.10000
    [102,] 25.78510 -81.08313
    [103,] 25.78910 -81.10010
    [104,] 25.78908 -81.10011
    [105,] 25.78760 -81.09896
    [106,] 25.87777 -81.23770
    [107,] 25.84714 -80.93604
    [108,] 25.82152 -80.89180
    [109,] 25.83346 -80.84754
    [110,] 25.90154 -81.31661
    [111,] 26.08724 -81.26474
    [112,] 25.93417 -80.83276
    [113,] 25.86557 -80.84374
    [114,] 25.85072 -80.97178
    [115,] 25.85074 -80.97178
    [116,] 26.16790 -81.16497
    [117,] 25.86968 -81.15835
    [118,] 26.04585 -81.26362
    [119,] 26.16682 -81.22861
    [120,] 25.76117 -80.88047
    [121,] 25.75742 -80.98732
    [122,] 26.15562 -81.29818
    [123,] 25.98263 -81.26223
    [124,] 25.79000 -80.87640
    [125,] 25.77560 -80.90440
    [126,] 25.86361 -81.10117
    [127,] 25.78857 -81.09992
    [128,] 25.82480 -80.89610
    [129,] 25.87244 -81.18669
    [130,] 25.76427 -80.83034
    [131,] 25.90145 -81.32419
    [132,] 25.80220 -80.86970
    [133,] 25.83288 -80.90423
    [134,] 25.85157 -80.98093
    [135,] 25.82538 -80.89562
    [136,] 25.86427 -81.09979
    [137,] 25.80566 -80.87312
    [138,] 25.90361 -81.31417
    [139,] 25.78816 -80.85507
    [140,] 25.90060 -81.30420
    [141,] 25.89121 -81.27008
    [142,] 25.85177 -80.98035
    [143,] 25.88772 -81.26163
    [144,] 25.84954 -80.95590
    [145,] 25.89080 -81.26980
    [146,] 25.89080 -81.26981
    [147,] 26.16564 -81.24702
    [148,] 25.89843 -81.26480
    [149,] 25.82481 -80.89611
    [150,] 25.87639 -81.22811
    [151,] 25.85185 -80.98065
    [152,] 26.05667 -81.15583
    [153,] 25.86503 -80.84377
    [154,] 25.79065 -80.85630
    [155,] 25.87581 -81.21880
    [156,] 25.87585 -81.21883
    [157,] 25.80406 -80.85364
    [158,] 25.89177 -81.26959
    [159,] 25.89185 -81.26954

I have already mapped this out using leaflet and know that there are several samples taken very close together, but I am hoping there is a way to test if any coordinates are 200 meters apart or closer from any other coordinates and assign a group to them as a third column in the matrix (sites 1-xx). From there I could use dplyr's group_by function to summarize the 6,000+ samples by site instead of coordinate pairs to get an analysis with far fewer groups of data.

Edit

To answer some questions:

If sample A is <200m from sample B, which is <200m from sample C, while A is >200m from C, they should all be put into one group.

I have been able to use geodist to make a distance matrix of all sites, including self-matches, but I am unsure how to use that matrix to assign a new column value to all isolated samples and grouped samples.

This is what I have now:

    map.dat = unique(raw_data[, c("station name","lat","long")])
    geo.dist = geodist(map.dat)
    colnames(geo.dist) = as.vector(map.dat$`station name`)
    rownames(geo.dist) = as.vector(map.dat$`station name`)

    > data.frame(map.dat)
    station.name lat long
    1 10B CULVERT 24 ON LOOP ROAD NR PINECREST F 25.75038020 -80.96645810
    2 10B CYPRESS STRD OFF SR 94 NR PINECREST FL 25.78787950 -81.09701680
    3 10B CYPRESS SWP DR AT SR 94 NR PINCREST FL 25.75815780 -80.99451440
    4 10B CYPRESS SWP NR JETPORT BORROW PIT 3 NR 25.85593150 -80.89978890
    5 10B L-28 EAST CA NR PINECREST FLA 25.93370590 -80.81228610
    6 10B LAKE OKEECHOBEE AT OKEECHOBEE FLA 25.95037200 -80.83311990
    7 10B TAMIAMI CA AT BR 96 AT MONROE FLA 25.86704310 -81.09979410
    8 10B TAMIAMI CA AT JETPORT ENTRANCE NR MIAM 25.84482090 -80.93034540
    9 10B TAMIAMI CANAL AT BR 115 NEAR MIAMI FLA 25.83371000 -80.88312190
    10 10B TAMIAMI CANAL AT BRIDGE 86 NR OCHOPPE 25.87537680 -81.22479730
    11 21FLSFWMBC16 25.88729000 -81.26125000
    12 21FLSFWMBC17 25.87676000 -81.22787000
    13 AABR265 26.16820000 -81.08820000
    14 AT BR 115 COLLIER COUNTY, FLA 25.83371000 -80.88312190
    15 AT BR 96 MONROE, FLA 25.86704310 -81.09979410
    16 BARROW CA AT SR 94 PINECREST FLA 25.74760250 -80.94979100
    17 BASIN, CONCH KEY, BAYSIDE 25.79030000 -80.89110000
    18 BAY, FLORIDA, TOM@S HARBOR 25.77390000 -80.93390000
    19 BC16 25.88780000 -81.26170000
    20 BC17 25.87664160 -81.22823330
    21 BCAP2 26.22222220 -81.17222220
    22 Bear Island Loop - at Williams Wayside Prkk and Turner Riv 25.88764417 -81.26188444
    23 Bear Island Loop - BCA8; on US 41 at Turner River 25.89092417 -81.26972167
    24 Bear Island Loop - corner of Wagon Wheel and Turner River Rd 25.99452160 -81.26269700
    25 Bear Island Loop - On Perocchi Grade Road at Et Hinson Marsh 26.19736306 -81.26715889
    26 Bear Island Loop - On Turner Riv Rd @ Turner River Headwater 25.90035583 -81.26198917
    27 Bear Island Loop; BR030169 On Turner River Road 26.17217970 -81.26681100
    28 Bear Island Loop; On Turner River Road at Fire Prairie trail 26.09576583 -81.26506444
    29 BIG CYPRESS WATERSHED EVERGLADES PARKWAY, NR. BI 26.16925350 -81.08729240
    30 BIG CYPRESS WATERSHED NEAR SUNNILAND,FLA 26.16925350 -81.08729240
    31 BRIDGE #25 ON U.S.41 2 MILES WEST OF S-12A 25.77805560 -80.84444440
    32 Bridge #84 on US 41E 25.88777780 -81.26166670
    33 Bridge #86 on US 41E 25.87638890 -81.21777780
    34 BRIDGE 105 25.85190000 -80.98100000
    35 BRIDGE 105 25.85191670 -80.98102780
    36 BRIDGE 105 25.85222220 -80.98083330
    37 Bridge 30090 on US41E 25.87222220 -81.01861110
    38 Bridge 30096 at intersection of US41 and Loop Road 25.86385000 -81.10096000
    39 Bridge 30096 at intersection of US41 and Loop Road 25.86361110 -81.10111110
    40 Bridge 30105 on US41 East 25.84341000 -80.91720000
    41 Bridge 30105 on US41 East 25.84305560 -80.91777780
    42 BRIDGE 83 25.89030000 -81.27030000
    43 BRIDGE 83 25.89027780 -81.27025000
    44 BRIDGE 83 25.89055560 -81.27000000
    45 BRIDGE 84 25.88650000 -81.26210000
    46 BRIDGE 84 25.88652780 -81.26208330
    47 BRIDGE 84 25.88694440 -81.26194440
    48 C-4 TAMIAMI CANAL ABOVE S-12A 25.76190000 -80.85330000
    49 Dad-L29-1 25.78544440 -80.85119400
    50 DEEP LAKE 26.04472220 -81.29972220
    51 DEEP LAKE STRAND 26.04430000 -81.29990000
    52 DEEP LAKE STRAND 26.04430560 -81.29991670
    53 EAST CROSSING STRAND 26.09280000 -81.05390000
    54 EAST CROSSING STRAND 26.09277780 -81.05391670
    55 EAST CROSSING STRAND 26.16000000 -81.22638890
    56 EAST HINSON MARSH 26.19600000 -81.28870000
    57 EAST HINSON MARSH 26.19597220 -81.28869440
    58 EAST HINSON MARSH 26.19638890 -81.28861110
    59 EV-BR. ALL ALLEY HWY 84 MI 127 26.16250000 -81.24170000
    60 EV-BR. ALL ALLEY HWY 84 MI 132 26.16390000 -81.17360000
    61 EV-BR. ALL ALLEY HWY 84 MI 137 26.16530000 -81.09310000
    62 EV-BRIDGE OVER TURNER RIVER CNL 26.09440000 -81.26670000
    63 EV-CANAL NR CONSERVATION AREA #3 25.78890000 -80.85690000
    64 EV-TAMIAMI TRL-BRIDGE NO 100 25.85280000 -81.02920000
    65 EV-TAMIAMI TRL-BRIDGE NO 105 25.84440000 -80.97080000
    66 EV-TAMIAMI TRL-BRIDGE NO 86 25.87360000 -81.22920000
    67 EV-TURNER R CNL NR HWY 840A 25.94720000 -81.26250000
    68 GATED CULVERT 25.91916670 -80.83638890
    69 GATOR 25.84310000 -80.91770000
    70 Gator Hook Strand 25.78611110 -81.20055560
    71 GUM SLOUGH 25.71350000 -81.02190000
    72 GUM SLOUGH 25.71347220 -81.02191670
    73 GUM SLOUGH 25.71388890 -81.02166670
    74 JP-BRIDGE ON LOOP ROAD-STATE 94 25.72080000 -80.87220000
    75 JP-GUM SLOUGH, LOOP ROAD 25.76390000 -81.07500000
    76 KISSIMMEE BILLY STRAND 26.19130000 -81.08680000
    77 KISSIMMEE BILLY STRAND 26.19127780 -81.08675000
    78 KISSIMMEE BILLY STRAND 26.19166670 -81.08666670
    79 Lime Tree Hammock 25.68722220 -80.91972220
    80 LITTLE MARSK 26.20500000 -81.16833330
    81 Loop Rd - Bridge 6 near BCA11; 5 mi S from Monroe Stat 25.78860833 -81.09991222
    82 Loop Road - Bridge 29; 14.55 mi W of 40 Mile Bend 25.76027222 -81.04830667
    83 Loop Road - Bridge 32N; 11.3 miles W of 40 Mile Bend 25.76049720 -80.99625600
    84 Loop Road - Bridge 37; 9.3 miles W of 40 Mile Bend 25.75065278 -80.96644306
    85 Loop Road - Loop 1; 5 miles W of 40 Mile bend 25.76125556 -80.90782361
    86 Loop Road - Loop 2; Crooked Culvert ? Culvert 46; 3 8.3 m 25.74659167 -80.95390139
    87 Loop Road - Robert Lake Strand Culvert 25.81789720 -81.10037780
    88 MONROE 25.86360000 -81.10120000
    89 MONUMENT ROAD 25.95730000 -81.10380000
    90 MONUMENT ROAD 25.95725000 -81.10383330
    91 MONUMENT ROAD 25.95750000 -81.10361110
    92 MULLET SLOUGH 26.15640000 -81.22190000
    93 MULLET SLOUGH 26.15644440 -81.22191670
    94 MULLET SLOUGH 26.15694440 -81.22166670
    95 NORTH SIDE OF ALLIGATOR ALLEY 15 MI. W.OF BROWARD 26.15560000 -81.26650000
    96 P-6 GATOR HOOK STRAND AT MANGROVE FRINGE 25.78371360 -81.19146360
    97 PINECREST 25.77861110 -80.91194440
    98 Pinecrest Flowway 25.78388890 -80.92527780
    99 PINECREST HAMMOCK 25.77820000 -80.91220000
    100 PINECREST HAMMOCK 25.77822220 -80.91222220
    101 RACCOON POINT 25.96840000 -80.92640000
    102 RACCOON POINT 25.96838890 -80.92636110
    103 RACCOON POINT 25.96888890 -80.92611110
    104 ROBERTS LAKE 25.78944440 -81.10000000
    105 ROBERTS LAKE SLOUGH NEAR MONROE, FLA. 25.78510180 -81.08312760
    106 ROBERTS LAKE STRAND 25.78910000 -81.10010000
    107 ROBERTS LAKE STRAND 25.78908330 -81.10011110
    108 Roberts Lake Strand off Loop Road Nr Monroe St., F 25.78760000 -81.09896000
    109 SF1-LR-2003 TAMIAMI CANAL 25.87776690 -81.23770300
    110 SF1-LR-2013 TAMIAMI CANAL 25.84714330 -80.93604100
    111 SF1-LR-2018 TAMIAMI CANAL 25.82152330 -80.89179600
    112 SF1-LR-2027 UNNAMED SMALL STREAM 25.83345630 -80.84754300
    113 SF1-LR-2030 TAMIAMI CANAL 25.90154440 -81.31660800
    114 SF1-SS-2127 UNNAMED SMALL STREAM 26.08723800 -81.26474300
    115 SF5-LR-2029 L-28 25.93417020 -80.83276100
    116 SFC-HS-1004 UNKNOWN 25.86556940 -80.84373860
    117 SFC-HS-1015 Unknown 25.85072220 -80.97177700
    118 SFC-HS-1015 UNKNOWN 25.85073690 -80.97178420
    119 SFC-HS-1017 UNKNOWN 26.16789830 -81.16496810
    120 SFC-HS-1020 UNKNOWN 25.86967640 -81.15835360
    121 SFC-HS-1021 UNKNOWN 26.04585440 -81.26361810
    122 SFC-HS-1027 UNKNOWN 26.16681810 -81.22861080
    123 SFC-HS-1029 UNKNOWN 25.76117440 -80.88047420
    124 SFC-HS-1030 UNKNOWN 25.75741690 -80.98731940
    125 SFC-HS-1031 UNKNOWN 26.15561610 -81.29818250
    126 SFC-HS-1032 UNKNOWN 25.98263080 -81.26222940
    127 SOUND, ATLANTIC, LONG KEY BRIDGE 25.79000000 -80.87640000
    128 SOUND, ATLANTIC, TOM@S HARBOR CU 25.77560000 -80.90440000
    129 South canal@ Monroe Station 25.86361110 -81.10116600
    130 SWEETWATER STRAND AT LOOP RD. NR MONROE STATION FL 25.78857220 -81.09992220
    131 TAMBR115 25.82480000 -80.89610000
    132 TAMBR90 25.87243610 -81.18668880
    133 TAMIAMI C AT 40 MI BEND,NR MIAMI,FLA. 25.76426800 -80.83034310
    134 Tamiami Canal - Near Big Cypress Visitors Center 25.90145000 -81.32419000
    135 TAMIAMI CANAL 4 M WEST OF 40 M BEND 25.80220000 -80.86970000
    136 TAMIAMI CANAL AT 40-MILE BEND, NEAR MIAMI, FLA. 25.76426800 -80.83034310
    137 TAMIAMI CANAL AT BR 86 NEAR OCHOPEE FLA (AUX) 25.87537680 -81.22479730
    138 tamiami canal at bridge 030114 nr miami, fl 25.83288000 -80.90423000
    139 TAMIAMI CANAL AT BRIDGE 105 NR MONROE, FL 25.85156670 -80.98093330
    140 TAMIAMI CANAL AT BRIDGE 115, NEAR MIAMI, FLA. 25.82537710 -80.89562230
    141 TAMIAMI CANAL AT BRIDGE 96, AT MONROE, FLA. 25.86426540 -81.09979420
    142 TAMIAMI CANAL AT COLLIER CO. LINE 25.80565550 -80.87312180
    143 TAMIAMI CANAL AT INTERSECTION OF S.R. 839 AND U.S. 41 EAST 25.90361110 -81.31416670
    144 tamiami canal culvert below s343b nr miami, fl 25.78816000 -80.85507000
    145 TAMIAMI CANAL OCHOPEE 25.90060000 -81.30420000
    146 TAMIAMI CANAL OUTLETS AT BRIDGE 83 25.89120970 -81.27007620
    147 TAMIAMI CANAL OUTLETS, 40-MILE BEND TO MONROE, FL 25.85176530 -80.98034670
    148 TAMIAMI CANAL OUTLETS, MONROE TO CARNESTOWN, FLA 25.88772090 -81.26163430
    149 TAMIAMI CN AT HIGHWAY 25.84954300 -80.95590160
    150 TURNER 25.89080000 -81.26980000
    151 Turner R. @ US 41 25.89079900 -81.26981100
    152 TURNER RIVER CANAL AT ALLIGATOR ALLEY 26.16564300 -81.24701870
    153 TURNER RIVER NORTH OF US-41 25.89843160 -81.26479820
    154 US 41 Bridge #030115 25.82480550 -80.89611100
    155 US 41 Canal 2.6 mi. E. of Turner River 25.87638880 -81.22811100
    156 WATER QUALITY MONITORING STATION 25.85185333 -80.98064944
    157 WEST MUD LAKE 26.05666670 -81.15583330
    158 Z5-CN-11014 UNNAMED CANAL 25.86503330 -80.84377500
    159 Z5-CN-14007 UNNAMED CANAL 25.79064721 -80.85629750
    160 Z5-LR-3013 Tamiami Canal 25.87581130 -81.21879600
    161 Z5-LR-3013R Tamiami Canal 25.87584970 -81.21883100
    162 Z5-LR-3014R L-28 25.80405520 -80.85364100
    163 Z5-SS-4079 TURNER RIVER 25.89176970 -81.26959000
    164 Z5-SS-4079R TURNER RIVER 25.89184940 -81.26953600

All station names are unique, but a few station names have the same coordinates.

Ultimately, I would like the output to be the same as the map.dat object, but with an extra $group column that has an ID number for each isolated station and each group/chain of nearby stations.

Answer 1

Score: 2

    library(geodist)
    library(tidyverse)

    # swap order for lon-lat
    df2 <- data.frame(lon = df1$long, lat = df1$lat)
    dist_matrix <- geodist(df2)

    as.data.frame(dist_matrix) %>%
      mutate(row = paste0("V", row_number())) %>%
      pivot_longer(-row, names_to = "match") %>%
      filter(value < 200, row != match) %>%
      filter(row < match) # if we only want one row per link

This outputs all the pairs within 200 meters:

    # A tibble: 143 × 3
       row   match value
       <chr> <chr> <dbl>
     1 V1    V81    30.1
     2 V11   V17    72.3
     3 V11   V20    74.0
     4 V11   V29    68.8
     5 V11   V42   122.
     6 V11   V43   118.
     7 V11   V44    79.2
     8 V11   V143   61.0
     9 V12   V18    38.4
    10 V12   V150   47.6
    # ℹ 133 more rows
    # ℹ Use `print(n = ...)` to see more rows

I'm unclear on how you want the grouping to work if, say, A is 150 meters from B, and B is 150 meters from C, but A is >200 meters from C. Are those one group or two? I would probably turn to igraph/tidygraph to assign clusters ("cliques") based on links, but not sure how that should be implemented.

https://igraph.org/r/doc/cliques.html
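
Since the question's edit says that a chain (A near B, B near C) should count as one site, connected components rather than cliques look like the relevant notion. One way to get them without any graph package is single-linkage hierarchical clustering cut at 200 m, which should give the same grouping as the connected components of the "within 200 m" graph. The following is only a minimal base-R sketch under stated assumptions: `haversine_m` is a hypothetical helper (spherical Earth), and the four toy points are made up for illustration, not taken from the data.

```r
# Hypothetical helper: great-circle (haversine) distance in metres,
# assuming a spherical Earth with mean radius 6371008 m.
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371008) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Toy data (made up): A, B, C form a chain with about 167 m between
# neighbours, so A and C are more than 200 m apart; D is far away.
pts <- data.frame(
  name = c("A", "B", "C", "D"),
  lat  = c(25.0000, 25.0015, 25.0030, 25.1000),
  long = c(-81.0000, -81.0000, -81.0000, -81.0000)
)

# Full pairwise distance matrix in metres.
idx <- seq_len(nrow(pts))
d <- outer(idx, idx, function(i, j) {
  haversine_m(pts$lat[i], pts$long[i], pts$lat[j], pts$long[j])
})

# Single linkage merges two clusters as soon as any pair of their members
# is close enough, so cutting the tree at 200 m links chains like A-B-C.
tree <- hclust(as.dist(d), method = "single")
pts$group <- cutree(tree, h = 200)

pts
```

With a real distance matrix (for example the one returned by `geodist()`) in place of `d`, the last two steps would presumably carry over unchanged, and the resulting `group` column could then be joined back to the samples.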


Sample data

    df1 <- data.frame(
      lat = c(25.75038,25.78788,25.75816,25.85593,
              25.93371,25.95037,25.86704,25.84482,25.83371,25.87538,
              25.88729,25.87676,26.1682,25.7476,25.7903,25.7739,25.8878,
              25.87664,26.22222,25.88764,25.89092,25.99452,26.19736,
              25.90036,26.17218,26.09577,26.16925,25.77806,25.88778,25.87639,
              25.8519,25.85192,25.85222,25.87222,25.86385,25.86361,
              25.84341,25.84306,25.8903,25.89028,25.89056,25.8865,25.88653,
              25.88694,25.7619,25.78544,26.04472,26.0443,26.04431,
              26.0928,26.09278,26.16,26.196,26.19597,26.19639,26.1625,
              26.1639,26.1653,26.0944,25.7889,25.8528,25.8444,25.8736,
              25.9472,25.91917,25.8431,25.78611,25.7135,25.71347,25.71389,
              25.7208,25.7639,26.1913,26.19128,26.19167,25.68722,26.205,
              25.78861,25.76027,25.7605,25.75065,25.76126,25.74659,
              25.8179,25.8636,25.9573,25.95725,25.9575,26.1564,26.15644,
              26.15694,26.1556,25.78371,25.77861,25.78389,25.7782,
              25.77822,25.9684,25.96839,25.96889,25.78944,25.7851,25.7891,
              25.78908,25.7876,25.87777,25.84714,25.82152,25.83346,
              25.90154,26.08724,25.93417,25.86557,25.85072,25.85074,26.1679,
              25.86968,26.04585,26.16682,25.76117,25.75742,26.15562,
              25.98263,25.79,25.7756,25.86361,25.78857,25.8248,25.87244,
              25.76427,25.90145,25.8022,25.83288,25.85157,25.82538,
              25.86427,25.80566,25.90361,25.78816,25.9006,25.89121,25.85177,
              25.88772,25.84954,25.8908,25.8908,26.16564,25.89843,
              25.82481,25.87639,25.85185,26.05667,25.86503,25.79065,
              25.87581,25.87585,25.80406,25.89177,25.89185),
      long = c(-80.96646,-81.09702,-80.99451,-80.89979,
               -80.81229,-80.83312,-81.09979,-80.93035,-80.88312,
               -81.2248,-81.26125,-81.22787,-81.0882,-80.94979,-80.8911,
               -80.9339,-81.2617,-81.22823,-81.17222,-81.26188,-81.26972,
               -81.2627,-81.26716,-81.26199,-81.26681,-81.26506,-81.08729,
               -80.84444,-81.26167,-81.21778,-80.981,-80.98103,-80.98083,
               -81.01861,-81.10096,-81.10111,-80.9172,-80.91778,-81.2703,
               -81.27025,-81.27,-81.2621,-81.26208,-81.26194,-80.8533,
               -80.85119,-81.29972,-81.2999,-81.29992,-81.0539,-81.05392,
               -81.22639,-81.2887,-81.28869,-81.28861,-81.2417,-81.1736,
               -81.0931,-81.2667,-80.8569,-81.0292,-80.9708,-81.2292,
               -81.2625,-80.83639,-80.9177,-81.20056,-81.0219,-81.02192,
               -81.02167,-80.8722,-81.075,-81.0868,-81.08675,-81.08667,
               -80.91972,-81.16833,-81.09991,-81.04831,-80.99626,-80.96644,
               -80.90782,-80.9539,-81.10038,-81.1012,-81.1038,-81.10383,
               -81.10361,-81.2219,-81.22192,-81.22167,-81.2665,-81.19146,
               -80.91194,-80.92528,-80.9122,-80.91222,-80.9264,-80.92636,
               -80.92611,-81.1,-81.08313,-81.1001,-81.10011,-81.09896,
               -81.2377,-80.93604,-80.8918,-80.84754,-81.31661,-81.26474,
               -80.83276,-80.84374,-80.97178,-80.97178,-81.16497,-81.15835,
               -81.26362,-81.22861,-80.88047,-80.98732,-81.29818,
               -81.26223,-80.8764,-80.9044,-81.10117,-81.09992,-80.8961,
               -81.18669,-80.83034,-81.32419,-80.8697,-80.90423,-80.98093,
               -80.89562,-81.09979,-80.87312,-81.31417,-80.85507,-81.3042,
               -81.27008,-80.98035,-81.26163,-80.9559,-81.2698,-81.26981,
               -81.24702,-81.2648,-80.89611,-81.22811,-80.98065,-81.15583,
               -80.84377,-80.8563,-81.2188,-81.21883,-80.85364,-81.26959,
               -81.26954)
    )

Answer 2

Score: 1

    string = "
    [1,] 25.75038 -80.96646
    [2,] 25.78788 -81.09702
    [3,] 25.75816 -80.99451
    [4,] 25.85593 -80.89979
    [5,] 25.93371 -80.81229
    [6,] 25.95037 -80.83312
    [7,] 25.86704 -81.09979
    [8,] 25.84482 -80.93035
    [9,] 25.83371 -80.88312
    [10,] 25.87538 -81.22480
    [11,] 25.88729 -81.26125
    [12,] 25.87676 -81.22787
    [13,] 26.16820 -81.08820
    [14,] 25.74760 -80.94979
    [15,] 25.79030 -80.89110
    [16,] 25.77390 -80.93390
    [17,] 25.88780 -81.26170
    [18,] 25.87664 -81.22823
    [19,] 26.22222 -81.17222
    [20,] 25.88764 -81.26188
    [21,] 25.89092 -81.26972
    [22,] 25.99452 -81.26270
    [23,] 26.19736 -81.26716
    [24,] 25.90036 -81.26199
    [25,] 26.17218 -81.26681
    [26,] 26.09577 -81.26506
    [27,] 26.16925 -81.08729
    [28,] 25.77806 -80.84444
    [29,] 25.88778 -81.26167
    [30,] 25.87639 -81.21778
    [31,] 25.85190 -80.98100
    [32,] 25.85192 -80.98103
    [33,] 25.85222 -80.98083
    [34,] 25.87222 -81.01861
    [35,] 25.86385 -81.10096
    [36,] 25.86361 -81.10111
    [37,] 25.84341 -80.91720
    [38,] 25.84306 -80.91778
    [39,] 25.89030 -81.27030
    [40,] 25.89028 -81.27025
    [41,] 25.89056 -81.27000
    [42,] 25.88650 -81.26210
    [43,] 25.88653 -81.26208
    [44,] 25.88694 -81.26194
    [45,] 25.76190 -80.85330
    [46,] 25.78544 -80.85119
    [47,] 26.04472 -81.29972
    [48,] 26.04430 -81.29990
    [49,] 26.04431 -81.29992
    [50,] 26.09280 -81.05390
    [51,] 26.09278 -81.05392
    [52,] 26.16000 -81.22639
    [53,] 26.19600 -81.28870
    [54,] 26.19597 -81.28869
    [55,] 26.19639 -81.28861
    [56,] 26.16250 -81.24170
    [57,] 26.16390 -81.17360
    [58,] 26.16530 -81.09310
    [59,] 26.09440 -81.26670
    [60,] 25.78890 -80.85690
    [61,] 25.85280 -81.02920
    [62,] 25.84440 -80.97080
    [63,] 25.87360 -81.22920
    [64,] 25.94720 -81.26250
    [65,] 25.91917 -80.83639
    [66,] 25.84310 -80.91770
    [67,] 25.78611 -81.20056
    [68,] 25.71350 -81.02190
    [69,] 25.71347 -81.02192
    [70,] 25.71389 -81.02167
    [71,] 25.72080 -80.87220
    [72,] 25.76390 -81.07500
    [73,] 26.19130 -81.08680
    [74,] 26.19128 -81.08675
    [75,] 26.19167 -81.08667
    [76,] 25.68722 -80.91972
    [77,] 26.20500 -81.16833
    [78,] 25.78861 -81.09991
    [79,] 25.76027 -81.04831
    [80,] 25.76050 -80.99626
    [81,] 25.75065 -80.96644
    [82,] 25.76126 -80.90782
    [83,] 25.74659 -80.95390
    [84,] 25.81790 -81.10038
    [85,] 25.86360 -81.10120
    [86,] 25.95730 -81.10380
    [87,] 25.95725 -81.10383
    [88,] 25.95750 -81.10361
    [89,] 26.15640 -81.22190
    [90,] 26.15644 -81.22192
    [91,] 26.15694 -81.22167
    [92,] 26.15560 -81.26650
    [93,] 25.78371 -81.19146
    [94,] 25.77861 -80.91194
    [95,] 25.78389 -80.92528
    [96,] 25.77820 -80.91220
    [97,] 25.77822 -80.91222
    [98,] 25.96840 -80.92640
    [99,] 25.96839 -80.92636
    [100,] 25.96889 -80.92611
    [101,] 25.78944 -81.10000
    [102,] 25.78510 -81.08313
    [103,] 25.78910 -81.10010
    [104,] 25.78908 -81.10011
    [105,] 25.78760 -81.09896
    [106,] 25.87777 -81.23770
    [107,] 25.84714 -80.93604
    [108,] 25.82152 -80.89180
    [109,] 25.83346 -80.84754
    [110,] 25.90154 -81.31661
    [111,] 26.08724 -81.26474
    [112,] 25.93417 -80.83276
    [113,] 25.86557 -80.84374
    [114,] 25.85072 -80.97178
    [115,] 25.85074 -80.97178
    [116,] 26.16790 -81.16497
    [117,] 25.86968 -81.15835
    [118,] 26.04585 -81.26362
    [119,] 26.16682 -81.22861
    [120,] 25.76117 -80.88047
    [121,] 25.75742 -80.98732
    [122,] 26.15562 -81.29818
    [123,] 25.98263 -81.26223
    [124,] 25.79000 -80.87640
    [125,] 25.77560 -80.90440
    [126,] 25.86361 -81.10117
    [127,] 25.78857 -81.09992
    [128,] 25.82480 -80.89610
    [129,] 25.87244 -81.18669
    [130,] 25.76427 -80.83034
    [131,] 25.90145 -81.32419
    [132,] 25.80220 -80.86970
    [133,] 25.83288 -80.90423
    [134,] 25.85157 -80.98093
    [135,] 25.82538 -80.89562
    [136,] 25.86427 -81.09979
    [137,] 25.80566 -80.87312
    [138,] 25.90361 -81.31417
    [139,] 25.78816 -80.85507
    [140,] 25.90060 -81.30420
    [141,] 25.89121 -81.27008
    [142,] 25.85177 -80.98035
    [143,] 25.88772 -81.26163
    [144,] 25.84954 -80.95590
    [145,] 25.89080 -81.26980
    [146,] 25.89080 -81.26981
    [147,] 26.16564 -81.24702
    [148,] 25.89843 -81.26480
    [149,] 25.82481 -80.89611
    [150,] 25.87639 -81.22811
    [151,] 25.85185 -80.98065
    [152,] 26.05667 -81.15583
    [153,] 25.86503 -80.84377
    [154,] 25.79065 -80.85630
    [155,] 25.87581 -81.21880
    [156,] 25.87585 -81.21883
    [157,] 25.80406 -80.85364
    [158,] 25.89177 -81.26959
    [159,] 25.89185 -81.26954"
library(stringr)
library(dplyr)
library(geosphere)

# Parse the printed matrix: latitudes are the decimals followed by " -",
# longitudes are the negative decimals.
df_coords <- data.frame(
  loc = paste0("loc", 1:159),
  lat = as.numeric(str_extract_all(string, "\\d+\\.\\d+(?= \\-)")[[1]]),
  long = as.numeric(str_extract_all(string, "-\\d+\\.\\d+")[[1]])
)

# All ordered pairs of locations (expand.grid() is base R, so plyr is not needed)
d <- expand.grid(loc1 = df_coords$loc, loc2 = df_coords$loc)

# Attach coordinates for loc1 (lat, long) and for loc2 (latloc2, longloc2)
d <- d %>%
  inner_join(df_coords, by = c("loc1" = "loc"), suffix = c("", "loc1")) %>%
  inner_join(df_coords, by = c("loc2" = "loc"), suffix = c("", "loc2"))

# Haversine distance in metres, computed for all rows at once;
# keep pairs of distinct locations that are less than 200 m apart
d <- d %>%
  mutate(dist = distHaversine(cbind(long, lat), cbind(longloc2, latloc2))) %>%
  filter(dist > 0, dist < 200)
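
The pairwise table above lists which locations lie within 200 m of each other, but it does not yet assign a site label to each location. One possible way to get labels, sketched here and not part of the original answer, is single-linkage hierarchical clustering cut at 200 m, which groups points connected by a chain of hops of at most 200 m. The helper names `haversine_m()` and `assign_sites()` are hypothetical, the five points are taken from the coordinate listing purely for illustration, and treating chained points as one site is an assumption about how the regulatory rule should be read.

```r
# Great-circle (haversine) distance in metres, vectorised over its arguments.
# r = 6378137 is the same default radius that geosphere::distHaversine() uses.
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Give the same site id to points linked by a chain of hops <= max_dist metres.
assign_sites <- function(lat, long, max_dist = 200) {
  idx <- seq_along(lat)
  if (length(idx) < 2) return(rep(1L, length(idx)))  # hclust() needs >= 2 points
  dist_m <- outer(idx, idx, function(i, j) {
    haversine_m(lat[i], long[i], lat[j], long[j])
  })
  # Single linkage merges clusters by their closest pair of points, so cutting
  # the tree at max_dist yields the connected groups of the "within max_dist" graph.
  tree <- hclust(as.dist(dist_m), method = "single")
  unname(cutree(tree, h = max_dist))
}

# Illustrative subset of the coordinates listed above
pts <- data.frame(
  lat  = c(25.75038, 25.75065, 25.89080, 25.89080, 25.89177),
  long = c(-80.96646, -80.96644, -81.26980, -81.26981, -81.26959)
)
pts$site <- assign_sites(pts$lat, pts$long)
pts
```

In this subset the first two points end up in one site and the last three in another. Note that chaining can place two points in the same site even when they are more than 200 m apart, as long as intermediate points connect them.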

Answer 3

Score: 1

`spdep::dnearneigh()` can find distance-based neighbours, and `n.comp.nb()` assigns each point the id of the connected cluster it belongs to.

``` r
library(spdep)
library(sf)
library(dplyr)
library(ggplot2)

# convert the data frame to an sf object,
# find distance-based neighbours (upper distance bound is set to 0.2 km),
# identify connected subgraphs / clusters and store cluster sizes in n
dat_sf <- st_as_sf(df1, coords = c("long", "lat"), crs = "WGS84", remove = FALSE) %>%
  mutate(comp_id = dnearneigh(geometry, 0, 0.2) %>% n.comp.nb() %>% getElement("comp.id")) %>%
  add_count(comp_id)
dat_sf
#> Simple feature collection with 159 features and 4 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -81.32419 ymin: 25.68722 xmax: -80.81229 ymax: 26.22222
#> Geodetic CRS:  WGS 84
#> First 10 features:
#>         lat      long comp_id n                   geometry
#> 1  25.75038 -80.96646       1 2 POINT (-80.96646 25.75038)
#> 2  25.78788 -81.09702       2 7 POINT (-81.09702 25.78788)
#> 3  25.75816 -80.99451       3 1 POINT (-80.99451 25.75816)
#> 4  25.85593 -80.89979       4 1 POINT (-80.89979 25.85593)
#> 5  25.93371 -80.81229       5 1 POINT (-80.81229 25.93371)
#> 6  25.95037 -80.83312       6 1 POINT (-80.83312 25.95037)
#> 7  25.86704 -81.09979       7 1 POINT (-81.09979 25.86704)
#> 8  25.84482 -80.93035       8 1 POINT (-80.93035 25.84482)
#> 9  25.83371 -80.88312       9 1 POINT (-80.88312 25.83371)
#> 10 25.87538 -81.22480      10 1  POINT (-81.2248 25.87538)
```

All points, coloured by whether their cluster contains more than one point:

``` r
dat_sf %>%
  ggplot() +
  geom_sf(aes(color = n > 1), size = 2, shape = 4) +
  ggspatial::annotation_scale()
```


A semi-random subset, coloured by cluster id (`comp_id`):

``` r
st_crop(dat_sf, xmin = -81.30, ymin = 25.80, xmax = -81.20, ymax = 25.90) %>%
  ggplot() +
  geom_sf(aes(color = as.factor(comp_id)), size = 2, shape = 4) +
  ggspatial::annotation_scale()
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
```


``` r
# back to a regular data.frame, if needed
st_drop_geometry(dat_sf) %>% head()
#>        lat      long comp_id n
#> 1 25.75038 -80.96646       1 2
#> 2 25.78788 -81.09702       2 7
#> 3 25.75816 -80.99451       3 1
#> 4 25.85593 -80.89979       4 1
#> 5 25.93371 -80.81229       5 1
#> 6 25.95037 -80.83312       6 1
```
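
Once every unique coordinate has a `comp_id`, the samples can be summarised per site instead of per coordinate pair, which is what the question ultimately asks for. The following is only a sketch under assumptions: `sites` imitates the shape of the table above, `raw_data` and its `value` column are hypothetical stand-ins for the real samples, and all numbers are made up for illustration. Base R is used to keep the example free of package dependencies; `dplyr::left_join()` followed by `group_by(comp_id)` and `summarise()` should give the same result.

```r
# Hypothetical lookup table: one row per unique coordinate with its site id
# (same shape as the lat / long / comp_id columns shown above).
sites <- data.frame(
  lat     = c(25.75038, 25.75065, 25.75816),
  long    = c(-80.96646, -80.96644, -80.99451),
  comp_id = c(1L, 1L, 3L)
)

# Hypothetical raw samples; `value` stands in for whatever was measured.
raw_data <- data.frame(
  lat   = c(25.75038, 25.75038, 25.75065, 25.75816),
  long  = c(-80.96646, -80.96646, -80.96644, -80.99451),
  value = c(1, 3, 5, 10)
)

# Attach the site id to every sample by matching on the coordinate pair.
samples <- merge(raw_data, sites, by = c("lat", "long"), all.x = TRUE)

# Summarise per site rather than per coordinate pair.
site_summary <- aggregate(value ~ comp_id, data = samples, FUN = mean)
site_summary
```

Matching on `lat` and `long` relies on both tables carrying exactly the same coordinate values, which holds when the lookup table is derived from the raw data itself.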

Input data:

``` r
df1 <- structure(list(lat = c(25.75038, 25.78788, 25.75816, 25.85593,
25.93371, 25.95037, 25.86704, 25.84482, 25.83371, 25.87538, 25.88729,
25.87676, 26.1682, 25.7476, 25.7903, 25.7739, 25.8878, 25.87664,
26.22222, 25.88764, 25.89092, 25.99452, 26.19736, 25.90036, 26.17218,
26.09577, 26.16925, 25.77806, 25.88778, 25.87639, 25.8519, 25.85192,
25.85222, 25.87222, 25.86385, 25.86361, 25.84341, 25.84306, 25.8903,
25.89028, 25.89056, 25.8865, 25.88653, 25.88694, 25.7619, 25.78544,
26.04472, 26.0443, 26.04431, 26.0928, 26.09278, 26.16, 26.196,
26.19597, 26.19639, 26.1625, 26.1639, 26.1653, 26.0944, 25.7889,
25.8528, 25.8444, 25.8736, 25.9472, 25.91917, 25.8431, 25.78611,
25.7135, 25.71347, 25.71389, 25.7208, 25.7639, 26.1913, 26.19128,
26.19167, 25.68722, 26.205, 25.78861, 25.76027, 25.7605, 25.75065,
25.76126, 25.74659, 25.8179, 25.8636, 25.9573, 25.95725, 25.9575,
26.1564, 26.15644, 26.15694, 26.1556, 25.78371, 25.77861, 25.78389,
25.7782, 25.77822, 25.9684, 25.96839, 25.96889, 25.78944, 25.7851,
25.7891, 25.78908, 25.7876, 25.87777, 25.84714, 25.82152, 25.83346,
25.90154, 26.08724, 25.93417, 25.86557, 25.85072, 25.85074, 26.1679,
25.86968, 26.04585, 26.16682, 25.76117, 25.75742, 26.15562, 25.98263,
25.79, 25.7756, 25.86361, 25.78857, 25.8248, 25.87244, 25.76427,
25.90145, 25.8022, 25.83288, 25.85157, 25.82538, 25.86427, 25.80566,
25.90361, 25.78816, 25.9006, 25.89121, 25.85177, 25.88772, 25.84954,
25.8908, 25.8908, 26.16564, 25.89843, 25.82481, 25.87639, 25.85185,
26.05667, 25.86503, 25.79065, 25.87581, 25.87585, 25.80406, 25.89177,
25.89185), long = c(-80.96646, -81.09702, -80.99451, -80.89979,
-80.81229, -80.83312, -81.09979, -80.93035, -80.88312, -81.2248,
-81.26125, -81.22787, -81.0882, -80.94979, -80.8911, -80.9339,
-81.2617, -81.22823, -81.17222, -81.26188, -81.26972, -81.2627,
-81.26716, -81.26199, -81.26681, -81.26506, -81.08729, -80.84444,
-81.26167, -81.21778, -80.981, -80.98103, -80.98083, -81.01861,
-81.10096, -81.10111, -80.9172, -80.91778, -81.2703, -81.27025,
-81.27, -81.2621, -81.26208, -81.26194, -80.8533, -80.85119,
-81.29972, -81.2999, -81.29992, -81.0539, -81.05392, -81.22639,
-81.2887, -81.28869, -81.28861, -81.2417, -81.1736, -81.0931,
-81.2667, -80.8569, -81.0292, -80.9708, -81.2292, -81.2625, -80.83639,
-80.9177, -81.20056, -81.0219, -81.02192, -81.02167, -80.8722,
-81.075, -81.0868, -81.08675, -81.08667, -80.91972, -81.16833,
-81.09991, -81.04831, -80.99626, -80.96644, -80.90782, -80.9539,
-81.10038, -81.1012, -81.1038, -81.10383, -81.10361, -81.2219,
-81.22192, -81.22167, -81.2665, -81.19146, -80.91194, -80.92528,
-80.9122, -80.91222, -80.9264, -80.92636, -80.92611, -81.1, -81.08313,
-81.1001, -81.10011, -81.09896, -81.2377, -80.93604, -80.8918,
-80.84754, -81.31661, -81.26474, -80.83276, -80.84374, -80.97178,
-80.97178, -81.16497, -81.15835, -81.26362, -81.22861, -80.88047,
-80.98732, -81.29818, -81.26223, -80.8764, -80.9044, -81.10117,
-81.09992, -80.8961, -81.18669, -80.83034, -81.32419, -80.8697,
-80.90423, -80.98093, -80.89562, -81.09979, -80.87312, -81.31417,
-80.85507, -81.3042, -81.27008, -80.98035, -81.26163, -80.9559,
-81.2698, -81.26981, -81.24702, -81.2648, -80.89611, -81.22811,
-80.98065, -81.15583, -80.84377, -80.8563, -81.2188, -81.21883,
-80.85364, -81.26959, -81.26954)), class = "data.frame", row.names = c(NA,
-159L))
```

<sup>Created on 2023-08-05 with reprex v2.0.2</sup>

huangapple
  • Posted on 2023-08-05 01:15:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76837975.html