Manually parsing GeoJSON into a data frame

Question

I would like to work step by step to convert a .geojson file into a tibble (or data frame).

Here is a minimal GeoJSON example, which I stored in a file called test.geojson (note that the geometry field is null, but that does not matter here):

{"type": "FeatureCollection",
  "features": [
    { "type": "Feature", "properties": { "VAR_1": 31,"VAR_2": "abc","VAR_3": 255 }, "geometry" : null },
    { "type": "Feature", "properties": { "VAR_1": 23,"VAR_2": "def","VAR_3": 876 }, "geometry" : null }
  ]}

Desired result (here assuming that geometry is filled with two coordinates instead of being null):

# A tibble: 2 x 4
  VAR_1 VAR_2 VAR_3 geometry
  <dbl> <chr> <dbl> <list>
1    31 abc     255 <dbl [2]>
2    23 def     876 <dbl [2]>

I'd particularly like a {tidyverse}-based solution. So far I have tried iterating over each feature to build a data frame, but I can't find a clean way to add the geometry field:

# Read the GeoJSON file
js <- jsonlite::read_json("test.geojson")

# Iterate over each feature ...
map_dfr(1:length(js$features), .f = function(i){

  df <- js$features[[i]]$properties # this works, but only imports VAR_1, VAR_2, VAR_3

  df |> mutate(geometry = js$features[[i]]$geometry) # this does not work

})

Note: one could use {geojsonsf} to import this directly as an sf object with sf <- geojsonsf::geojson_sf("test.geojson"), but I want to do it step by step, end up with a tibble, and understand what I'm doing.

Thanks a lot for your help!

Answer 1

Score: 2

Assuming the data looks something like this:

x <- '{ "type": "FeatureCollection",
  "features": [
    { "type": "Feature", "properties": { "VAR_1": 31,"VAR_2": "abc","VAR_3": 255 }, "geometry" : [-74.0060, 40.7128]},
    { "type": "Feature", "properties": { "VAR_1": 23,"VAR_2": "def","VAR_3": 876 }, "geometry" : [-74.0060, 40.7128]}
    ]}'

You can do this:

library(tidyverse)

# fromJSON() simplifies the "features" array into a data frame
jsonlite::fromJSON(x)$features |>
  as_tibble() |>
  # keep the nested properties and the geometry column
  select(properties, geometry) |>
  # spread the nested properties into one column per variable
  unnest(properties)

# Output:
# A tibble: 2 × 4
  VAR_1 VAR_2 VAR_3 geometry 
  <int> <chr> <int> <list>   
1    31 abc     255 <dbl [2]>
2    23 def     876 <dbl [2]>
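
For the feature-by-feature route the question asks about, a minimal sketch (not part of the original answer) might look like the following. It assumes the same placeholder coordinate pair as above, purrr 1.0.0 or later for list_rbind(), and R 4.1 or later for the native pipe; the name features_tbl is only illustrative. The attempt in the question appears to fail because properties is a plain list rather than a data frame, and because the geometry needs to be wrapped in list() to fit into a single cell.

```r
library(purrr)
library(tibble)
library(dplyr)

# Same two features as above; the coordinates are placeholder values.
x <- '{ "type": "FeatureCollection",
  "features": [
    { "type": "Feature", "properties": { "VAR_1": 31, "VAR_2": "abc", "VAR_3": 255 }, "geometry": [-74.0060, 40.7128] },
    { "type": "Feature", "properties": { "VAR_1": 23, "VAR_2": "def", "VAR_3": 876 }, "geometry": [-74.0060, 40.7128] }
  ]}'

# simplifyVector = FALSE keeps everything as nested lists,
# which is also what jsonlite::read_json() returns by default.
js <- jsonlite::fromJSON(x, simplifyVector = FALSE)

features_tbl <- map(js$features, function(feature) {
  # Step 1: the properties list becomes a one-row tibble.
  row <- as_tibble(feature$properties)
  # Step 2: flatten the coordinates to a numeric vector and wrap them in
  # list() so they occupy a single cell of a list-column.
  mutate(row, geometry = list(unlist(feature$geometry)))
}) |>
  list_rbind()  # Step 3: stack the one-row tibbles into one tibble

features_tbl
```

With the null geometries from test.geojson, unlist() returns NULL, so each geometry cell should simply hold NULL instead of a coordinate vector.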

huangapple
  • Posted on August 9, 2023, 17:02:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76866144.html