Unnesting/rectangling/flattening a nested list using `tidyr::unnest_longer()`
Question
I've been trying to get my head around the unnesting functions in tidyr and tibblify. I believe you should be able to use unnest_longer() to replicate the more manual methods below of turning this kind of nested list into a tibble, but I've been struggling with the docs a little. A correct example of how to do this would help me immensely:
# Example nested list
nl <- list(time = list("2023-02-06", "2023-02-07", "2023-02-08",
                       "2023-02-09", "2023-02-10", "2023-02-11",
                       "2023-02-12"), 
           precipitation_sum = list(0.9, 0, 0, 0.3, 0, 0, 0))
# one way to do it (extract colnames and construct)
tibble(!!! setNames(map(nl, unlist),names(nl)))
# another way (collect & reduce each sublist)
as_tibble(lapply(nl, function(x) Reduce(c, x)))
# how to use tidyr and unnest_longer? (below is incorrect)
unnest_longer(tibble(nl), col = everything())
Answer 1
Score: 4
We could use `as_tibble()` followed by `unnest()`:
library(tibble)
library(tidyr)
as_tibble(nl) %>% 
    unnest(cols = where(is.list))
Output:
# A tibble: 7 × 2
  time       precipitation_sum
  <chr>                  <dbl>
1 2023-02-06               0.9
2 2023-02-07               0  
3 2023-02-08               0  
4 2023-02-09               0.3
5 2023-02-10               0  
6 2023-02-11               0  
7 2023-02-12               0  
Or, more compactly, with purrr:
library(purrr)
map_dfc(nl, unlist)
# A tibble: 7 × 2
  time       precipitation_sum
  <chr>                  <dbl>
1 2023-02-06               0.9
2 2023-02-07               0  
3 2023-02-08               0  
4 2023-02-09               0.3
5 2023-02-10               0  
6 2023-02-11               0  
7 2023-02-12               0  
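Neither snippet above calls `unnest_longer()` itself, which is what the question asked about. The attempt in the question likely fails because `tibble(nl)` stores the whole list in a single list-column named `nl`, whereas `as_tibble(nl)` creates one list-column per top-level element. A minimal sketch, assuming a tidyr version in which `unnest_longer()` accepts several columns at once (1.2.0 or later, as far as I recall):

```r
library(tibble)
library(tidyr)

# Same example data as in the question
nl <- list(time = list("2023-02-06", "2023-02-07", "2023-02-08",
                       "2023-02-09", "2023-02-10", "2023-02-11",
                       "2023-02-12"),
           precipitation_sum = list(0.9, 0, 0, 0.3, 0, 0, 0))

# as_tibble() gives two list-columns of 7 rows each;
# unnest_longer() then simplifies every list-column to a plain vector
result <- as_tibble(nl) %>%
  unnest_longer(col = everything())

result
```

Since every list element here has length one, `unnest()` and `unnest_longer()` should give the same 7-row tibble; the difference only matters when elements hold several values or nested records.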
Answer 2
Score: 1
Another interesting option is `dmap()` from purrrlyr (along with the history behind it):
'purrrlyr contains some functions that lie at the intersection of purrr and dplyr. They have been removed from purrr in order to make the package lighter and because they have been replaced by other solutions in the tidyverse.' <https://github.com/hadley/purrrlyr/>
#install.packages("purrrlyr")
library(purrrlyr)
nl %>% 
  dmap(unlist)
  time       precipitation_sum
  <chr>                  <dbl>
1 2023-02-06               0.9
2 2023-02-07               0  
3 2023-02-08               0  
4 2023-02-09               0.3
5 2023-02-10               0  
6 2023-02-11               0  
7 2023-02-12               0 
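Since the quote says these functions have been replaced by other tidyverse solutions, here is one possible replacement that needs neither purrrlyr nor `map_dfc()` (which, as far as I know, was superseded in purrr 1.0.0). It is only a sketch and assumes purrr 1.0.0 or later, where `list_simplify()` is available:

```r
library(purrr)
library(tibble)

# Same example data as in the question
nl <- list(time = list("2023-02-06", "2023-02-07", "2023-02-08",
                       "2023-02-09", "2023-02-10", "2023-02-11",
                       "2023-02-12"),
           precipitation_sum = list(0.9, 0, 0, 0.3, 0, 0, 0))

# list_simplify() turns each list of length-one values into an atomic
# vector (and errors if that is not possible); as_tibble() then binds
# the resulting vectors as columns
result <- nl %>%
  map(list_simplify) %>%
  as_tibble()

result
```

Unlike `unlist()`, `list_simplify()` is strict by default, so a sublist with mismatched types or lengths raises an error instead of being silently coerced.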