Is there an alternative to tidyr::unnest_wider? It fails when nested list element is not a vector
Question
I have old code that used to work using tidyr::unnest_wider() to unnest a nested named list into separate columns; however, it no longer works. Instead I get an error saying x$name_of_list must be a vector, not a <non-vector> object, where my non-vector objects include <mcpfit> and <patchwork/gg/ggplot> objects. It seems like they tried to address this issue here, but it still doesn't work using tidyr v. 1.3.0.
I couldn't easily create a reproducible example from my own use case, so I'll use the example from the GitHub issue linked above in the hope that a fix for it also works for my use case.
library(tidyverse)

m <-
  tibble::as_tibble(mtcars[1, ]) %>%
  mutate(ls_col = list(
    list(
      a = c(1:10),
      b = lm(cyl ~ gear)
    )
  ))

m2 <-
  m %>%
  unnest_wider(ls_col)
I am looking for EITHER an alternative data.table or base R solution OR a tidyverse workaround (e.g., remove the non-vector objects from the nested list and then use tidyr::unnest_wider()). tidyr::unnest() seems to work, but then I don't know how to pivot the column containing the lists into their own columns (R crashes every time I try something like this).
Answer 1
Score: 4
You can specify strict = TRUE.
library(tidyverse)

m <- tibble::as_tibble(mtcars[1, ]) %>%
  mutate(ls_col = list(
    list(
      a = c(1:10),
      b = lm(cyl ~ gear)
    )
  ))

m %>%
  unnest_wider(ls_col, strict = TRUE)
#> # A tibble: 1 x 13
#> mpg cyl disp hp drat wt qsec vs am gear carb a b
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <lis>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 <int> <lm>
Why?
The strict argument defaults to FALSE, and in this state, unnest_wider will convert zero-length typed objects like numeric() or character() in your list to NA, which can be helpful in converting lists with zero-length items into a typed column, for example:
m <- tibble(ls_col = list(list(a = character()), list(a = 1)))
m %>% unnest_wider(ls_col, strict = FALSE)
#> # A tibble: 2 x 1
#> a
#> <dbl>
#> 1 NA
#> 2 1
Whereas with strict = TRUE, type is strictly preserved, which means in this case we end up with a list column:
m %>% unnest_wider(ls_col, strict = TRUE)
#> # A tibble: 2 x 1
#> a
#> <list>
#> 1 <chr [0]>
#> 2 <dbl [1]>
The default strict = FALSE can come in handy in some circumstances, since it can help rearrange complex lists with some empty items (as in parsing certain JSON structures). To achieve this, unnest_wider uses the function vctrs::list_sizes (via the non-exported function elt_to_wide), which will throw an error if the list contains non-vector items:
vctrs:::list_sizes(list(a = 1, b = lm(cyl~gear, mtcars)))
#> Error in `vctrs:::list_sizes()`:
#> ! `x$b` must be a vector, not a <lm> object.
#> Run `rlang::last_trace()` to see where the error occurred.
I wouldn't call this behaviour a bug as such, but it's a bit unintuitive and feels like we are using strict = TRUE for a reason other than its design rationale. However, it does work here.
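Since the question also asks for a base R alternative, here is one possible sketch (it is not part of the reprex above). It pulls each named element out with lapply(), which always yields list columns, so the vctrs vector check is never run. The assumptions are mine: a plain data.frame instead of a tibble, an explicit data = mtcars in the lm() call, and a three-column subset of mtcars chosen only to keep the example small.

```r
# Base R sketch: spread a named-list column into one list column per name.
# Assumption: a plain data.frame stands in for the tibble used above.
df <- mtcars[1, c("mpg", "cyl", "gear")]
df$ls_col <- list(list(a = 1:10, b = lm(cyl ~ gear, data = mtcars)))

# Collect every name that appears in any row's nested list.
nms <- unique(unlist(lapply(df$ls_col, names)))

# Extract each named element row by row. The result is always a list column,
# so non-vector objects such as <lm> are stored untouched.
for (nm in nms) {
  df[[nm]] <- lapply(df$ls_col, function(el) el[[nm]])
}

# Drop the original nested column.
df$ls_col <- NULL

names(df)
```

Every new column stays a list column, much like the strict = TRUE result, so atomic elements such as a would need a separate simplification step if a typed column is wanted.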
<sup>Created on 2023-08-04 with reprex v2.0.2</sup>