How to convert list of lists of key value pairs to dataframe?
Question
Note that the input shape I am dealing with differs from the one in these similar questions:
https://stackoverflow.com/questions/29674661/r-list-of-lists-to-data-frame; https://stackoverflow.com/questions/59928743/converting-a-list-of-lists-to-a-dataframe-in-r-the-tidyverse-way
I am working with a web API that returns results in this shape:
listOfListsOfKeyVals <- lapply(1:5, function(i) {
  list(
    col1 = i,
    col2 = runif(1)
  )
})
You might think that the following would work:
do.call(rbind, listOfListsOfKeyVals)
But on closer inspection, the result is actually a list matrix in which every cell is a list, as becomes visible when it is wrapped in a tibble:
do.call(rbind, listOfListsOfKeyVals) |> tibble()
# A tibble: 5 × 1
`do.call(rbind, listOfListsOfKeyVals)`[,"col1"] [,"col2"]
<list> <list>
1 <int [1]> <dbl [1]>
2 <int [1]> <dbl [1]>
3 <int [1]> <dbl [1]>
4 <int [1]> <dbl [1]>
5 <int [1]> <dbl [1]>
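To see why the output above shows list columns, the rbind result can be inspected directly. This is a minimal sketch in base R; the names l and m are introduced here only for illustration.

```r
# Same shape as the input in the question (sample data for illustration).
l <- lapply(1:5, function(i) list(col1 = i, col2 = runif(1)))

# rbind on plain lists produces a matrix whose storage mode is "list".
m <- do.call(rbind, l)

is.matrix(m)     # TRUE: it is a matrix, not a data frame
typeof(m)        # "list": every cell holds a list element
dim(m)           # 5 rows, 2 columns

# Single-bracket indexing returns a list; double brackets extract the value.
m[1, "col1"]
m[[1, "col1"]]
```

Because each cell is a list element, wrapping the matrix in a tibble yields list columns instead of integer and double columns.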
I have come up with the following solution:
library(foreach)
foreach(x = listOfListsOfKeyVals, .combine = rbind) %do% {
  as.data.frame(x)
} |> tibble()
But it is painfully slow for large data sets. Is there a better way?
Answer 1
Score: 4
I think you are looking for dplyr::bind_rows.
library(dplyr)
set.seed(12)
listOfListsOfKeyVals <- lapply(1:5, function(i) {
  list(
    col1 = i,
    col2 = runif(1)
  )
})
bind_rows(listOfListsOfKeyVals)
Output:
# A tibble: 5 × 2
col1 col2
<int> <dbl>
1 1 0.0694
2 2 0.818
3 3 0.943
4 4 0.269
5 5 0.169
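Since the data comes from a web API, some records may lack a key. The sketch below uses hypothetical records (not from the original question) to show how bind_rows matches columns by name and fills missing ones with NA; it assumes a reasonably recent dplyr.

```r
library(dplyr)

# Hypothetical records: the second one has no col2.
records <- list(
  list(col1 = 1L, col2 = 0.25),
  list(col1 = 2L),
  list(col1 = 3L, col2 = 0.75)
)

# Columns are matched by name; the missing col2 becomes NA.
result <- bind_rows(records)
result
```

This only covers keys that are absent altogether; how other irregularities in the API response behave is not shown here.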
Answer 2
Score: 1
Another option is rrapply:
rrapply::rrapply(listOfListsOfKeyVals, how = "bind")
# col1 col2
# 1 1 0.21794909
# 2 2 0.81600287
# 3 3 0.04631368
# 4 4 0.10518273
# 5 5 0.46489659
And a base R option with matrix (note that unlist coerces all values to a common type, so this suits all-numeric data):
l <- listOfListsOfKeyVals
data.frame(matrix(unlist(l), nrow = length(l), byrow = TRUE)) |>
  setNames(names(l[[1]]))
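The matrix approach goes through unlist, so an integer column such as col1 comes back as double. If column types matter, one possible base R variant is to build each column separately. This is a sketch, not part of the original answer, and it assumes every record has the same keys and that each value is a length-one atomic; keys, cols and df are illustrative names.

```r
# Same shape as the input in the question (sample data for illustration).
l <- lapply(1:5, function(i) list(col1 = i, col2 = runif(1)))

# Take the column names from the first record
# (assumption: all records share these keys).
keys <- names(l[[1]])

# Extract one column at a time; the first record's value serves as
# the type template, so vapply fails loudly on unexpected shapes.
cols <- lapply(keys, function(k) {
  vapply(l, function(rec) rec[[k]], FUN.VALUE = l[[1]][[k]])
})

df <- as.data.frame(setNames(cols, keys))
str(df)
```

Here col1 stays integer and col2 stays double, at the cost of one pass over the list per key.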