How to convert a list of lists of key-value pairs to a data frame?

Question

Note that the input shape I am dealing with is not the same as in these similar questions:
https://stackoverflow.com/questions/29674661/r-list-of-lists-to-data-frame
https://stackoverflow.com/questions/59928743/converting-a-list-of-lists-to-a-dataframe-in-r-the-tidyverse-way

I am working with a web API that returns results in this shape:

    listOfListsOfKeyVals <- lapply(1:5, function(i) {
      list(
        col1 = i,
        col2 = runif(1)
      )
    })

You might think that the following would work:

    do.call(rbind, listOfListsOfKeyVals)

But on closer inspection, the result is not a data frame of atomic columns. It is a matrix of lists, and wrapping it in tibble() gives a single matrix column whose cells are length-one lists:

    library(tibble)

    do.call(rbind, listOfListsOfKeyVals) |> tibble()
    # A tibble: 5 × 1
      `do.call(rbind, listOfListsOfKeyVals)`[,"col1"] [,"col2"]
      <list>                                          <list>
    1 <int [1]>                                       <dbl [1]>
    2 <int [1]>                                       <dbl [1]>
    3 <int [1]>                                       <dbl [1]>
    4 <int [1]>                                       <dbl [1]>
    5 <int [1]>                                       <dbl [1]>
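
One way to confirm what rbind produced here is to inspect the object before wrapping it in tibble(). The following is a minimal sketch using only base R; the name `m` is introduced just for illustration and is not part of the original question.

```r
# Rebuild the example input from the question.
listOfListsOfKeyVals <- lapply(1:5, function(i) {
  list(col1 = i, col2 = runif(1))
})

# rbind() on plain lists yields a matrix whose cells are list elements.
m <- do.call(rbind, listOfListsOfKeyVals)

is.matrix(m)         # TRUE
typeof(m)            # "list"
dim(m)               # 5 2
is.data.frame(m)     # FALSE
class(m[1, "col1"])  # "list": each cell is still wrapped in a list
```

Because every cell is a list rather than a number, column-wise operations such as sum(m[, "col1"]) fail until the values are unwrapped.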

I have come up with the following solution:

    library(foreach)
    library(tibble)

    foreach(x = listOfListsOfKeyVals, .combine = rbind) %do% {
      as.data.frame(x)
    } |> tibble()

But it is painfully slow for large data sets. Is there a better way?

Answer 1

Score: 4

I think you are looking for dplyr::bind_rows.

    library(dplyr)

    set.seed(12)
    listOfListsOfKeyVals <- lapply(1:5, function(i) {
      list(
        col1 = i,
        col2 = runif(1)
      )
    })
    bind_rows(listOfListsOfKeyVals)

Output:

    # A tibble: 5 × 2
       col1   col2
      <int>  <dbl>
    1     1 0.0694
    2     2 0.818
    3     3 0.943
    4     4 0.269
    5     5 0.169

Answer 2

Score: 1

Another option is rrapply:

    rrapply::rrapply(listOfListsOfKeyVals, how = "bind")
    #   col1       col2
    # 1    1 0.21794909
    # 2    2 0.81600287
    # 3    3 0.04631368
    # 4    4 0.10518273
    # 5    5 0.46489659

And a base R option with matrix:

    l <- listOfListsOfKeyVals
    data.frame(matrix(unlist(l), nrow = length(l), byrow = TRUE)) |>
      setNames(names(l[[1]]))
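
One thing to keep in mind with the matrix approach: unlist() passes every value through a single atomic vector, so all columns end up with one common type (here col1 becomes double, and a character field would turn everything into character). If per-column types matter, one possible base R alternative is to build each column separately. This is only a sketch, assuming every record has the same keys and each value is a length-one atomic; `records`, `keys`, `cols`, and `df` are names introduced here for illustration, and the `name` field is hypothetical.

```r
# Example input with mixed types; the `name` field is hypothetical and
# added only to show the type-coercion issue.
records <- lapply(1:3, function(i) {
  list(col1 = i, col2 = i / 2, name = letters[i])
})

keys <- names(records[[1]])

# Pull one column at a time so each column keeps its own type.
cols <- lapply(keys, function(k) {
  unlist(lapply(records, `[[`, k))
})
df <- as.data.frame(setNames(cols, keys), stringsAsFactors = FALSE)

str(df)
```

If some records can lack a key, `[[` returns NULL for them and unlist() silently drops it, so the columns would end up with different lengths and this sketch would not work as written; that case needs explicit handling.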

huangapple
  • Posted on June 8, 2023 at 13:47:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76428910.html