Reduce a list of multiple data.frames into one data.frame with unequal rows

Question

I have multiple data.frames with two columns each, and all of them share the first column. I have them in a list and want to combine them into one data.frame using dplyr::bind_cols.

However, this is not possible because they have unequal numbers of rows. I cannot change the structure of the dataset. How can I join the data.frames using dplyr?

I tried bind_rows and full_join, but neither worked.

Example:

    library(dplyr)
    library(purrr)

    a <- data.frame(a = rep(1:9), col_a1 = "a")
    b <- data.frame(a = rep(1:9), col_b = "b")
    c <- data.frame(a = rep(1:8), col_c = "a")

    data_list <- list(a, b, c)

    # First attempt: bind_cols needs the same number of rows in every data.frame
    data_all <- reduce(data_list, bind_cols)
    # Error in `fn()`:
    # ! Can't recycle `..1` (size 9) to match `..2` (size 8).
    # Run `rlang::last_trace()` to see where the error occurred.

    # Second attempt: this also fails, because full_join() is called here
    # (without its x and y arguments) instead of being passed as a function
    data_all <- reduce(data_list, full_join(by = "a"))

    # Wanted output:
    # data_all
    #   a col_a1 col_b col_c
    # 1 1      a     b     a
    # 2 2      a     b     a
    # 3 3      a     b     a
    # 4 4      a     b     a
    # 5 5      a     b     a
    # 6 6      a     b     a
    # 7 7      a     b     a
    # 8 8      a     b     a
    # 9 9      a     b  <NA>

I would be grateful for any advice. Note that in reality I start with hundreds of data.frames in the list, so I cannot type them in manually.

I also tried reduce(bind_rows), with the idea of using pivot_wider afterwards, but then the column names are lost.

Thank you!

Answer 1

Score: 2

    library(tidyverse)

    # With no `by` argument, left_join() joins on the shared column `a`
    reduce(data_list, left_join)
    # A tibble: 9 × 4
    #       a col_a1 col_b col_c
    #   <int> <chr>  <chr> <chr>
    # 1     1 a      b     a
    # 2     2 a      b     a
    # 3     3 a      b     a
    # 4     4 a      b     a
    # 5     5 a      b     a
    # 6     6 a      b     a
    # 7     7 a      b     a
    # 8     8 a      b     a
    # 9     9 a      b     NA
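
Note that left_join only keeps keys that occur in the first data.frame of the list. This is enough for the example, but it may drop rows if a later data.frame contains values of `a` that the first one lacks. As a minimal sketch, assuming every data.frame in the list has the key column `a` as in the question, the full_join attempt from the question might be made to work by wrapping it in a function, so that reduce receives a function rather than the result of a call:

```r
library(dplyr)
library(purrr)

# Example data from the question
a <- data.frame(a = 1:9, col_a1 = "a")
b <- data.frame(a = 1:9, col_b = "b")
c <- data.frame(a = 1:8, col_c = "a")
data_list <- list(a, b, c)

# reduce() calls the function on the running result (x) and the next
# data.frame (y); full_join() keeps every value of `a` found in either one
data_all <- reduce(data_list, function(x, y) full_join(x, y, by = "a"))

data_all
```

Because nothing is typed per data.frame, the same call should apply unchanged to a list of hundreds of data.frames, provided they all share the key column.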

huangapple
  • Posted on June 29, 2023 at 19:58:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76580838.html