Reduce a list of multiple data.frames into one data.frame with unequal rows

Question

I have multiple data.frames with two columns each, all of which share the first column. I have them in a list and want to combine them into one data.frame using dplyr::bind_cols.

However, this is not possible because they have unequal numbers of rows. I cannot change the structure of the dataset. How can I join the data.frames using dplyr?

I tried bind_rows and full_join, but neither works.

Example:

library(tidyverse)  # provides purrr::reduce and the dplyr bind/join functions

a <- data.frame(a = 1:9, col_a1 = "a")
b <- data.frame(a = 1:9, col_b = "b")
c <- data.frame(a = 1:8, col_c = "a")

data_list <- list(a, b, c)

data_all <- reduce(data_list, bind_cols)
# Error in `fn()`:
#   ! Can't recycle `..1` (size 9) to match `..2` (size 8).
# Run `rlang::last_trace()` to see where the error occurred.

data_all <- reduce(data_list, full_join(by = "a"))

# Wanted output:
# data_all
#   a col_a1 col_b col_c
# 1 1      a     b     a
# 2 2      a     b     a
# 3 3      a     b     a
# 4 4      a     b     a
# 5 5      a     b     a
# 6 6      a     b     a
# 7 7      a     b     a
# 8 8      a     b     a
# 9 9      a     b  <NA>

I am grateful for any advice. Note that in reality I start with hundreds of data.frames in the list, so I cannot type them in manually.

I also tried reduce(bind_rows), intending to use pivot_wider afterwards, but then the column names are lost.

Thank you!

Answer 1

Score: 2

library(tidyverse)

# Left-join the data frames in the list one after another.
# With no `by` argument, dplyr joins on the only shared column, `a`.
reduce(data_list, left_join)

# A tibble: 9 × 4
      a col_a1 col_b col_c
  <int> <chr>  <chr> <chr>
1     1 a      b     a    
2     2 a      b     a    
3     3 a      b     a    
4     4 a      b     a    
5     5 a      b     a    
6     6 a      b     a    
7     7 a      b     a    
8     8 a      b     a    
9     9 a      b     NA  
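
A note on the full_join attempt from the question: full_join(by = "a") calls the function immediately, without any data frames, instead of handing the function itself to reduce. The sketch below is one possible way to write it, wrapping the join in a small anonymous function so the key is explicit. It reuses the example data from the question; the names short_first_left and short_first_full are made up for illustration. It also shows why a full join may be the safer choice when you do not know which data frame in the list is the longest: left_join only keeps the keys of the first element.

```r
library(dplyr)
library(purrr)

# Example data from the question
a <- data.frame(a = 1:9, col_a1 = "a")
b <- data.frame(a = 1:9, col_b = "b")
c <- data.frame(a = 1:8, col_c = "a")
data_list <- list(a, b, c)

# Pass a function to reduce(); the join key is stated explicitly
data_all <- reduce(data_list, function(x, y) full_join(x, y, by = "a"))
nrow(data_all)   # 9 rows; col_c is NA where a == 9

# If the shortest data frame happens to come first, left_join drops a == 9 ...
short_first_left <- reduce(list(c, a, b), function(x, y) left_join(x, y, by = "a"))
nrow(short_first_left)   # 8 rows

# ... while full_join keeps every key that appears in any data frame
short_first_full <- reduce(list(c, a, b), function(x, y) full_join(x, y, by = "a"))
nrow(short_first_full)   # 9 rows
```

Because the join is applied pairwise across the list, the same call should work unchanged for a list of hundreds of data frames, provided they all contain the key column a.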

huangapple
  • Posted on June 29, 2023 at 19:58:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76580838.html