英文:
reduce list of multiple data.frames into one data.frame with unequal rows
问题
我有多个具有两列的数据框,它们都共享第一列。我将它们放在一个列表中,并希望使用dplyr::bind_cols将它们组合成一个数据框。
但是,这是不可能的,因为它们的行数不相等。我没有改变数据集结构的机会。我如何能够使用dplyr来连接这些数据框?
我尝试了bind_rows和full_join,但都不起作用。
示例:
a <- data.frame(a = rep(1:9), col_a1 = "a")
b <- data.frame(a = rep(1:9), col_b = "b")
c <- data.frame(a = rep(1:8), col_c = "a")
data_list <- list(a, b, c)
data_all <- reduce(data_list, bind_cols)
# Error in `fn()`:
# ! Can't recycle `..1` (size 9) to match `..2` (size 8).
# Run `rlang::last_trace()` to see where the error occurred.
data_all <- reduce(data_list, full_join(by = "a"))
#Wanted output:
# data_all
# a col_a1 col_b col_c
# 1 1 a b a
# 2 2 a b a
# 3 3 a b a
# 4 4 a b a
# 5 5 a b a
# 6 6 a b a
# 7 7 a b a
# 8 8 a b a
# 9 9 a b <NA>
我很高兴接受任何建议。请注意,在实际情况下,我在开始时有数百个数据框在列表中,所以我不能手动输入。
我还尝试了reduce(bind_rows),然后想使用pivot_wider,但是列名消失了。
谢谢!
英文:
I have multiple data.frames with two column, all of them share the first column. I get them in a list and want to combine them into one data.frame using dplyr::bind_cols
However this is not possible, as they have unequal rows. I don't have the chance to change the structure of the dataset. How am I able to join the data.frames using dplyr?
I tried bind_rows and full_join, but both don't work.
Example:
a <- data.frame(a = rep(1:9), col_a1 = "a")
b <- data.frame(a = rep(1:9), col_b = "b")
c <- data.frame(a = rep(1:8), col_c = "a")
data_list <- list(a, b, c)
data_all <- reduce(data_list, bind_cols)
# Error in `fn()`:
# ! Can't recycle `..1` (size 9) to match `..2` (size 8).
# Run `rlang::last_trace()` to see where the error occurred.
data_all <- reduce(data_list, full_join(by = "a"))
#Wanted output:
data_all
a col_a1 col_b col_c
1 1 a b a
2 2 a b a
3 3 a b a
4 4 a b a
5 5 a b a
6 6 a b a
7 7 a b a
8 8 a b a
9 9 a b <NA>
I am happy for every advice. Note that in reality, I have hundreds of data.frames in the list at the beginning, so I cannot type in manually.
I also tried reduce(bind_rows), with the idea of pivot_wider afterwards, but then the column names are gone.
Thank you!
答案1
得分: 2
library(tidyverse)
reduce(data_list, left_join)
# 一个数据框: 9 × 4
a col_a1 col_b col_c
<int> <chr> <chr> <chr>
1 1 a b a
2 2 a b a
3 3 a b a
4 4 a b a
5 5 a b a
6 6 a b a
7 7 a b a
8 8 a b a
9 9 a b <NA>
英文:
library(tidyverse)
reduce(data_list, left_join)
# A tibble: 9 × 4
a col_a1 col_b col_c
<int> <chr> <chr> <chr>
1 1 a b a
2 2 a b a
3 3 a b a
4 4 a b a
5 5 a b a
6 6 a b a
7 7 a b a
8 8 a b a
9 9 a b NA
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论