How to merge multiple dataframes with different rows using dplyr

Question

I have the following dataframes in R, which all have a different number of rows and different dates in them as well.

    data1 <- structure(list(Date = structure(c(18628, 18629, 18630, 18631), class = "Date"),
        Value1 = c(1, 2, 3, 4)), row.names = c(NA, -4L), class = c("tbl_df",
        "tbl", "data.frame"))
    data2 <- structure(list(Date = structure(c(18628, 18632, 18633), class = "Date"),
        Value2 = c(1, 2, 3)), row.names = c(NA, -3L), class = c("tbl_df",
        "tbl", "data.frame"))
    data3 <- structure(list(Date = structure(c(18626, 18629, 18633, 18634,
        18635), class = "Date"), Value3 = c(1, 2, 3, 4, 5)), row.names = c(NA,
        -5L), class = c("tbl_df", "tbl", "data.frame"))

I would like to combine all three of them into one dataframe. Usually I would use `full_join` for this, but it only joins two dataframes at a time, like this:

    library(tidyverse)
    data <- full_join(data1, data2, by = 'Date') %>%
      arrange(Date)

Is there a simple way in which I can merge more than 2 dataframes into one dataframe?

Answer 1

Score: 2

Collect your data frames in a list and use `reduce`:

```r
library(tidyverse)  # provides %>%, reduce() and full_join()

# ls(pattern = "^data") picks up every object whose name starts with "data",
# so make sure only data1, data2 and data3 match in your environment.
mget(ls(pattern = "^data")) %>%
  reduce(full_join, by = "Date")
#> # A tibble: 9 × 4
#>   Date       Value1 Value2 Value3
#>   <date>      <dbl>  <dbl>  <dbl>
#> 1 2021-01-01      1      1     NA
#> 2 2021-01-02      2     NA      2
#> 3 2021-01-03      3     NA     NA
#> 4 2021-01-04      4     NA     NA
#> 5 2021-01-05     NA      2     NA
#> 6 2021-01-06     NA      3      3
#> 7 2020-12-30     NA     NA      1
#> 8 2021-01-07     NA     NA      4
#> 9 2021-01-08     NA     NA      5
```
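The same `reduce` idea can be sketched with an explicit list instead of `mget(ls(...))`, which avoids picking up unrelated objects by name. This is a minimal sketch: the three tibbles below are stand-ins rebuilt from the question's dates and values, and `combined` is a name chosen here for illustration. The final `arrange(Date)` mirrors the sorting step in the question.

```r
library(dplyr)
library(purrr)

# Stand-ins for the question's data1, data2, data3 (same dates and values).
data1 <- tibble(Date = as.Date("2021-01-01") + c(0, 1, 2, 3), Value1 = c(1, 2, 3, 4))
data2 <- tibble(Date = as.Date("2021-01-01") + c(0, 4, 5), Value2 = c(1, 2, 3))
data3 <- tibble(Date = as.Date("2021-01-01") + c(-2, 1, 5, 6, 7), Value3 = c(1, 2, 3, 4, 5))

# reduce() applies full_join pairwise: (data1 join data2) join data3.
# Extra arguments such as by = "Date" are passed on to full_join.
combined <- list(data1, data2, data3) %>%
  reduce(full_join, by = "Date") %>%
  arrange(Date)

combined
```

Because the list is built by hand, adding a fourth data frame only means appending it to `list(...)`.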

Answer 2

Score: 1

Using base R:

    Reduce(function(...) merge(..., all = TRUE), mget(ls(pattern = "^data\\d+$")))

Output:

            Date Value1 Value2 Value3
    1 2020-12-30     NA     NA      1
    2 2021-01-01      1      1     NA
    3 2021-01-02      2     NA      2
    4 2021-01-03      3     NA     NA
    5 2021-01-04      4     NA     NA
    6 2021-01-05     NA      2     NA
    7 2021-01-06     NA      3      3
    8 2021-01-07     NA     NA      4
    9 2021-01-08     NA     NA      5

Or with `plyr::join_all`:

    plyr::join_all(mget(ls(pattern = "^data\\d+$")), type = "full")

            Date Value1 Value2 Value3
    1 2021-01-01      1      1     NA
    2 2021-01-02      2     NA      2
    3 2021-01-03      3     NA     NA
    4 2021-01-04      4     NA     NA
    5 2021-01-05     NA      2     NA
    6 2021-01-06     NA      3      3
    7 2020-12-30     NA     NA      1
    8 2021-01-07     NA     NA      4
    9 2021-01-08     NA     NA      5
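For reference, the base R approach can also be written with an explicit list and an explicit key, so it does not depend on which objects happen to match a name pattern. This is a minimal sketch: the plain data frames are stand-ins for the question's inputs, and `frames` and `merged` are illustrative names.

```r
# Stand-ins for the question's data1, data2, data3 as plain data frames.
data1 <- data.frame(Date = as.Date("2021-01-01") + c(0, 1, 2, 3), Value1 = c(1, 2, 3, 4))
data2 <- data.frame(Date = as.Date("2021-01-01") + c(0, 4, 5), Value2 = c(1, 2, 3))
data3 <- data.frame(Date = as.Date("2021-01-01") + c(-2, 1, 5, 6, 7), Value3 = c(1, 2, 3, 4, 5))

frames <- list(data1, data2, data3)

# all = TRUE makes each merge a full outer join; merge() sorts on the
# key column by default, so the result comes back ordered by Date.
merged <- Reduce(function(x, y) merge(x, y, by = "Date", all = TRUE), frames)

merged
```

Naming the key with `by = "Date"` guards against accidental joins on any other column the inputs might share.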

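Answer 3

Score: 0

You can also stack the data frames row-wise. Base R's `rbind.data.frame` requires matching column names, so it stops with an error on these inputs (`Value1`, `Value2` and `Value3` differ); `dplyr::bind_rows` fills the missing columns with `NA` instead. Note that this stacks rows and does not merge rows that share a `Date`:

    group_data <- dplyr::bind_rows(data1, data2, data3)
    group_data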
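Row-binding and joining give different results on this data, which the following sketch makes concrete. It assumes plain data-frame stand-ins for the question's inputs; `rbind_error` and `stacked` are illustrative names.

```r
library(dplyr)

# Stand-ins for the question's data1, data2, data3.
data1 <- data.frame(Date = as.Date("2021-01-01") + c(0, 1, 2, 3), Value1 = c(1, 2, 3, 4))
data2 <- data.frame(Date = as.Date("2021-01-01") + c(0, 4, 5), Value2 = c(1, 2, 3))
data3 <- data.frame(Date = as.Date("2021-01-01") + c(-2, 1, 5, 6, 7), Value3 = c(1, 2, 3, 4, 5))

# Base rbind() refuses inputs whose column names differ; capture the message.
rbind_error <- tryCatch(
  rbind(data1, data2),
  error = function(e) conditionMessage(e)
)

# bind_rows() stacks all rows and fills missing columns with NA,
# so dates present in several inputs appear on several rows.
stacked <- bind_rows(data1, data2, data3)

nrow(stacked)                  # one row per input row
length(unique(stacked$Date))   # number of distinct dates
```

If one row per date is the goal, a full join is the better fit than stacking.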

Answer 4

Score: 0

    full_join(data1, data2) %>%
      full_join(., data3)

    Joining, by = "Date"
    # A tibble: 9 × 4
      Date       Value1 Value2 Value3
      <date>      <dbl>  <dbl>  <dbl>
    1 2021-01-01      1      1     NA
    2 2021-01-02      2     NA      2
    3 2021-01-03      3     NA     NA
    4 2021-01-04      4     NA     NA
    5 2021-01-05     NA      2     NA
    6 2021-01-06     NA      3      3
    7 2020-12-30     NA     NA      1
    8 2021-01-07     NA     NA      4
    9 2021-01-08     NA     NA      5

huangapple
  • Posted on March 15, 2023 at 19:55:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75744370.html