How to "left_join" multiple columns into one column within a single dataset?

Question

I have a dataframe with three columns. I would like to populate the NAs that are in one column with values in another column, but I do not want to overwrite any data. How can I get the following results?

    # Starting data frame:
    DF <- data.frame(
      ST_1 = c(100, NA, 100, 100, 200, 200, NA, NA, NA, NA, 200),
      ST_2 = c(50, NA, 50, 50, 12, NA, NA, 50, 50, NA, 12),
      ST_3 = c(5, NA, 5, 2, 3, 1, 1, 3, 4, 2, 11)
    )

    # Result I want:
    DF$ST <- c(100, NA, 100, 100, 200, 200, 1, 50, 50, 2, 200)

As you can see, I want to keep all the values in ST_1, and when there is an NA, fill it in with ST_2. Then, I want to keep all of the values from that merge, and fill in the remaining NAs with ST_3. There will still be some leftover NAs after all these merges.

Answer 1

Score: 1

    library(dplyr)

    DF %>%
      mutate(ST = coalesce(ST_1, ST_2, ST_3))

Output:

       ST_1 ST_2 ST_3  ST
    1   100   50    5 100
    2    NA   NA   NA  NA
    3   100   50    5 100
    4   100   50    2 100
    5   200   12    3 200
    6   200   NA    1 200
    7    NA   NA    1   1
    8    NA   50    3  50
    9    NA   50    4  50
    10   NA   NA    2   2
    11  200   12   11 200
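
As a self-contained check, the sketch below rebuilds the example data and compares the coalesced column against the result requested in the question. `coalesce()` returns, at each position, the first non-missing value among its arguments, which is why the argument order `ST_1, ST_2, ST_3` matters. The `data.frame()` construction and the names `result` and `expected` are assumptions added for illustration; the question only shows the column values.

```r
library(dplyr)

# Rebuild the example data from the question (construction is assumed)
DF <- data.frame(
  ST_1 = c(100, NA, 100, 100, 200, 200, NA, NA, NA, NA, 200),
  ST_2 = c(50, NA, 50, 50, 12, NA, NA, 50, 50, NA, 12),
  ST_3 = c(5, NA, 5, 2, 3, 1, 1, 3, 4, 2, 11)
)

# For each row, take the first non-NA value in the order ST_1, ST_2, ST_3
result <- DF %>%
  mutate(ST = coalesce(ST_1, ST_2, ST_3))

# The column the question asks for
expected <- c(100, NA, 100, 100, 200, 200, 1, 50, 50, 2, 200)

# Compare the computed column with the requested one
identical(result$ST, expected)
```

Rows where all three columns are NA stay NA, which matches the leftover NAs the question expects.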

Answer 2

Score: 0

So you want the maximum value from each row?

Base R:

    DF$ST <- apply(DF, 1, max, na.rm = TRUE)
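
Note that the row maximum and the first non-NA value are different operations. On the example data they agree for most rows only because the earlier columns happen to hold the larger values, and for a row where every value is NA (row 2), `max(..., na.rm = TRUE)` returns `-Inf` with a warning instead of NA. If a base R version of the fill order described in the question is wanted, one possible sketch is below; the helper name `first_non_na` and the `data.frame()` construction are hypothetical and not part of the original answers.

```r
# Rebuild the example data from the question (construction is assumed)
DF <- data.frame(
  ST_1 = c(100, NA, 100, 100, 200, 200, NA, NA, NA, NA, 200),
  ST_2 = c(50, NA, 50, 50, 12, NA, NA, 50, 50, NA, 12),
  ST_3 = c(5, NA, 5, 2, 3, 1, 1, 3, 4, 2, 11)
)

# Hypothetical helper: keep x where it is present, otherwise fall back to y
first_non_na <- function(x, y) ifelse(is.na(x), y, x)

# Fold the helper over the columns in priority order: ST_1, then ST_2, then ST_3
DF$ST <- Reduce(first_non_na, DF[c("ST_1", "ST_2", "ST_3")])
```

Because the columns are folded left to right, a value already taken from an earlier column is never overwritten by a later one.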

huangapple
  • Posted on June 29, 2023 at 23:42:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76582621.html