How to "left_join" multiple columns into one column within a single dataset?

Question

I have a dataframe with three columns. I would like to populate the NAs that are in one column with values in another column, but I do not want to overwrite any data. How can I get the following results?

# Starting Dataframe:
DF$ST_1 <- c(100, NA, 100, 100, 200, 200, NA, NA, NA, NA, 200)
DF$ST_2 <- c(50,  NA,  50,  50,  12,  NA, NA, 50, 50, NA, 12)
DF$ST_3 <- c(5,   NA,   5,   2,   3,   1,  1,  3,  4,  2, 11)

Results I want:
DF$ST <- c(100, NA,  100, 100, 200, 200, 1, 50, 50, 2, 200)

As you can see, I want to keep all the values in ST_1, and when there is an NA, fill it in with ST_2. Then, I want to keep all of the values from that merge, and fill in the remaining NAs with ST_3. There will still be some leftover NAs after all these merges.

Answer 1

Score: 1

library(dplyr)

# coalesce() returns, for each row, the first non-NA value among its arguments
DF %>%
  mutate(ST = coalesce(ST_1, ST_2, ST_3))

   ST_1 ST_2 ST_3  ST
1   100   50    5 100
2    NA   NA   NA  NA
3   100   50    5 100
4   100   50    2 100
5   200   12    3 200
6   200   NA    1 200
7    NA   NA    1   1
8    NA   50    3  50
9    NA   50    4  50
10   NA   NA    2   2
11  200   12   11 200
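
If you would rather not load dplyr, the same "first non-NA value wins" logic can be sketched in base R. This is only a minimal sketch: the data.frame() call that builds DF is my assumption, since the question only shows the individual column assignments.

```r
# Rebuild the example data from the question. Wrapping the three vectors in
# data.frame() is an assumption; the question only shows the assignments.
DF <- data.frame(
  ST_1 = c(100, NA, 100, 100, 200, 200, NA, NA, NA, NA, 200),
  ST_2 = c(50,  NA,  50,  50,  12,  NA, NA, 50, 50, NA, 12),
  ST_3 = c(5,   NA,   5,   2,   3,   1,  1,  3,  4,  2, 11)
)

# Start from ST_1, then fill only the positions that are still NA,
# so existing values are never overwritten.
ST <- DF$ST_1
ST[is.na(ST)] <- DF$ST_2[is.na(ST)]  # first fallback: ST_2
ST[is.na(ST)] <- DF$ST_3[is.na(ST)]  # second fallback: ST_3
DF$ST <- ST

print(DF$ST)
```

Each fill step only touches rows where ST is still NA, which is why the order of the steps sets the priority of the columns. Rows where all three columns are NA stay NA.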

Answer 2

Score: 0

So you're wanting the max value from each row?

base R:

DF$ST <- apply(DF, 1, max, na.rm = TRUE)
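
Note that the row-wise max is not the same operation as "first non-NA value in column order". On the question's data the two agree in every row except the all-NA row, where max() with na.rm = TRUE gives -Inf (with a warning) instead of NA. The sketch below uses a small made-up data frame (hypothetical values, not from the question) where the two approaches clearly disagree.

```r
# Hypothetical data, chosen so that the two approaches give different results.
DF <- data.frame(
  ST_1 = c(100, NA, NA),
  ST_2 = c(500, NA, 50),
  ST_3 = c(5,   NA, 70)
)

# Row-wise maximum, as in the answer above. suppressWarnings() hides the
# warning that max() raises for the row where every value is NA.
row_max <- suppressWarnings(apply(DF, 1, max, na.rm = TRUE))

# First non-NA value per row, taken in the order ST_1, ST_2, ST_3.
first_non_na <- DF$ST_1
first_non_na[is.na(first_non_na)] <- DF$ST_2[is.na(first_non_na)]
first_non_na[is.na(first_non_na)] <- DF$ST_3[is.na(first_non_na)]

print(row_max)       # values: 500, -Inf, 70
print(first_non_na)  # values: 100, NA, 50
```

So the max approach only matches the requested result when the higher-priority column also happens to hold the largest value, and it needs extra handling for rows that are entirely NA.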

huangapple
  • Posted on June 29, 2023, 23:42:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76582621.html