Replacing column values that conditionally match a list of values in R

Question

I am trying to replace values in one data frame wherever its identifier matches an identifier in a second, much smaller data frame. A toy example of what I've tried:

    library(dplyr)

    df1 = data.frame(row = seq(1,6),
                     x = c("a","b","c","d","e","f"))
    df2 = data.frame(row = c(5,3,1,15,10),
                     x2 = c("g","h","i","j","k"))
    df3 = df1 %>% mutate(x = case_when(
      df1$row == df2$row ~ df2$x2,
      .default = df1$x
    ))

The intent is: when df1$row matches df2$row, replace df1$x with the value from df2$x2; otherwise leave df1$x unchanged. The expected output:

    df3
      row x
    1   1 i
    2   2 b
    3   3 h
    4   4 d
    5   5 g
    6   6 f

Any help appreciated.

Answer 1

Score: 1

We can join by row, then use coalesce():

    library(dplyr)

    df1 %>%
      left_join(df2, by = 'row') %>%
      mutate(x = coalesce(x2, x), .keep = 'unused')

Output:

      row x
    1   1 i
    2   2 b
    3   3 h
    4   4 d
    5   5 g
    6   6 f
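
For comparison, here is a minimal base R sketch of the same lookup-then-fallback idea. It is not part of the original answer. It assumes R 4.0 or later, where data.frame() keeps strings as character vectors, and the names idx and df3 are only illustrative. It also suggests why the attempt in the question does not behave as intended: `==` compares the two row vectors position by position, whereas match() looks each id up in the other table.

```r
# Toy data from the question.
df1 <- data.frame(row = seq(1, 6),
                  x = c("a", "b", "c", "d", "e", "f"))
df2 <- data.frame(row = c(5, 3, 1, 15, 10),
                  x2 = c("g", "h", "i", "j", "k"))

# For each df1$row, find its position in df2$row (NA when there is no match).
idx <- match(df1$row, df2$row)

df3 <- df1
# Use the replacement where a match exists, otherwise keep the original value.
df3$x <- ifelse(is.na(idx), df1$x, df2$x2[idx])
df3
```

The ids 15 and 10 in df2 never appear in df1, so they are simply never looked up, which plays the same role as the left join above.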

Answer 2

Score: 1

We might use {powerjoin}:

    df1 = data.frame(row = seq(1,6),
                     x = c("a","b","c","d","e","f"))
    df2 = data.frame(row = c(5,3,1,15,10),
                     x2 = c("g","h","i","j","k"))
    library(powerjoin)
    power_left_join(df1, df2 |> dplyr::rename(x = x2), by = "row", conflict = coalesce_yx)
    #>   row x
    #> 1   1 i
    #> 2   2 b
    #> 3   3 h
    #> 4   4 d
    #> 5   5 g
    #> 6   6 f

Created on 2023-03-17 with reprex v2.0.2

Answer 3

Score: 0

With dplyr 1.1.0:

    library(dplyr)

    df1 %>%
      rows_update(df2 %>% rename(x = x2), unmatched = "ignore")

Result:

    Matching, by = "row"
      row x
    1   1 i
    2   2 b
    3   3 h
    4   4 d
    5   5 g
    6   6 f

If both tables had the same column names it would be simpler:

    df1 %>%
      rows_update(df2, unmatched = "ignore")
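
To illustrate why unmatched = "ignore" matters here, the following sketch (an addition, not from the original answer) contrasts it with the default behaviour. It assumes dplyr 1.1.0 or later. The names updates, strict and lenient are illustrative only.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which provides the `unmatched` argument

df1 <- data.frame(row = seq(1, 6),
                  x = c("a", "b", "c", "d", "e", "f"))
df2 <- data.frame(row = c(5, 3, 1, 15, 10),
                  x2 = c("g", "h", "i", "j", "k"))

# rows_update() needs the value column to carry the same name in both tables.
updates <- rename(df2, x = x2)

# Default (unmatched = "error"): rows 15 and 10 do not exist in df1, so this fails.
strict <- tryCatch(rows_update(df1, updates, by = "row"),
                   error = function(e) "error")

# unmatched = "ignore": keys without a counterpart in df1 are skipped.
lenient <- rows_update(df1, updates, by = "row", unmatched = "ignore")
lenient
```

Passing by = "row" explicitly also avoids the "Matching, by" message shown in the result above.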

  • Posted by huangapple on 2023-03-04 05:57:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75632211.html