Exchange the values between two columns based on a condition using R

Question

I have the following df. If a value in the dm column is less than 20000, that value should move to the nd column. Similarly, if a value in the nd column is greater than 20000, that value should move to the dm column.

df <- structure(list(id = c(1, 2, 3), nd = c(NA, 20076, NA), dm = c(10113, 
NA, 10188)), class = "data.frame", row.names = c(NA, -3L))

I want my final df to look like this:

structure(list(id = c(1, 2, 3), nd = c(10113, NA, 10188), dm = c(NA, 
20076, NA)), class = "data.frame", row.names = c(NA, -3L))

Thank you

Answer 1

Score: 3

ifelse is your friend for this.

base R

transform(df,
  nd = ifelse(dm < 20000, dm, nd),
  dm = ifelse(nd > 20000, nd, dm)
)
#   id    nd    dm
# 1  1 10113    NA
# 2  2    NA 20076
# 3  3 10188    NA

Note that this works in base R because, unlike dplyr::mutate, transform evaluates every expression against the original data frame: the dm= (second) expression (and any after it) does not see the changes made by earlier expressions, so the nd that it sees is the original, unchanged nd.
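To make that evaluation rule concrete, here is a minimal sketch on a toy data frame (toy and its columns a and b are made up for illustration and are not part of the question):

```r
# Hypothetical one-row data frame, used only to illustrate the evaluation order
toy <- data.frame(a = 1)

# Both expressions are evaluated against the original `toy`,
# so `b` is computed from the old value of `a`, not the updated one
res <- transform(toy, a = a + 1, b = a)
res
```

Here a ends up as 2 while b is still 1, which is why the second ifelse above can safely read the original nd.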

We can also use the temporary-variable trick illustrated in the dplyr example below:

df |>
  transform(
    nd2 = ifelse(dm < 20000, dm, nd),
    dm2 = ifelse(nd > 20000, nd, dm)
  ) |>
  subset(select = -c(nd, dm))

and then rename nd2 to nd and dm2 to dm.
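One possible way to do that rename step in base R is sketched below. It assumes the df from the question, R 4.1 or later for the native pipe, and that no other column name ends in "2" (the variable name out is just a placeholder):

```r
# Data frame from the question
df <- data.frame(id = c(1, 2, 3), nd = c(NA, 20076, NA), dm = c(10113, NA, 10188))

# Compute into temporary columns, then drop the originals
out <- df |>
  transform(
    nd2 = ifelse(dm < 20000, dm, nd),
    dm2 = ifelse(nd > 20000, nd, dm)
  ) |>
  subset(select = -c(nd, dm))

# Strip the temporary "2" suffix: nd2 -> nd, dm2 -> dm
names(out) <- sub("2$", "", names(out))
out
```

With real data that has other columns ending in a digit, assigning the names explicitly, for example names(out)[names(out) == "nd2"] <- "nd", would be the safer choice.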

dplyr

Because mutate "sees" the changes immediately, we need to store into other variables and then reassign.

library(dplyr)
df %>%
  mutate(
    nd2 = ifelse(dm < 20000, dm, nd),
    dm2 = ifelse(nd > 20000, nd, dm)
  ) %>%
  select(-nd, -dm) %>%
  rename(nd = nd2, dm = dm2)
#   id    nd    dm
# 1  1 10113    NA
# 2  2    NA 20076
# 3  3 10188    NA

Answer 2

Score: 1

Another base R option using apply:

as.data.frame(t(apply(df, 1, function(x) {
  if(x[2] > 20000 | x[3] < 20000) x[c(1, 3, 2)] else x})))
#>   id    dm    nd
#> 1  1 10113    NA
#> 2  2    NA 20076
#> 3  3 10188    NA

Created on 2023-02-18 with reprex v2.0.2
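In the apply output above, the swapped rows carry their element names with them, so the columns come back labelled dm and nd even though the values sit in the intended positions. The if() would also stop with an error if the condition evaluated to NA (for example, nd missing and dm above 20000). A vectorized sketch that swaps by logical mask and leaves the column names alone might look like this, assuming the df from the question and assuming that rows whose condition is NA should be left untouched (the name swap is a placeholder):

```r
# Data frame from the question
df <- data.frame(id = c(1, 2, 3), nd = c(NA, 20076, NA), dm = c(10113, NA, 10188))

# TRUE for rows whose values sit in the wrong columns;
# `%in% TRUE` turns an NA condition into FALSE instead of propagating it
swap <- (df$nd > 20000 | df$dm < 20000) %in% TRUE

# Exchange nd and dm on those rows only; values are assigned by position,
# so the column names stay where they are
df[swap, c("nd", "dm")] <- df[swap, c("dm", "nd")]
df
```

Because this exchanges the two cells rather than overwriting one of them, it also keeps both values when neither column is NA in a row.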

huangapple
  • Posted on February 19, 2023, 07:30:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75497046.html