How to apply changes to existing data variables

Question

I do a lot of data transformation, and one kind of operation is giving me trouble.

Let's assume I have a dataset with variables v1 to v100, each holding values from 1 to 5. I want to recode some of the variables (for example, v10 to v20).

I use the dplyr package a lot, along with several packages by Daniel Ludecke, such as sjmisc, sjPlot, and sjlabelled.

dataset %>%
   select(v10:v20) %>%
   rec(rec = "1:2=1;3:5=2")

How can I write the result of this operation back into the original dataset, into the same columns it was taken from (that is, overwrite them)?

I don't want to create additional variables (v10_r, v11_r, etc.), just overwrite the existing ones. I make many changes like this to a dataset and would like to apply them as simply as possible.

Answer 1

Score: 2

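You can do this with the mutate_at function from the dplyr package (it still works, although it has been superseded by across() since dplyr 1.0.0). Note that recode matches individual values, not ranges such as 1:2, so each value is listed explicitly:

    library(dplyr)

    # Recode v10 to v20 and assign the result back to the same columns
    dataset <- dataset %>%
      mutate_at(vars(v10:v20), ~ recode(.x, `1` = 1, `2` = 1, `3` = 2, `4` = 2, `5` = 2))

Assigning the result back to dataset overwrites v10 to v20 with the recoded values, and no new columns are created.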
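To see the overwrite on something small, here is a minimal sketch using across(), the current dplyr spelling of the same idea. The toy data frame and the name cols_before are made up for illustration; the recoding maps 1 and 2 to 1, and 3 to 5 to 2, as in the question.

```r
library(dplyr)

# Toy data standing in for the real dataset (hypothetical values): v1 to v25
set.seed(1)
dataset <- as.data.frame(
  setNames(replicate(25, sample(1:5, 8, replace = TRUE), simplify = FALSE),
           paste0("v", 1:25))
)

cols_before <- names(dataset)

# Recode v10 to v20 in their existing columns; recode() matches single values
dataset <- dataset %>%
  mutate(across(v10:v20,
                ~ recode(.x, `1` = 1, `2` = 1, `3` = 2, `4` = 2, `5` = 2)))

# TRUE: the column names and their order are unchanged, so nothing like v10_r was added
identical(names(dataset), cols_before)
```

Because across() is called without a .names argument, each result replaces the column it was computed from.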

Answer 2

Score: 2

Another option is case_match from dplyr (available since dplyr 1.1.0), which supersedes recode:

library(dplyr)

# Assign the result back to df so the recoded columns overwrite the originals
df <- df %>%
  mutate(across(v10:v20,
                ~ case_match(.x,
                             1:2 ~ 1,
                             3:5 ~ 2)))

Alternatively, you can use case_when from dplyr, although it is slightly more verbose than case_match:

# Values outside 1 to 5 match neither condition and become NA
df <- df %>%
  mutate(across(v10:v20,
                ~ case_when(.x %in% (1:2) ~ 1,
                            .x %in% (3:5) ~ 2)))
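Since the question mentions applying many similar changes, one possible way to keep them short is to put the recoding rule in a small function and pass it to across(). This is only a sketch: collapse_scale is a hypothetical helper name, not part of dplyr, and the four-column data frame is toy data.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): collapse 1-2 to 1 and 3-5 to 2
collapse_scale <- function(x) {
  case_when(x %in% 1:2 ~ 1,
            x %in% 3:5 ~ 2)
}

# Toy data standing in for the real dataset
df <- data.frame(v9  = c(1, 3, 5),
                 v10 = c(1, 2, 3),
                 v11 = c(4, 5, 1),
                 v12 = c(2, 2, 4))

# Overwrite only v10 and v11; v9 and v12 are left untouched
df <- df %>%
  mutate(across(v10:v11, collapse_scale))

df
```

The same helper can then be reused for any other column range by changing only the selection inside across().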

Posted by huangapple on 2023-08-08 20:00:54
Source: https://go.coder-hub.com/76859359.html