Recoding continuous variable
Question

I have a continuous variable. Entries 1-60 need to stay the same. NAs and 0s are each coded as a number above 60.

Answer 1

Score: 2

With `dplyr`, you can use

- `recode()`

```r
library(dplyr)

# Backtick-quoted names are the old values; unmatched values are kept
df %>%
  mutate(y = recode(x, `88` = 0, `99` = NA_real_))
```

- `case_match()`

```r
# .default = x keeps every value that is not listed
df %>%
  mutate(y = case_match(x, 88 ~ 0, 99 ~ NA, .default = x))
```

- `case_when()`

```r
df %>%
  mutate(y = case_when(x == 88 ~ 0, x == 99 ~ NA, .default = x))
```

`case_match()` and the `.default` argument of `case_when()` require dplyr 1.1.0 or later.
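To compare the options above on concrete values, here is a minimal sketch. The toy data frame and the column names `y_match` and `y_when` are assumptions for illustration, as is the mapping of 88 to 0 and 99 to NA. It also assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical toy data: 88 stands for 0, 99 stands for NA,
# and one value is already missing
df <- data.frame(x = c(1, 60, 88, 99, NA, 15))

out <- df %>%
  mutate(
    # Match on values; anything not listed falls through to .default
    y_match = case_match(x, 88 ~ 0, 99 ~ NA, .default = x),
    # Match on logical conditions; an NA condition counts as "no match"
    y_when  = case_when(x == 88 ~ 0, x == 99 ~ NA, .default = x)
  )

out
# Both new columns hold: 1 60 0 NA NA 15
```

Both calls leave an existing `NA` untouched, because it matches no rule and falls through to `.default = x`.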

Answer 2

Score: 1

Using `fcase` from `data.table`:

    library(data.table)

    # Keep x unless it is 88 or 99; 88 becomes 0.
    # 99 matches no condition, so it gets fcase's default, NA.
    setDT(df)[, y := fcase(!x %in% c(88, 99), x, x == 88, 0)]

Answer 3

Score: 0

You have a lot of options at your disposal with the tidyverse packages (e.g., dplyr, tidyr). One option is to use `na_if` to turn the 99s into NA and `if_else` to turn the 88s into 0.

I have created a fake dataset below, but if you have questions about your specific dataset, you should provide a reproducible example with your own data.

    library(tidyverse)

    # Fake data: three columns drawn at random from valid values plus the codes 99 and 88
    a <- sample(x = c(1, 2, 3, 4, 99, 88), size = 30, replace = TRUE)
    b <- sample(x = c(1, 2, 3, 4, 99, 88), size = 30, replace = TRUE)
    c <- sample(x = c(1, 2, 3, 4, 99, 88), size = 30, replace = TRUE)
    df <- data.frame(a, b, c)
    df

    # Recode every column: 99 becomes NA, then 88 becomes 0
    df %>%
      mutate(across(everything(), ~na_if(., 99))) %>%
      mutate(across(everything(), ~if_else(. == 88, 0, .)))

Answer 4

Score: 0

We can update matching values in place with base R:

    df$y[df$y == 99] <- NA
    df$y[df$y == 88] <- 0
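The base R answer above overwrites the column in place. As a minimal sketch of a variant that writes to a new column instead, the following assumes a hypothetical source column `x` and the same codes (88 for 0, 99 for NA). It uses `%in%` rather than `==` so that values that are already missing never put an `NA` into the logical index.

```r
# Hypothetical toy data: 88 stands for 0, 99 stands for NA
df <- data.frame(x = c(1, 60, 88, 99, NA, 15))

# Start from a copy so the original column stays available for checking
df$y <- df$x

# %in% returns FALSE (not NA) for missing values, so the index is always TRUE/FALSE
df$y[df$x %in% 88] <- 0
df$y[df$x %in% 99] <- NA

df$y
# 1 60 0 NA NA 15
```

Keeping `x` unchanged makes it easy to tabulate `x` against `y` afterwards and confirm that only the two codes were altered.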



huangapple
  • Posted on February 16, 2023, 09:58:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75467123.html