mutate() in R: replacing -1 in a column with the mean unconditionally replaces all values

Problem

I am trying to replace each -1 value in a numeric data frame column (HRS1) with the mean() of that column. I first tried str_replace() and am now using replace(), but when I test for -1, ALL values of the column are changed to the mean in mutate().

This is NOT missing-value imputation; the column contains -1 values that need to be replaced.

Code

mean_HRS1 = mean(df_survey$HRS1)

df_survey %>% 
    mutate(HRS1 = replace(HRS1, -1, mean_HRS1))

Answer 1

Score: 1

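You should pass a logical vector, HRS1 == -1, or an index vector, which(HRS1 == -1), to replace(). The second argument of replace() is an index, so a bare -1 is read as a negative index meaning "every element except the first", which is why nearly the whole column is overwritten.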
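
To make the indexing difference concrete, here is a minimal base R sketch. It reuses the sample vector c(6, 7, 8, -1) from this answer; the replacement value 0 and the variable names are only illustrative.

```r
# Sample values borrowed from this answer; 0 is an arbitrary replacement value
HRS1 <- c(6, 7, 8, -1)

# A bare -1 is a negative index: it selects every position except the first
all_but_first <- replace(HRS1, -1, 0)

# A logical vector selects only the positions where the condition is TRUE
by_logical <- replace(HRS1, HRS1 == -1, 0)

# which() turns the same condition into integer positions
by_position <- replace(HRS1, which(HRS1 == -1), 0)

print(all_but_first)  # 6 0 0 0
print(by_logical)     # 6 7 8 0
print(by_position)    # 6 7 8 0
```

The logical and which() forms give the same result here; they can differ when the column contains NA, which the question says is not the case.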

Another issue is that -1 should be excluded when computing the mean. If HRS1 is c(6, 7, 8, -1), mean(HRS1) is (6+7+8-1) / 4 = 5, but what you need is (6+7+8) / 3 = 7.
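
The arithmetic can be checked directly in base R. This is a small sketch on the same sample vector; mean_all and mean_valid are names made up for the illustration.

```r
# Same sample values as in the example above
HRS1 <- c(6, 7, 8, -1)

# Mean over every value, including the -1 placeholder
mean_all <- mean(HRS1)

# Mean over the valid values only, with -1 filtered out first
mean_valid <- mean(HRS1[HRS1 != -1])

print(mean_all)    # 5
print(mean_valid)  # 7
```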

df_survey %>% 
  mutate(HRS1 = replace(HRS1, HRS1 == -1, mean(HRS1[HRS1 != -1])))
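
For a runnable end-to-end check, the sketch below applies the same mutate() call to a small stand-in data frame. The real df_survey is not shown in the question, so the id column and the four HRS1 values are assumptions; df_fixed is likewise a made-up name.

```r
library(dplyr)

# Hypothetical stand-in for df_survey; the real survey data is not shown
df_survey <- data.frame(id = 1:4, HRS1 = c(6, 7, 8, -1))

# Replace only the rows where HRS1 is -1, using the mean of the other rows
df_fixed <- df_survey %>%
  mutate(HRS1 = replace(HRS1, HRS1 == -1, mean(HRS1[HRS1 != -1])))

print(df_fixed$HRS1)   # 6 7 8 7
print(df_survey$HRS1)  # 6 7 8 -1 (the original is unchanged)
```

Note that the pipeline returns a new data frame rather than modifying df_survey in place, so the result has to be assigned (here to df_fixed, or back to df_survey) to be kept.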

huangapple
  • Posted on March 9, 2023, 12:35:53
  • When reposting, please keep this link: https://go.coder-hub.com/75680473.html