Replace NA in multiple columns of different classes

# Question

I have data with multiple columns that have similar names but different classes. I need to replace every `NA` with `0` while keeping each column's class, because the different character strings will be assigned other numeric values later.

This is example data:

```r
qdf = data.frame(bicep_wt = c("black band", "5", NA),
                 tricep_wt = c(2,NA,3))
```

This is my attempt to change the `NA` values to 0:

```r
mutate(qdf, across(contains("wt"), ~case_when(is.numeric(.x) ~ tidyr::replace_na(., 0),
                                              is.character(.x) ~ tidyr::replace_na(., "0"))))
```

I get this error:

```
Error in mutate():
i In argument: across(...).
Caused by error in across():
! Can't compute column bicep_wt.
Caused by error in case_when():
! Failed to evaluate the right-hand side of formula 1.
Caused by error in vec_assign():
! Can't convert replace <double> to match type of data <character>.
```

I get the same error when I use `.x` in place of `.`:

```r
mutate(qdf, across(contains("wt"), ~case_when(is.numeric(.x) ~ tidyr::replace_na(.x, 0),
                                              is.character(.x) ~ tidyr::replace_na(.x, "0"))))
```


# Answer 1
**Score**: 2

While being "class-safe" is usually a good thing, functions that are not class-safe can be used to your advantage.

```r
mutate(qdf, across(contains("wt"), ~ replace(.x, is.na(.x), 0)))
#     bicep_wt tricep_wt
# 1 black band         2
# 2          5         0
# 3          0         3
```

`replace()` assigns the `0` into the `NA` positions, and R coerces it to `"0"` when the column is character, so each column keeps its class.

By "class-safe", I mean that the class of the value an expression returns is guaranteed. For instance,

```r
ifelse(c(T, T), 1, "1")
# [1] 1 1
ifelse(c(T, F), 1, "1")
# [1] "1" "1"
```

The first call is ambiguous, since `yes=` is class numeric and `no=` is class character. A class-safe function should complain about this, as in

```r
dplyr::if_else(c(T, T), 1, "1")
# Error in dplyr::if_else(c(T, T), 1, "1") : 
#   Can't combine `true` <double> and `false` <character>.
data.table::fifelse(c(T, T), 1, "1")
# Error in data.table::fifelse(c(T, T), 1, "1") : 
#   'yes' is of type double but 'no' is of type character. Please make sure that both arguments have the same type.
```

Note that `ifelse` and `replace` are _not_ class-safe, but in this case that is acceptable given your requirements.
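The coercion that makes `replace()` convenient here can be checked in base R alone. The following is a minimal sketch using the question's sample data; the names `filled`, `before`, and `after` are illustrative and not part of the original answer.

```r
# Sample data from the question (stringsAsFactors = FALSE keeps bicep_wt character
# on R versions older than 4.0 as well)
qdf <- data.frame(bicep_wt = c("black band", "5", NA),
                  tricep_wt = c(2, NA, 3),
                  stringsAsFactors = FALSE)

# replace() assigns 0 into the NA positions of each column;
# in a character vector the numeric 0 is coerced to "0"
filled <- lapply(qdf, function(col) replace(col, is.na(col), 0))

# Compare the class of each column before and after the replacement:
# bicep_wt stays character, tricep_wt stays numeric
before <- vapply(qdf, class, character(1))
after <- vapply(filled, class, character(1))
print(rbind(before, after))
```

Because the coercion is implicit, this shortcut is only appropriate when, as in the question, a character `"0"` is the wanted placeholder for character columns.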

# Answer 2
**Score**: 1

You can move the condition inside `replace_na`:

```r
mutate(qdf, across(contains("wt"), ~ replace_na(.x, `if`(is.numeric(.x), 0, "0"))))
```

Result:

```r
#     bicep_wt tricep_wt
# 1 black band         2
# 2          5         0
# 3          0         3
```

Note: it works with either `ifelse()` or `` `if`() ``.

# Answer 3
**Score**: 0

```r
mutate(
  qdf, 
  across(
    contains("wt"), 
    \(.x) { # could use ~ instead
      if      (is.numeric(.x)  ) tidyr::replace_na(.x, 0)
      else if (is.character(.x)) tidyr::replace_na(.x, "0")
    }
  )
)
#     bicep_wt tricep_wt
# 1 black band         2
# 2          5         0
# 3          0         3
```

# Answer 4
**Score**: 0

If you don't know in advance that your columns will be only numeric or character, you can use a base R version that lets assignment coerce the `0` to the column's type (for example, to `"0"` in a character column).

1) With a loop

```r
for (col in colnames(qdf)) {
  qdf[is.na(qdf[, col]), col] <- 0
}
```

2) With a function and `lapply()`

```r
# Replace every NA in a vector with 0
naToZero <- function(v) {
  v[is.na(v)] <- 0
  return(v)
}
qdf <- data.frame(lapply(X=qdf, FUN=function(u) naToZero(u)))
qdf
#     bicep_wt tricep_wt
# 1 black band         2
# 2          5         0
# 3          0         3
```
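Plain assignment of `0` also changes columns that are neither double nor character: an integer or logical column would be coerced to double. If that matters, one option is a helper that picks the replacement by type. The sketch below is an assumption-laden illustration, not part of the original answer; `fill_na_zero`, `demo`, and the extra columns `reps` and `done` are hypothetical names introduced for the example.

```r
# Hypothetical helper: replace NA with a zero of the column's own type
fill_na_zero <- function(v) {
  if (is.character(v)) {
    v[is.na(v)] <- "0"
  } else if (is.numeric(v)) {
    # is.numeric() is FALSE for factors, so they skip this branch;
    # 0L keeps integer columns stored as integer
    v[is.na(v)] <- if (is.integer(v)) 0L else 0
  }
  v  # any other class (logical, factor, ...) is returned unchanged
}

# Question's data plus two made-up columns to show the other branches
demo <- data.frame(bicep_wt = c("black band", "5", NA),
                   tricep_wt = c(2, NA, 3),
                   reps = c(10L, NA, 12L),
                   done = c(TRUE, NA, FALSE),
                   stringsAsFactors = FALSE)

result <- data.frame(lapply(demo, fill_na_zero), stringsAsFactors = FALSE)
print(vapply(result, class, character(1)))
```

Here the character, double, and integer columns get a zero of their own type, while the logical column keeps its `NA`. Whether untouched columns should instead raise an error is a design choice that depends on the data.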

huangapple
  • Posted on June 26, 2023, 22:43:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76557751.html