Difference between ifelse and if_else for conditionals returning NA

Question

I can't work this out.

I posted yesterday asking how to apply which.max rowwise in dplyr. I got a great answer that solved my problem. However, as often happens, getting one question answered raised another. I will lay the problem out with some toy data.

Here is a data frame with three variables.

    data.frame(v1 = c(3,2,NA,6),
               v2 = c(NA,1,NA,7),
               v3 = c(1,1,NA,1)) -> df2
    df2
    # output
    #
    #   v1 v2 v3
    # 1  3 NA  1
    # 2  2  1  1
    # 3 NA NA NA
    # 4  6  7  1

Using which.max within dplyr, I want to calculate a new variable that returns the column position of the maximum value in each row. However, for reasons explained in my previous post linked above, if the row is all NA I want it to return NA instead of integer(0). So I use ifelse to apply a conditional whereby if the row is all NA it returns NA and, if not, it performs the rowwise which.max operation.

    library(dplyr)
    df2 %>%
      rowwise() %>%
      mutate(maxCol = ifelse(test = all(is.na(c_across(everything()))),
                             yes = NA,
                             no = which.max(c_across(everything()))))
    # output
    # # A tibble: 4 × 4
    # # Rowwise:
    #      v1    v2    v3 maxCol
    #   <dbl> <dbl> <dbl>  <int>
    # 1     3    NA     1      1
    # 2     2     1     1      1
    # 3    NA    NA    NA     NA
    # 4     6     7     1      2

However, if I do the same thing using the dplyr if_else function, it throws an error.

    df2 %>%
      rowwise() %>%
      mutate(maxCol = if_else(condition = all(is.na(c_across(everything()))),
                              true = NA,
                              false = which.max(c_across(everything()))))
    # Error in `mutate()`:
    # ℹ In argument: `maxCol = if_else(...)`.
    # ℹ In row 3.
    # Caused by error in `if_else()`:
    # ! `false` must have size 1, not size 0.
    # Run `rlang::last_trace()` to see where the error occurred.

Why is this happening? And how do I make if_else work the same way as ifelse?

Answer 1

Score: 1

My solution (to the given problem) would be to not use ifelse or if_else at all. To me, the easiest approach seems to be to define the following helper:

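    # Wrapper around which.max() that returns NA instead of integer(0)
    # when there is no maximum (for example, when every value is NA).
    which.max2 <- function(x) {
      x <- which.max(x)
      if (length(x) == 0) {
        NA
      } else {
        x
      }
    }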

And then run:

    df2 %>%
      mutate(
        maxCol = apply(across(everything()), 1, which.max2)
      )
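As a quick sanity check, the helper can also be tried without dplyr. The sketch below restates which.max2 so that it is self-contained and applies it row by row to a plain matrix; the names with_max, all_missing and max_cols are only illustrative, and wrapping the result in as.integer(unname(...)) is an assumption made here to get a bare integer vector.

```r
# Same helper as above, restated so this snippet runs on its own.
which.max2 <- function(x) {
  x <- which.max(x)
  if (length(x) == 0) {
    NA
  } else {
    x
  }
}

# Toy data from the question.
df2 <- data.frame(v1 = c(3, 2, NA, 6),
                  v2 = c(NA, 1, NA, 7),
                  v3 = c(1, 1, NA, 1))

# Single vectors: one with a maximum, one that is entirely NA.
with_max    <- which.max2(c(3, NA, 1))
all_missing <- which.max2(c(NA, NA, NA))

# Row-wise over the whole data frame, using base R only.
max_cols <- as.integer(unname(apply(as.matrix(df2), 1, which.max2)))
max_cols
```

Row 3, which is entirely NA, should come back as NA rather than making the result shorter than the data frame.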

Alternatively, we could put it in the if_else statement:

    df2 %>%
      mutate(
        maxCol = if_else(if_all(everything(), is.na),
                         NA,
                         apply(across(everything()), 1, which.max2)
        )
      )

But that is just more verbose.

Answer 2

Score: 1

To illustrate the insights from the comments (and answer the "why" question), consider the output of

    if_else(all(is.na(c(NA,NA,NA))),
            NA,
            which.max(c(NA,0,NA)))
    # [1] NA

compared to

    if_else(all(is.na(c(NA,NA,NA))),
            NA,
            which.max(c(NA,NA,NA)))
    # Error in `if_else()`:
    # ! `false` must have size 1, not size 0.
    # Run `rlang::last_trace()` to see where the error occurred.

where the latter is what is happening in your code. if_else evaluates the false argument and checks its size even though the condition is true. The output of which.max(c(NA,NA,NA)) is integer(0), which if_else rejects because it has size 0 rather than size 1.
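The two ingredients of that explanation can be checked in base R alone. This is only a small sketch: the variable names all_na, empty_result and base_result are illustrative, and the comment about ifelse reflects how base R fills its result (it takes values from no only at positions where test is FALSE).

```r
# Base R only; no dplyr needed for this check.
all_na <- c(NA, NA, NA)

# which.max() ignores NAs, so an all-NA input has no maximum at all
# and the result is a zero-length integer vector.
empty_result <- which.max(all_na)
length(empty_result)

# ifelse() only takes values from `no` where `test` is FALSE, so the
# empty vector is never needed here and the call returns NA.
base_result <- ifelse(all(is.na(all_na)), NA, which.max(all_na))
base_result
```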

Because of this, the second part of your question (how do I make if_else work the same way as ifelse) could only be obtained through tryCatch(), or through some verbose workaround, as the other answer suggests.
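One further possibility, not taken from either answer, is to skip the vectorised helpers inside rowwise() altogether. Each row yields a single TRUE or FALSE, so a plain if ... else can be used, and it only evaluates the branch it needs. The sketch below assumes dplyr is installed; vals and res are illustrative names, and NA_integer_ is used so that the new column stays an integer column.

```r
library(dplyr)

# Toy data from the question.
df2 <- data.frame(v1 = c(3, 2, NA, 6),
                  v2 = c(NA, 1, NA, 7),
                  v3 = c(1, 1, NA, 1))

res <- df2 %>%
  rowwise() %>%
  mutate(maxCol = {
    # Collect the current row's values once.
    vals <- c_across(everything())
    # Scalar condition: which.max() is only called when the row has
    # at least one non-NA value, so integer(0) never appears.
    if (all(is.na(vals))) NA_integer_ else which.max(vals)
  }) %>%
  ungroup()

res$maxCol
```

This keeps the intent of the original ifelse call while avoiding the size check that if_else performs on both branches.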

huangapple
  • Posted on May 25, 2023, 05:18:07
  • Please keep this link when reposting: https://go.coder-hub.com/76327459.html