Difference between ifelse and if_else for conditionals returning NA
Question
I can't work this out.
I posted yesterday asking how to apply which.max rowwise in dplyr. I got a great answer that solved my problem. However, as often happens, getting one question answered raised another one. I will lay the problem out with some toy data.
Here is a data frame with three variables.
data.frame(v1 = c(3,2,NA,6),
           v2 = c(NA,1,NA,7),
           v3 = c(1,1,NA,1)) -> df2
df2
# output
#
# v1 v2 v3
# 1 3 NA 1
# 2 2 1 1
# 3 NA NA NA
# 4 6 7 1
Using which.max within dplyr, I want to calculate a new variable that returns the column position of the maximum value in each row. However, for reasons explained in my previous post linked above, if the row is all NA I want it to return NA instead of integer(0). So I use ifelse to apply a conditional whereby if the row is all NA it returns NA and, if not, it performs the rowwise which.max operation.
library(dplyr)
df2 %>%
  rowwise() %>%
  mutate(maxCol = ifelse(test = all(is.na(c_across(everything()))),
                         yes = NA,
                         no = which.max(c_across(everything()))))
# output
# # A tibble: 4 × 4
# # Rowwise:
# v1 v2 v3 maxCol
# <dbl> <dbl> <dbl> <int>
# 1 3 NA 1 1
# 2 2 1 1 1
# 3 NA NA NA NA
# 4 6 7 1 2
However, if I do the same thing using dplyr's if_else function, it throws an error.
df2 %>%
  rowwise() %>%
  mutate(maxCol = if_else(condition = all(is.na(c_across(everything()))),
                          true = NA,
                          false = which.max(c_across(everything()))))
# Error in `mutate()`:
# ℹ In argument: `maxCol = if_else(...)`.
# ℹ In row 3.
# Caused by error in `if_else()`:
# ! `false` must have size 1, not size 0.
# Run `rlang::last_trace()` to see where the error occurred.
Why is this happening, and how do I make if_else work the same way as ifelse?
Answer 1
Score: 1
My solution (to the given problem) would be to not use ifelse or if_else. To me, the easiest approach appears to be to define the following:
which.max2 <- function(x) {
  x <- which.max(x)
  if (length(x) == 0) {
    NA
  } else {
    x
  }
}
And then run:
df2 %>%
  mutate(
    maxCol = apply(across(everything()), 1, which.max2)
  )
Alternatively, we could put it in the if_else statement:
df2 %>%
  mutate(
    maxCol = if_else(if_all(everything(), is.na),
                     NA,
                     apply(across(everything()), 1, which.max2)
    )
  )
But that is just more verbose.
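One detail worth noting: a bare NA in R is a logical value, so which.max2 returns a logical for all-NA rows and an integer otherwise. A minimal base-R sketch of a type-stable variant follows. The helper name which_max_int is hypothetical and not part of the answer above.

```r
# Toy data from the question
df2 <- data.frame(v1 = c(3, 2, NA, 6),
                  v2 = c(NA, 1, NA, 7),
                  v3 = c(1, 1, NA, 1))

# Hypothetical helper: always returns a length-1 integer
which_max_int <- function(x) {
  i <- which.max(x)            # integer(0) when every element is NA
  if (length(i) == 0L) NA_integer_ else unname(i)
}

# Apply the helper to each row of the numeric matrix
max_col <- apply(as.matrix(df2), 1, which_max_int)
max_col
```

Because every call returns exactly one integer, the result can be assigned directly as a column, for example df2$maxCol <- max_col.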
Answer 2
Score: 1
To illustrate the insights from the comments (and answer the "why" question), consider the output of
if_else(all(is.na(c(NA,NA,NA))),
        NA,
        which.max(c(NA,0,NA)))
# [1] NA
compared to
if_else(all(is.na(c(NA,NA,NA))),
        NA,
        which.max(c(NA,NA,NA)))
# Error in `if_else()`:
# ! `false` must have size 1, not size 0.
# Run `rlang::last_trace()` to see where the error occurred.
where the latter is what is happening in your code. if_else evaluates the false argument and checks its size even though the condition is true. The output of which.max(c(NA,NA,NA)) is integer(0), a zero-length vector, which if_else rejects because false must have size 1 here.
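The other half of the comparison, why ifelse does not fail, can be checked in base R alone. The sketch below relies on the fact that ifelse only evaluates its no argument when at least one element of test is FALSE. The stop() call is there purely as a probe and is not from the original post.

```r
# which.max() ignores NAs, so an all-NA input gives a zero-length result
empty <- which.max(c(NA, NA, NA))
length(empty)  # 0

# `test` is TRUE here, so ifelse() never evaluates `no`
# and the stop() call below is never run
lazy <- ifelse(TRUE, NA, stop("`no` was evaluated"))
lazy  # NA
```

So in the all-NA row, ifelse returns NA without ever seeing the integer(0) that if_else rejects.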
Because of this, the second part of your question (how do I make if_else work the same way as ifelse) could only be achieved through tryCatch(), or through some verbose workaround, as the other answer suggests.
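One compact workaround, offered here as a sketch rather than something taken from either answer, is to force the false branch to length 1 by indexing with [1]. Indexing a zero-length integer vector at position 1 yields NA_integer_, so if_else always receives a size-1 value. The sketch assumes dplyr is installed and uses NA_integer_ for the true branch so that both branches are integers.

```r
library(dplyr)

# Toy data from the question
df2 <- data.frame(v1 = c(3, 2, NA, 6),
                  v2 = c(NA, 1, NA, 7),
                  v3 = c(1, 1, NA, 1))

res <- df2 %>%
  rowwise() %>%
  mutate(maxCol = if_else(condition = all(is.na(c_across(everything()))),
                          true = NA_integer_,
                          # [1] turns integer(0) into NA_integer_ (size 1)
                          false = which.max(c_across(everything()))[1])) %>%
  ungroup()

res
```

With the [1] in place, the if_else wrapper is arguably redundant, since which.max(c_across(everything()))[1] already gives NA for an all-NA row.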