Difference between ifelse and if_else for conditionals returning NA
Question
I can't work this out.
I posted yesterday asking how to apply which.max rowwise in dplyr. I got a great answer that solved my problem. However, as often happens, getting one question answered raised another one. I will lay the problem out with some toy data.
Here is a data frame with three variables.
data.frame(v1 = c(3, 2, NA, 6),
           v2 = c(NA, 1, NA, 7),
           v3 = c(1, 1, NA, 1)) -> df2
df2
# output
#
# v1 v2 v3
# 1 3 NA 1
# 2 2 1 1
# 3 NA NA NA
# 4 6 7 1
Using which.max within dplyr, I want to calculate a new variable that returns the column position of the maximum value in each row. However, for reasons explained in my previous post linked above, if the row is all NA I want it to return NA instead of integer(0). So I use ifelse to apply a conditional whereby, if the row is all NA, it returns NA and, if not, it performs the rowwise which.max operation.
library(dplyr)
df2 %>%
  rowwise() %>%
  mutate(maxCol = ifelse(test = all(is.na(c_across(everything()))),
                         yes = NA,
                         no = which.max(c_across(everything()))))
# output
# # A tibble: 4 × 4
# # Rowwise:
# v1 v2 v3 maxCol
# <dbl> <dbl> <dbl> <int>
# 1 3 NA 1 1
# 2 2 1 1 1
# 3 NA NA NA NA
# 4 6 7 1 2
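The integer(0) behaviour mentioned above can be checked directly in base R. This is only a minimal sketch; the variable names pos_mixed and pos_all_na are made up for illustration and do not come from the original post.

```r
# Base R only; no packages needed.
pos_mixed  <- which.max(c(3, NA, 1))     # NAs are skipped when locating the maximum
pos_all_na <- which.max(c(NA, NA, NA))   # no non-missing value to pick from

print(pos_mixed)
print(length(pos_all_na))   # the all-NA case gives a zero-length result, not NA
```

The zero-length result is what makes the all-NA row awkward to handle inside mutate(), which expects one value per row.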
However, if I do the same thing using dplyr's if_else function, it throws an error.
df2 %>%
  rowwise() %>%
  mutate(maxCol = if_else(condition = all(is.na(c_across(everything()))),
                          true = NA,
                          false = which.max(c_across(everything()))))
# Error in `mutate()`:
# ℹ In argument: `maxCol = if_else(...)`.
# ℹ In row 3.
# Caused by error in `if_else()`:
# ! `false` must have size 1, not size 0.
# Run `rlang::last_trace()` to see where the error occurred.
Why is this happening? And how do I make if_else work the same way as ifelse?
Answer 1
Score: 1
My solution (to the given problem) would be to not use ifelse or if_else. To me, the easiest approach appears to be to define the following:
which.max2 <- function(x) {
  x <- which.max(x)
  if (length(x) == 0) {
    NA
  } else {
    x
  }
}
And then run:
df2 %>%
  mutate(
    maxCol = apply(across(everything()), 1, which.max2)
  )
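As a quick sanity check, the helper can be tried on individual vectors and on the rows of the example data using base R only. This is a sketch under the assumption that applying it over as.matrix(df2) is equivalent, for this data, to the across(everything()) call above; the names single_mixed, single_all_na and row_pos are illustrative.

```r
# Helper from the answer: fall back to NA when which.max() finds nothing.
which.max2 <- function(x) {
  x <- which.max(x)
  if (length(x) == 0) {
    NA
  } else {
    x
  }
}

df2 <- data.frame(v1 = c(3, 2, NA, 6),
                  v2 = c(NA, 1, NA, 7),
                  v3 = c(1, 1, NA, 1))

single_mixed  <- which.max2(c(3, NA, 1))     # a normal row
single_all_na <- which.max2(c(NA, NA, NA))   # the problematic all-NA row

# Apply over the rows (MARGIN = 1) of the numeric matrix behind df2.
row_pos <- apply(as.matrix(df2), 1, which.max2)
print(row_pos)
```

Because every call returns exactly one value, apply() can simplify the results into a plain vector with one entry per row.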
Alternatively, we could put it in the if_else statement:
df2 %>%
  mutate(
    maxCol = if_else(if_all(everything(), is.na),
                     NA,
                     apply(across(everything()), 1, which.max2)
    )
  )
But that is just more verbose.
Answer 2
Score: 1
To illustrate the insights from the comments (and answer the "why" question), consider the output of
if_else(all(is.na(c(NA, NA, NA))),
        NA,
        which.max(c(NA, 0, NA)))
# [1] NA
compared to
if_else(all(is.na(c(NA, NA, NA))),
        NA,
        which.max(c(NA, NA, NA)))
# Error in `if_else()`:
# ! `false` must have size 1, not size 0.
# Run `rlang::last_trace()` to see where the error occurred.
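where the latter is what is happening in your code. if_else evaluates the false argument and checks its size even though the condition is true. The output of which.max(c(NA,NA,NA)) is integer(0), which has size 0 rather than the size 1 that if_else requires here. Base ifelse does not hit this problem because it never evaluates its no argument when every element of test is TRUE.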
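The lazy behaviour of base ifelse can be observed with a small side-effect experiment. This is only a sketch; the flag no_was_evaluated is a made-up name used to detect whether the no argument was ever forced.

```r
# Base R only. The flag flips to TRUE only if the `no` argument is evaluated.
no_was_evaluated <- FALSE

res <- ifelse(
  test = TRUE,
  yes  = NA,
  no   = {
    no_was_evaluated <- TRUE      # side effect: runs only if `no` is forced
    which.max(c(NA, NA, NA))      # would produce integer(0)
  }
)

print(res)
print(no_was_evaluated)
```

Since test is TRUE, ifelse returns the yes value without touching no, so the zero-length result never gets a chance to cause trouble.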
Because of this, the second part of your question (how do I make if_else work the same way as ifelse) could only be achieved through tryCatch(), or through some verbose workaround, as the other answer suggests.
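One possible workaround of this kind, offered here as an untested-in-the-thread sketch rather than something either answer proposed, is to force the false branch to always have size 1 by indexing the which.max() result with [1], since integer(0)[1] is NA_integer_. The use of NA_integer_ for the true branch is an assumption made to keep both branches the same type across dplyr versions.

```r
library(dplyr)

df2 <- data.frame(v1 = c(3, 2, NA, 6),
                  v2 = c(NA, 1, NA, 7),
                  v3 = c(1, 1, NA, 1))

res <- df2 %>%
  rowwise() %>%
  mutate(maxCol = if_else(
    condition = all(is.na(c_across(everything()))),
    true      = NA_integer_,
    # integer(0)[1] is NA_integer_, so `false` always has size 1
    false     = which.max(c_across(everything()))[1]
  )) %>%
  ungroup()

print(res)
```

With the [1] in place the if_else call is arguably redundant, because which.max(...)[1] alone already yields NA for an all-NA row; it is kept here only to mirror the structure of the original attempt.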