Using parse_expr(), quo_name(), and enquo() to define a character object for plotting country-wise graphs in ggplot
Question
The first line of the function defines a Country_name
object using the rlang
package. Run on its own, that line fails because enquo()
only works inside a function, where its argument must be the name of one of that function's arguments (a symbol). At the top level you are passing a character string ('United States'
) instead, hence the error.
To try the same line outside a function, capture the expression with quo() instead of enquo():
parse_expr(quo_name(quo(Brazil)))
Note that a name containing a space, such as United States
, is not a valid bare symbol, so it cannot simply be typed without quotes.
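A minimal sketch of why the line behaves differently inside and outside a function, assuming rlang is installed. The helper name capture_arg is hypothetical and not part of the original code.

```r
library(rlang)

# Hypothetical helper: enquo() is given the *name* of a function argument.
capture_arg <- function(Country_name) {
  enquo(Country_name)
}

q <- capture_arg(Brazil)   # works: Brazil is captured, not evaluated
is_quosure(q)

# At the top level there is no function argument to capture, and a string
# literal is not a symbol, so enquo() signals an error.
top_level <- tryCatch(enquo("United States"), error = function(e) e)
inherits(top_level, "error")
```

The tryCatch() wrapper is only there so the script keeps running after the error is raised.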
I have a function
from a source that takes a couple of inputs, including a country name, and returns a graph for that country. The first line of the function defines a Country_name
object in a way that I cannot understand. When I pulled that part out of the function and ran it separately, it returned an error, while it works fine inside the function. Does anyone know why this happens, and what the purpose of that line of code for Country_name is?
function(df, dfline, Country_name){
Country_name <- rlang::parse_expr(quo_name(enquo(Country_name)))
df %>%
filter(Country == Country_name ...
}
Pulling out the first line and running it separately returns an error:
parse_expr(quo_name(enquo('United States')))
### Error in `enquo()`:
### ! `arg` must be a symbol
Answer 1
Score: 4
Assume this is your dataset:
library(dplyr)  # provides %>%, filter() and tribble()
df1 <- tribble(~ Country, ~ Value,
'Brazil', 1,
'Brazil', 2,
'Canada', 3,
'Canada', 4)
> df1
# A tibble: 4 × 2
Country Value
<chr> <dbl>
1 Brazil 1
2 Brazil 2
3 Canada 3
4 Canada 4
You could write your custom filter function simply as:
fun1 <- function(df, Country_name){
df %>%
filter(Country == Country_name)
}
> fun1(df1, 'Brazil')
# A tibble: 2 × 2
Country Value
<chr> <dbl>
1 Brazil 1
2 Brazil 2
But imagine you want to be able to omit the quotes around 'Brazil'
and still get the same output. If you made no modification you would get an error:
> fun1(df1, Brazil)
# ...
#! object 'Brazil' not found
# ...
R treats Brazil
as a variable and looks for it in your global environment. It fails to find it, so it returns an error. If Brazil
were a variable, you could get weird results:
Brazil <- 'Canada'
> fun1(df1, Brazil)
# A tibble: 2 × 2
Country Value
<chr> <dbl>
1 Canada 3
2 Canada 4
R is seeing that Brazil
has the value of 'Canada'
, binding that value to Country_name
, and using that value on the filter.
That's not what you wanted. You wanted to get the actual word Brazil
, and not the value it represents. That is what the line you were referring to does. I'll explain how it works below.
The first step is saying to R "I don't want you to evaluate the argument you received, I just want you to save its text". That is, we want to delay the evaluation of the expression that was passed to Country_name
. That can be done in several ways:
- substitute(Country_name)
in base R, as Nir Graham noted;
> substitute returns the parse tree for the (unevaluated) expression ... - substitute's help page.
- enquo(Country_name)
with rlang, as your function did;
> enquo() and enquos() defuse function arguments. A defused expression can be examined, modified, and injected into other expressions. - enquo's help page.
- enexpr(Country_name)
with rlang, also as Nir Graham noted.
> enexpr() and enexprs() are like enquo() and enquos() but return naked expressions instead of quosures. - enexpr's help page.
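The three defusing functions in the list above can be compared side by side. This is a minimal sketch, assuming rlang is installed; the wrapper names with_substitute, with_enexpr and with_enquo are hypothetical.

```r
library(rlang)

# Hypothetical wrappers, one per defusing function.
with_substitute <- function(x) substitute(x)
with_enexpr     <- function(x) enexpr(x)
with_enquo      <- function(x) enquo(x)

e1 <- with_substitute(Brazil)
e2 <- with_enexpr(Brazil)
q  <- with_enquo(Brazil)

is.symbol(e1)        # a bare symbol
identical(e1, e2)    # base R and rlang capture the same expression
is_quosure(q)        # an expression bundled with an environment
quo_get_expr(q)      # the same symbol, stored inside the quosure
quo_get_env(q)       # the environment where Brazil would be looked up
```

None of the calls evaluates Brazil, so no variable of that name needs to exist.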
So they all have very similar effects. The biggest difference is that enquo()
"returns quosures instead of naked expressions". In simple terms, quosures are expressions that also point to the environment where the values of their variables should be found*. We don't need that (but it's also not a problem), as the expression in question won't be evaluated; we just want its text.
After that, we just want to get the text of that defused expression, which can be done with as.character(), deparse1(), rlang::quo_name(), rlang::expr_name(), and others. Thus, the options are similar to what Nir Graham did:
fun2_base <- function(df, Country_name){
Country_name <- deparse1(substitute(Country_name))
df %>%
filter(Country == Country_name)
}
fun2_rlang <- function(df, Country_name){
Country_name <- as.character(enexpr(Country_name))
df %>%
filter(Country == Country_name)
}
fun2_base(df1, Brazil)
fun2_rlang(df1, Brazil)
Both yield:
# A tibble: 2 × 2
Country Value
<chr> <dbl>
1 Brazil 1
2 Brazil 2
Note that we didn't need to remove that Brazil
variable, because it's not being evaluated.
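The point about the Brazil variable, and the text converters listed earlier, can be checked with a short sketch. It assumes rlang is installed and R 4.0 or later (for deparse1()); the helper names name_base and name_rlang are hypothetical.

```r
library(rlang)

Brazil <- "Canada"   # a misleading variable, as in the example above

# Hypothetical helpers that return the argument's text.
name_base  <- function(x) deparse1(substitute(x))
name_rlang <- function(x) as.character(enexpr(x))

name_base(Brazil)    # the text of the argument, not the variable's value
name_rlang(Brazil)

# The other converters, applied to an already defused symbol or quosure.
expr_name(quote(Brazil))
quo_name(quo(Brazil))
```

All four calls should return the string "Brazil", regardless of what the variable Brazil holds.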
*: To learn more, read about tidy evaluation and the metaprogramming chapters of "Advanced R".
Answer 2
Score: 2
First, let's build a minimal reprex:
library(dplyr, warn.conflicts = FALSE)
fun <- function(df, Country_name){
Country_name <- rlang::parse_expr(quo_name(enquo(Country_name)))
df %>%
filter(Country == Country_name)
}
df <- data.frame(x = 1:2, Country = c("Belgium", "Ukraine"))
df
#> x Country
#> 1 1 Belgium
#> 2 2 Ukraine
fun(df, Ukraine)
#> x Country
#> 1 2 Ukraine
Then let's use boomer to print the intermediate outputs:
fun1 <- boomer::rig(fun)
fun1(df, Ukraine)
#> 👇 fun
#> 💣 rlang::parse_expr(quo_name(enquo(Country_name)))
#> · 💣 quo_name(enquo(Country_name))
#> · · 💣 💥 enquo(Country_name)
#> · · <quosure>
#> · · expr: ^Ukraine
#> · · env: global
#> · ·
#> · 💥 quo_name(enquo(Country_name))
#> · [1] "Ukraine"
#> ·
#> 💥 rlang::parse_expr(quo_name(enquo(Country_name)))
#> Ukraine
#>
#> 💣 df %>% filter(Country == Country_name)
#> · 💣 filter(., Country == Country_name)
#> · · df :
#> · · x Country
#> · · 1 1 Belgium
#> · · 2 2 Ukraine
#> · · Country_name :
#> · · Ukraine
#> · · 💣 💥 Country == Country_name
#> · · [1] FALSE TRUE
#> · ·
#> · 💥 filter(., Country == Country_name)
#> · x Country
#> · 1 2 Ukraine
#> ·
#> 💥 df %>% filter(Country == Country_name)
#> x Country
#> 1 2 Ukraine
#>
#> 👆 fun
#> x Country
#> 1 2 Ukraine
We see that:
- enquo() captures the input into a quosure.
- quo_name() extracts the expression as a string.
- parse_expr() builds a symbol from the string.
- This symbol is used in the equality (there it is coerced to character; try quote(a) == "a" to check how this works).
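The last point, the comparison between a symbol and a character vector, can be tried in base R alone. A minimal sketch, where countries and target are illustrative names:

```r
countries <- c("Belgium", "Ukraine")
target    <- quote(Ukraine)   # a symbol, like the result of parse_expr("Ukraine")

is.symbol(target)
target == countries           # the symbol is turned into "Ukraine" before comparing

# Same result as converting the symbol to a string explicitly.
identical(target == countries, as.character(target) == countries)
```

This is why the filter still works even though Country_name ends up as a symbol rather than a string.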
If we want to understand the objects better we might use {constructive} in the print argument. Instead of printing the objects it will print the code to reconstruct them.
# remotes::install_github("cynkra/constructive")
fun2 <- boomer::rig(fun, print = constructive::construct)
fun2(df, Ukraine)
#> 👇 fun
#> 💣 rlang::parse_expr(quo_name(enquo(Country_name)))
#> · 💣 quo_name(enquo(Country_name))
#> · · 💣 💥 enquo(Country_name)
#> · · rlang::new_quosure(quote(Ukraine), .GlobalEnv)
#> · ·
#> · 💥 quo_name(enquo(Country_name))
#> · "Ukraine"
#> ·
#> 💥 rlang::parse_expr(quo_name(enquo(Country_name)))
#> quote(Ukraine)
#>
#> 💣 df %>% filter(Country == Country_name)
#> · 💣 filter(., Country == Country_name)
#> · · df :
#> · · data.frame(x = 1:2, Country = c("Belgium", "Ukraine"))
#> · · Country_name :
#> · · quote(Ukraine)
#> · · 💣 💥 Country == Country_name
#> · · c(FALSE, TRUE)
#> · ·
#> · 💥 filter(., Country == Country_name)
#> · data.frame(x = 2L, Country = "Ukraine")
#> ·
#> 💥 df %>% filter(Country == Country_name)
#> data.frame(x = 2L, Country = "Ukraine")
#>
#> 👆 fun
#> x Country
#> 1 2 Ukraine
boomer::boom(fun(df, Ukraine), print = function(x) print(constructive::construct(x)))
#> 💣 💥 fun(df, Ukraine)
#> data.frame(x = 2L, Country = "Ukraine")
#> x Country
#> 1 2 Ukraine
<sup>Created on 2023-06-02 with reprex v2.0.2</sup>
The bottom line is that the code is bloated, and also weird and unsafe. You shouldn't provide strings as variables just to spare double quotes; how would you provide "United Kingdom"?
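The "United Kingdom" problem can be seen at the parse_expr() step. A minimal sketch, assuming rlang is installed:

```r
library(rlang)

ok <- parse_expr("Ukraine")   # a single valid name parses to a symbol
is.symbol(ok)

# A name with a space is not valid R syntax, so parsing it fails.
res <- tryCatch(parse_expr("United Kingdom"), error = function(e) e)
inherits(res, "error")
```

In practice the failure would already happen earlier, because United Kingdom cannot even be typed as a bare argument.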
The right way to do it is simply to provide Country_name
as a string and have:
fun <- function(df, Country_name){
df %>%
filter(Country == Country_name)
}
Or to be extra safe, in case df
could contain a Country_name
column that would collide with the argument:
fun <- function(df, Country_name){
df %>%
filter(Country == .env$Country_name)
}
or
fun <- function(df, Country_name){
df %>%
filter(Country == !!Country_name)
}
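To see what the safer versions protect against, here is a minimal sketch with a data frame that has its own Country_name column. The data frame df_clash and the function names naive, safe and injected are hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data with a column that has the same name as the argument.
df_clash <- data.frame(
  Country      = c("Belgium", "Ukraine"),
  Country_name = c("Ukraine", "Belgium")
)

naive    <- function(df, Country_name) df %>% filter(Country == Country_name)
safe     <- function(df, Country_name) df %>% filter(Country == .env$Country_name)
injected <- function(df, Country_name) df %>% filter(Country == !!Country_name)

nrow(naive(df_clash, "Ukraine"))     # compares the two columns, ignoring the argument
nrow(safe(df_clash, "Ukraine"))      # uses the function argument
nrow(injected(df_clash, "Ukraine"))  # uses the function argument
```

In the naive version the column masks the argument, so the filter compares Country with the Country_name column row by row.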
Answer 3
Score: 0
It's using three function calls to do what is achievable in two, whether in base R or using rlang.
library(dplyr)
library(rlang)
myfilt_base <- function(x){
mysym <- deparse1(substitute(x))
filter(iris, Species == mysym)
}
myfilt_base(versicolor)
myfilt_rlang <- function(x){
mysym <- as.character(enexpr(x))
filter(iris, Species == mysym)
}
myfilt_rlang(virginica)