Tidy eval for `by` in `dplyr::*_join`


Question

I am writing a function to join two datasets using dplyr::*_join, where the by columns are parameters passed in without quotes. I have seen quite a few solutions to this issue, but all seem to be dated and/or deprecated:

  • Use rlang::quo_text or purrr::map_chr, as in this answer and here; rlang::quo_text is marked as superseded in the docs here

  • Use dplyr::ensym, as in this answer; it is listed as "no longer for normal usage" in the docs here

How would I accomplish this using the new tidy eval methods described here, which emphasize the use of {{}}? It seems like tidy eval has been updated since these older answers were given, which is why I am asking again. Please let me know if I should simply stick with one of these old answers.

Here is a toy example to demonstrate my question:

    library(dplyr)

    data("iris")

    iris2 <- iris %>%
      select(Species) %>%
      filter(Species != "versicolor") %>%
      mutate(id = case_when(Species == "virginica" ~ 1, TRUE ~ 2)) %>%
      distinct()

    ### This is my desired output
    joined_iris <- iris %>%
      inner_join(iris2, by = c("Species" = "Species"))

    ### This is an example function where I attempt to use tidy eval
    # -- it does not work
    join_iris <- function(data1, data2, join_col1, join_col2) {
      data_out <- data1 %>%
        inner_join(
          data2,
          by = c({{ join_col1 }} = {{ join_col2 }})
        )
      data_out
    }

    join_iris(iris, iris2, Species, Species)

Answer 1

Score: 2

Using join_by() from dplyr 1.1.0:

    join_iris <- function(data1, data2, col1, col2) {
      inner_join(
        data1,
        data2,
        by = join_by({{ col1 }} == {{ col2 }})
      )
    }
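
As a quick check, here is a sketch of how the function above could be called against the question's data, assuming dplyr 1.1.0 or later is installed. The `iris3` data frame and its `kind` column are made up for illustration only; they show that the two unquoted arguments may refer to differently named columns.

```r
library(dplyr)

# The function from the answer: {{ }} forwards the unquoted column names
# into join_by(), which builds the join specification.
join_iris <- function(data1, data2, col1, col2) {
  inner_join(
    data1,
    data2,
    by = join_by({{ col1 }} == {{ col2 }})
  )
}

# Lookup table from the question
iris2 <- iris %>%
  select(Species) %>%
  filter(Species != "versicolor") %>%
  mutate(id = case_when(Species == "virginica" ~ 1, TRUE ~ 2)) %>%
  distinct()

# Desired output from the question, for comparison
expected <- inner_join(iris, iris2, by = c("Species" = "Species"))

# Same key name on both sides
result <- join_iris(iris, iris2, Species, Species)
all.equal(result, expected)

# Hypothetical variant: the key has a different name in the second table
iris3 <- rename(iris2, kind = Species)
result2 <- join_iris(iris, iris3, Species, kind)
names(result2)
```

For an equality join, dplyr keeps only the key column from the first table by default, so `result2` carries `Species` rather than `kind`.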

huangapple
  • Published on March 7, 2023, 23:32:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75664007.html