How to use dplyr (1.1.0) across() with a list of functions using shared arguments

Question

##### Quick setup with dummy data and functions
Load dplyr:
`library(dplyr)`

Set up a simple data frame with one column:
`data <- data.frame(a = 1:5)`

Define two functions:
`newfun1 <- function(x, val) {x + val}`
`newfun2 <- function(x, val) {x * val}`

Store the functions as a named list:
`usefuns <- stats::setNames(as.list(c(newfun1, newfun2)), c("fun1", "fun2"))`

[named list of functions](https://i.stack.imgur.com/hifso.png)
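
As a side note, the same named list can probably be built more directly with `list()`. This is only a sketch: `usefuns_alt` is a name introduced here for illustration, and the two functions are repeated so the block runs on its own.

```r
newfun1 <- function(x, val) {x + val}
newfun2 <- function(x, val) {x * val}

# Named list of functions, built directly with list()
usefuns_alt <- list(fun1 = newfun1, fun2 = newfun2)

# The names are what across() later uses as column suffixes
names(usefuns_alt)

# Each element is an ordinary function that still needs `val`
usefuns_alt$fun1(1:3, val = 100)
```

Either construction gives a list whose names (`fun1`, `fun2`) end up in the output column names.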

##### Goal: apply each function in `usefuns` to column `a` of `data`, with the `val` argument set to `100`

With `dplyr < 1.1.0`, this is easy:
`data %>% mutate(across(.col = a, .fns = usefuns, val = 100))`

[results of the code above; data frame with three columns](https://i.stack.imgur.com/aA4as.png)

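However, with `dplyr 1.1.0`, I get this warning:
> The `...` argument of `across()` is deprecated as of dplyr 1.1.0.
> Supply arguments directly to `.fns` through an anonymous function instead.
>
> Previously:
> `across(a:b, mean, na.rm = TRUE)`
>
> Now:
> `across(a:b, \(x) mean(x, na.rm = TRUE))`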
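
To make the change in the warning concrete, here is a minimal sketch on a made-up data frame (`df` and its values are assumptions for illustration only). It uses the `\(x)` lambda shorthand, which needs R 4.1 or later; `function(x)` would work the same way on older versions.

```r
library(dplyr)

# Hypothetical data with missing values
df <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA))

# Deprecated since dplyr 1.1.0: extra arguments forwarded through `...`
# df %>% summarise(across(a:b, mean, na.rm = TRUE))

# Current form: the argument is fixed inside an anonymous function
res <- df %>% summarise(across(a:b, \(x) mean(x, na.rm = TRUE)))
res
```

With a single function this rewrite is trivial; the difficulty in the question comes from having a whole list of functions that all need the same extra argument.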

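I can make it work with `dplyr 1.1.0` using:
`data %>% mutate(across(.col = a, .fns = list(fun1 = ~newfun1(.x, val = 100), fun2 = ~newfun2(.x, val = 100))))`
or even:
`data %>% mutate(across(.col = a, .fns = list(fun1 = ~usefuns$fun1(.x, val = 100), fun2 = ~usefuns$fun2(.x, val = 100))))`

[results of the code above; data frame with three columns](https://i.stack.imgur.com/SkdXR.png)

But I know there must be a simpler way. In my real-world use case, the number of functions in `usefuns` will vary and there are several more arguments, but the arguments passed to each function will always be the same.

I think I'm missing something relatively simple, and I have already spent too much time experimenting. Any pointers are appreciated!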

-----

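As an added note, `val` may take a different value each time it is used.

Set up a simple data frame with three columns:

`data <- data.frame(a = 1:5, b = 6:10, c = 11:15)`

A more complicated application of the functions, using `dplyr < 1.1.0`:
`data %>% mutate(across(.col = c(a, b), .fns = usefuns, val = 100), across(.col = c, .fns = usefuns, val = 200))`

I've tried variations on how the functions are listed, named, stored, and called, and started down the path of using `purrr`, but couldn't get as close to working as the code above. I wonder whether `partial()` could help, but I can't quite figure out how, or whether, it would work.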

Answer 1

Score: 0

You can pass your list of functions to `purrr::map()` (or `lapply()`) and use `purrr::partial()` inside it to supply the value of `val`.

library(purrr)
data %>%
  mutate(across(.col = a,
                .fns = purrr::map(usefuns, 
                                  purrr::partial, 
                                  val = 100)))
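
To see what the `map(usefuns, partial, val = 100)` call hands to `across()`, it may help to build the list on its own and call its elements directly. This is a small sketch: `prefilled` is a name introduced here, and `usefuns` is rebuilt with `list()` so the block is self-contained.

```r
library(purrr)

newfun1 <- function(x, val) {x + val}
newfun2 <- function(x, val) {x * val}
usefuns <- list(fun1 = newfun1, fun2 = newfun2)

# Each element becomes a function with `val` already set to 100,
# so only the column values remain to be supplied
prefilled <- map(usefuns, partial, val = 100)

# map() keeps the names, so across() can still build a_fun1, a_fun2
names(prefilled)

prefilled$fun1(1:3)
prefilled$fun2(1:3)
```

Because every element now takes just the column as input, nothing has to travel through the deprecated `...` of `across()`.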

Or, for the more complex example:

data <- data.frame(a = 1:5, b = 6:10, c = 11:15)
data %>%
  mutate(across(.col = c(a, b), 
                .fns = map(usefuns, partial, val = 100)),
         across(.col = c,
                .fns = map(usefuns, partial, val = 200)))

The result:

  a  b  c a_fun1 a_fun2 b_fun1 b_fun2 c_fun1 c_fun2
1 1  6 11    101    100    106    600    211   2200
2 2  7 12    102    200    107    700    212   2400
3 3  8 13    103    300    108    800    213   2600
4 4  9 14    104    400    109    900    214   2800
5 5 10 15    105    500    110   1000    215   3000
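
If adding purrr is not desirable, the same idea can be sketched in base R with closures. The helper `bind_args()` below is hypothetical (it is not part of dplyr or purrr); it wraps each function so that the shared arguments are supplied on every call. The sketch also spells `.cols` in full, whereas the `.col` used in the question relies on R's partial argument matching.

```r
library(dplyr)

newfun1 <- function(x, val) {x + val}
newfun2 <- function(x, val) {x * val}
usefuns <- list(fun1 = newfun1, fun2 = newfun2)

# Hypothetical helper: returns a list with the same names as `fns`,
# where each element takes only `x` and passes the shared arguments along
bind_args <- function(fns, ...) {
  shared <- list(...)
  lapply(fns, function(f) {
    force(f)
    function(x) do.call(f, c(list(x), shared))
  })
}

data <- data.frame(a = 1:5, b = 6:10, c = 11:15)

out <- data %>%
  mutate(across(.cols = c(a, b), .fns = bind_args(usefuns, val = 100)),
         across(.cols = c,       .fns = bind_args(usefuns, val = 200)))
out
```

Since the shared arguments go through `...` of the helper, several of them can be passed at once, which seems to match the real-world case described in the question where each function takes more than one common argument.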


