How to pass more than one argument to laply?

Question

I would like to run a series of regression models and put the output relating to a particular covariate into an array. I've been able to do this with laply from the plyr package, like this:

library(plyr)  # provides laply

set.seed(1)
df <- data.frame(x1 = rnorm(100, 0, 1),
                 x2 = rnorm(100, 0, 1),
                 y1 = rnorm(100, 0, 1),
                 y2 = rnorm(100, 0, 1),
                 t2 = sample(0:1, 100, replace = TRUE),
                 t1 = sample(0:1, 100, replace = TRUE))

# Fit one model and return the coefficient row for the second term
# (the treatment variable, which comes right after the intercept)
run_regressions <- function(model) {
  model <- glm(model, data = df)
  return(summary(model)$coefficients[2,])
}

models <- list("y1 ~ t1 + x1",
               "y2 ~ t2 + x1 + x2")

results <- laply(models, run_regressions)
results

I would now like to extend this so that I can pass the different elements of each regression model to the function separately. The regression function would be defined like this:

run_regressions2 <- function(outcome, treatment, covariates) {
  model <- glm(paste(outcome, " ~ ", treatment, " + ", covariates), data = df)
  return(summary(model)$coefficients[2,])
}

with the specific models to run specified as something like

models <- list(c("y1", "t1", "x1"),
               c("y2", "t2", "x1 + x2"))

Is there some way to use laply or another function to achieve this?

I've tried searching the plyr documentation for similar examples and searching SO for similar questions, but have not come up with anything.

Answer 1

Score: 2

As mentioned by @jdobres, plyr has long been retired. There are many modern replacements for it, such as the dplyr and purrr packages.

One option is to use lapply in base R. In that case, the first example becomes:

models <- list("y1 ~ t1 + x1", "y2 ~ t2 + x1 + x2")
results <- lapply(models, run_regressions)
results

#[[1]]
#   Estimate  Std. Error     t value    Pr(>|t|) 
#-0.42628388  0.20517578 -2.07765211  0.04038316 

#[[2]]
#  Estimate Std. Error    t value   Pr(>|t|) 
#-0.2966993  0.1991987 -1.4894639  0.1396433 

In the second case, you can continue using lapply:

models <- list(c("y1", "t1", "x1"), c("y2", "t2", "x1 + x2"))

lapply(seq_along(models), \(x) 
      run_regressions2(models[[x]][1], models[[x]][2], models[[x]][3]))

Or, more generally, use do.call to handle any number of arguments:

lapply(seq_along(models), \(x) do.call(run_regressions2, as.list(models[[x]])))
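Since the question asked for the output in an array, as laply returned, here is a minimal base R sketch of one way to get there: iterate over the list elements directly and bind the per-model vectors into a matrix. The names res_list and results, and the choice to label rows by outcome, are illustrative assumptions rather than part of the original answer.

```r
set.seed(1)
df <- data.frame(x1 = rnorm(100, 0, 1),
                 x2 = rnorm(100, 0, 1),
                 y1 = rnorm(100, 0, 1),
                 y2 = rnorm(100, 0, 1),
                 t2 = sample(0:1, 100, replace = TRUE),
                 t1 = sample(0:1, 100, replace = TRUE))

# Same function as in the question
run_regressions2 <- function(outcome, treatment, covariates) {
  model <- glm(paste(outcome, " ~ ", treatment, " + ", covariates), data = df)
  summary(model)$coefficients[2, ]
}

models <- list(c("y1", "t1", "x1"),
               c("y2", "t2", "x1 + x2"))

# Each element m is a character vector; as.list(m) turns it into the
# positional argument list that do.call hands to run_regressions2
res_list <- lapply(models, function(m) do.call(run_regressions2, as.list(m)))

# Stack the named coefficient vectors into a matrix, one row per model
results <- do.call(rbind, res_list)

# Assumption for readability: label each row with its outcome variable
rownames(results) <- vapply(models, `[`, character(1), 1)
results
```

Iterating over models itself, rather than over seq_along(models), avoids indexing into the list inside the anonymous function.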

Answer 2

Score: 1

An alternative to @Ronak Shah's final solution without having to use models twice would be:

lapply(lapply(models, as.list), do.call, what=run_regressions2)

The inner lapply converts models into a list of lists, and the outer lapply passes each of those lists to do.call as its args argument.
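To make the mechanism easier to see, here is a small sketch that swaps the regression for a hypothetical helper, describe, which only builds the formula string. Because what is supplied by name, lapply passes each list element to do.call in its next free position, which is args.

```r
# Hypothetical stand-in for run_regressions2: returns the formula text only
describe <- function(outcome, treatment, covariates) {
  paste(outcome, "~", treatment, "+", covariates)
}

models <- list(c("y1", "t1", "x1"),
               c("y2", "t2", "x1 + x2"))

# Inner step: each character vector becomes a list of three strings
arg_lists <- lapply(models, as.list)
str(arg_lists[[1]])

# Outer step: equivalent to do.call(what = describe, args = arg_lists[[i]])
formulas <- lapply(arg_lists, do.call, what = describe)
formulas[[1]]  # "y1 ~ t1 + x1"
formulas[[2]]  # "y2 ~ t2 + x1 + x2"
```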

huangapple
  • Posted on July 20, 2023, 19:21:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76729332.html