Why does the update() method not work when the estimation is wrapped in a function?

Question

Apparently the update() method cannot retrieve the dataset the estimation was based on if I wrap the estimation function in another function. Is there any way around this, e.g., by specifying an environment?

library(fixest)
data(trade)

# fit model directly and wrapped into function
mod1 <- fepois(Euros ~ log(dist_km) | Origin + Destination, trade)

fit_model <- function(df) {
  fepois(Euros ~ log(dist_km) | Origin + Destination, data = df)
}

mod2 <- fit_model(trade)

# try to update
update(mod1, . ~ . + log(Year))
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15
#> Standard-errors: Clustered (Origin)
#>              Estimate Std. Error  t value  Pr(>|t|)
#> log(dist_km) -1.51756   0.113171 -13.4095 < 2.2e-16 ***
#> log(Year)    72.36888   6.899699  10.4887 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -1.212e+12   Adj. Pseudo R2: 0.592897
#>            BIC:  2.424e+12     Squared Cor.: 0.384441
update(mod2, . ~ . + log(Year))
#> Error in fepois(fml = Euros ~ log(dist_km) + log(Year) | Origin + Destination, : Argument 'data' must be either: i) a matrix, or ii) a data.frame.
#> Problem: it is not a matrix nor a data.frame (instead it is a function).

Created on 2023-02-26 with reprex v2.0.2

Also posted as a GitHub issue.

Update: The solution seems to be forcing an early evaluation of the expression that refers to the dataset. Another way is to specify the dataset again within update():

update(mod2, . ~ . + log(Year), data = trade)
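
To see this second workaround in isolation, here is a minimal sketch that uses base R's lm() and the built-in mtcars data as stand-ins for fepois() and trade. Both substitutions are assumptions made so the example runs without fixest, and the wrapper name fit_lm is hypothetical. Base R's update() re-evaluates the stored call from the caller's frame, which appears to be what the fixest error above reflects as well.

```r
# Base-R stand-in: lm() and mtcars replace fepois() and trade (assumption).
fit_lm <- function(df) {
  lm(mpg ~ wt, data = df)
}

m <- fit_lm(mtcars)

# Re-evaluating the stored call fails: out here, `df` is not a data frame.
plain <- try(update(m, . ~ . + hp), silent = TRUE)
failed <- inherits(plain, "try-error")

# Supplying `data` again overrides the stale `data = df` in the stored call.
fixed <- update(m, . ~ . + hp, data = mtcars)

print(failed)
print(names(coef(fixed)))
```

The drawback of this route is that every later update() call has to repeat the data argument, which is what the answers below try to avoid.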

Answer 1

Score: 2

If you want to pass an arbitrary data frame into the function rather than hard-coding trade, the data argument has to be evaluated early, before fepois() is called. We can do this with eval(bquote()), wrapping the data argument (mydat below) in .(). To capture the object name cleanly, we can additionally wrap the data argument in substitute() before the early evaluation (thanks to @jay.sf for the comment).

Update: I have added an env argument, which needs to be set to parent.frame() when the function is used inside purrr::map() and similar functions.

library(fixest)
library(tidyverse)
data(trade)

fit_model <- function(mydat, env = environment()) {
  eval(bquote(fepois(Euros ~ log(dist_km) | Origin + Destination, data = .(substitute(mydat, env = env)))))
}

mod2 <- fit_model(trade)

update(mod2, . ~ . + log(Year))
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15
#> Standard-errors: Clustered (Origin)
#>              Estimate Std. Error  t value  Pr(>|t|)
#> log(dist_km) -1.51756   0.113171 -13.4095 < 2.2e-16 ***
#> log(Year)    72.36888   6.899699  10.4887 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -1.212e+12   Adj. Pseudo R2: 0.592897
#>            BIC:  2.424e+12     Squared Cor.: 0.384441

mod2$call
#> fepois(fml = Euros ~ log(dist_km) | Origin + Destination, data = trade)

res <- trade |>
  nest(.by = Year) |>
  mutate(fit = map(data, \(x) fit_model(x, parent.frame())))

res$fit[[1]]
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 3,793
#> Fixed-effects: Origin: 15,  Destination: 15
#> Standard-errors: Clustered (Origin)
#>              Estimate Std. Error  t value  Pr(>|t|)
#> log(dist_km) -1.48073   0.114878 -12.8896 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -1.082e+11   Adj. Pseudo R2: 0.573982
#>            BIC:  2.164e+11     Squared Cor.: 0.352497

res$fit[[1]]$call
#> fepois(fml = Euros ~ log(dist_km) | Origin + Destination, data = mydat)

Created on 2023-03-07 by the reprex package (v2.0.1)
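
The substitute()/bquote() step can be looked at on its own, without fitting anything. The sketch below is a base-R illustration, assuming lm() and mtcars as stand-ins for fepois() and trade; the function names show_call and fit_lm2 are hypothetical. It first builds the call without evaluating it, then evaluates it and checks that update() works without re-supplying the data.

```r
# Build (but do not evaluate) the call, to inspect what bquote() produces.
show_call <- function(mydat) {
  # substitute() recovers the expression the caller typed for `mydat`;
  # .() splices that expression into the quoted call.
  bquote(lm(mpg ~ wt, data = .(substitute(mydat))))
}

built <- show_call(mtcars)
print(built)

# Same construction, now evaluated: the stored call names the real object.
fit_lm2 <- function(mydat) {
  eval(bquote(lm(mpg ~ wt, data = .(substitute(mydat)))))
}

m2 <- fit_lm2(mtcars)
m2_updated <- update(m2, . ~ . + hp)

print(m2$call)
print(names(coef(m2_updated)))
```

Because the stored call now refers to the caller's object by name, this only helps when that name is still visible wherever update() is later called, which is consistent with the data = mydat call seen in the purrr::map() case above.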

Answer 2

Score: 1

The problem is that the stored call looks like this:

mod2$call
# fepois(fml = Euros ~ log(dist_km) | Origin + Destination, data = df)

whereas it should contain data = trade.
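
This also explains why the error message says the data argument "is a function". A small base-R sketch, assuming lm() and mtcars in place of fepois() and trade (the name fit_lm is hypothetical): the stored call keeps the symbol df, and outside the wrapper that symbol resolves to the F-distribution density function df() from the stats package, provided no object called df exists in the workspace.

```r
# Base-R stand-in: lm() and mtcars replace fepois() and trade (assumption).
fit_lm <- function(df) {
  lm(mpg ~ wt, data = df)
}

m <- fit_lm(mtcars)

# The stored call keeps the argument name used inside the wrapper.
stored_data_arg <- m$call$data
print(stored_data_arg)

# Looked up from the global environment, `df` is found in package stats.
resolved <- eval(stored_data_arg, envir = globalenv())
print(is.function(resolved))
```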

You could use an eval-parse approach. It is a little hacky, but it works.

fit_model2 <- function(df) {
  eval(parse(text=sprintf('fepois(Euros ~ log(dist_km) | Origin + Destination, data = %s)', 
                          deparse(substitute(df)))))
}

mod2a <- fit_model2(trade)
mod2a$call
# fepois(fml = Euros ~ log(dist_km) | Origin + Destination, data = trade)

update(mod2a, . ~ . + log(Year))
# Poisson estimation, Dep. Var.: Euros
# Observations: 38,325
# Fixed-effects: Origin: 15,  Destination: 15
# Standard-errors: Clustered (Origin)
# Estimate Std. Error  t value  Pr(>|t|)
# log(dist_km) -1.51756   0.113171 -13.4095 < 2.2e-16 ***
# log(Year)    72.36888   6.899699  10.4887 < 2.2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Log-Likelihood: -1.212e+12   Adj. Pseudo R2: 0.592897
#            BIC:  2.424e+12     Squared Cor.: 0.384441

Answer 3

Score: 0

Try replacing df with trade in the fit_model function, since fepois doesn't recognize the df data as written:

fit_model <- function(trade) {
  fepois(Euros ~ log(dist_km) | Origin + Destination, data = trade)
}

mod2 <- fit_model(trade)

update(mod2, . ~ . + log(Year))

Poisson estimation, Dep. Var.: Euros
Observations: 38,325
Fixed-effects: Origin: 15,  Destination: 15
Standard-errors: Clustered (Origin)
             Estimate Std. Error  t value  Pr(>|t|)
log(dist_km) -1.51756   0.113171 -13.4095 < 2.2e-16 ***
log(Year)    72.36888   6.899699  10.4887 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Log-Likelihood: -1.212e+12   Adj. Pseudo R2: 0.592897
           BIC:  2.424e+12     Squared Cor.: 0.384441
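
One caveat with renaming the parameter is worth checking: the stored call then says data = trade, so a later update() picks up whatever object of that name is visible to the caller, not necessarily the data that was passed in. The sketch below illustrates this with base R's lm() and mtcars as stand-ins (an assumption; whether fixest behaves identically is not verified here, and fit_lm3 is a hypothetical name).

```r
# The parameter is deliberately named like the global data set.
fit_lm3 <- function(mtcars) {
  lm(mpg ~ wt, data = mtcars)
}

# Fit on a 10-row subset only.
subset_fit <- fit_lm3(mtcars[1:10, ])
n_fit <- nobs(subset_fit)

# update() re-evaluates `data = mtcars` in the caller's frame, so the
# refit uses the full built-in data set rather than the subset.
refit <- update(subset_fit, . ~ . + hp)
n_refit <- nobs(refit)

print(c(original = n_fit, refit = n_refit))
```

If the wrapper is only ever called with the global object of the same name, the renaming works, but it can silently refit on different data otherwise.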

huangapple
  • Posted on 2023-03-03 18:32:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75625930.html