April 20, 2023 02:01:37

Using lm_robust with texreg: how to show only the number of observations

# Question


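I am running four regressions with different specifications, each of which includes `x` as a regressor. I am using Stata-type standard errors (`se_type = "stata"`) and clustering from the `estimatr` package.
I then use `texreg` to get a nicely formatted regression table. Here is my reproducible example: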
```R
library(estimatr)
library(texreg)
# Load data
data <- data.frame(
  x = rnorm(100),
  z = rnorm(100),
  b = rbinom(100, 1, 0.5),
  n = rpois(100, 10)
)
data$y <- 0.5 + 2 * data$x + 0.1 * data$z + 0.5 * data$b - 0.1 * (data$x - 0.5) ^ 2 + rnorm(100)
# Define models
model1 <- lm_robust(y ~ x, data = data, clusters = data$b, se_type = "stata")
model2 <- lm_robust(y ~ x + z, data = data, clusters = data$b, se_type = "stata")
model3 <- lm_robust(y ~ x + b, data = data, clusters = data$z, se_type = "stata")
model4 <- lm_robust(y ~ x + z + b, data = data, clusters = data$n, se_type = "stata")
# Create the table with texreg
texreg(
  list(model1, model2, model3, model4), table = FALSE,
  custom.model.names = c("Model 1", "Model 2", "Model 3", "Model 4"),
  custom.coef.map =  list("x" = "only coeff"),
  include.ci = FALSE, include.adjr = FALSE, include.rsquared = FALSE, inlcude.rmse = FALSE,
  include.nobs = TRUE,
  digits = 3
)
```

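As you can see from the `include.*` arguments, the only statistic I am including is the number of observations (`include.nobs`). That is all I want. But the code seems to ignore those arguments, because this is my result: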

\begin{tabular}{l c c c c}
\hline
 & Model 1 & Model 2 & Model 3 & Model 4 \\
\hline
only coeff & $2.394^{*}$ & $2.397^{*}$ & $2.379^{***}$ & $2.382^{***}$ \\
           & $(0.095)$   & $(0.092)$   & $(0.100)$     & $(0.083)$     \\
\hline
R$^2$      & $0.840$     & $0.843$     & $0.855$       & $0.861$       \\
Adj. R$^2$ & $0.838$     & $0.840$     & $0.852$       & $0.857$       \\
Statistic  & $634.115$   & $$          & $282.093$     & $440.925$     \\
P Value    & $0.025$     & $$          & $0.000$       & $0.000$       \\
DF Resid.  & $1.000$     & $1.000$     & $99.000$      & $14.000$      \\
nobs       & $100$       & $100$       & $100$         & $100$         \\
\hline
\multicolumn{5}{l}{\scriptsize{$^{***}p<0.001$; $^{**}p<0.01$; $^{*}p<0.05$}}
\end{tabular}

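I think it has to do with the `lm_robust` function (because it works with plain `lm`), but I really need `lm_robust` because the standard errors have to be identical to those obtained in Stata.

Furthermore, I get these warnings:

Warning messages:
1: In data.frame(gof.names = colnames(out), gof = as.numeric(out), :
  NAs introduced by coercion
2: In doTryCatch(return(expr), name, parentenv, handler) :
  texreg used the broom package to extract the following GOF measures, but could not cast them to numeric type: se_type
3: In data.frame(gof.names = colnames(out), gof = as.numeric(out), :
  NAs introduced by coercion
4: In doTryCatch(return(expr), name, parentenv, handler) :
  texreg used the broom package to extract the following GOF measures, but could not cast them to numeric type: Statistictexreg used the broom package to extract the following GOF measures, but could not cast them to numeric type: P Valuetexreg used the broom package to extract the following GOF measures, but could not cast them to numeric type: se_type
5: In data.frame(gof.names = colnames(out), gof = as.numeric(out), :
  NAs introduced by coercion
6: In doTryCatch(return(expr), name, parentenv, handler) :
  texreg used the broom package to extract the following GOF measures, but could not cast them to numeric type: se_type
7: In data.frame(gof.names = colnames(out), gof = as.numeric(out), :
  NAs introduced by coercion
8: In doTryCatch(return(expr), name, parentenv, handler) :
  texreg used the broom package to extract the following GOF measures, but could not cast them to numeric type: se_type

My desired result would be: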

\begin{tabular}{l c c c c}
\hline
 & Model 1 & Model 2 & Model 3 & Model 4 \\
\hline
only coeff & $2.394^{*}$ & $2.397^{*}$ & $2.379^{***}$ & $2.382^{***}$ \\
           & $(0.095)$   & $(0.092)$   & $(0.100)$     & $(0.083)$     \\
\hline
nobs       & $100$       & $100$       & $100$         & $100$         \\
\hline
\multicolumn{5}{l}{\scriptsize{$^{***}p<0.001$; $^{**}p<0.01$; $^{*}p<0.05$}}
\end{tabular}

How can I accomplish this? Thanks in advance!


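# Answer 1
**Score**: 1
`stargazer` is my go-to package for this kind of standard error adjustment. Here is example code that produces the table you want, including Stata-style clustered SEs: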
```R
library(lmtest)
library(sandwich)
library(stargazer)
set.seed(44)
# Load data
data <- data.frame(
  x = rnorm(100),
  z = rnorm(100),
  b = rbinom(100, 1, 0.5),
  n = rpois(100, 10)
)
data$y <- 0.5 + 2 * data$x + 0.1 * data$z + 0.5 * data$b - 0.1 * (data$x - 0.5) ^ 2 + rnorm(100)
# Define models
model1 <- lm(y ~ x,         data = data)
model2 <- lm(y ~ x + z,     data = data)
model3 <- lm(y ~ x + b,     data = data)
model4 <- lm(y ~ x + z + b, data = data)
# Make clustered SEs (same clustering variables as in the question)
se1 <- as.vector(coeftest(model1, vcov = vcovCL, cluster = ~b, type = "HC1")[, "Std. Error"])
se2 <- as.vector(coeftest(model2, vcov = vcovCL, cluster = ~b, type = "HC1")[, "Std. Error"])
se3 <- as.vector(coeftest(model3, vcov = vcovCL, cluster = ~z, type = "HC1")[, "Std. Error"])
se4 <- as.vector(coeftest(model4, vcov = vcovCL, cluster = ~n, type = "HC1")[, "Std. Error"])
# Pass the clustered SEs to stargazer and keep only x and the observation count
stargazer(model1, model2, model3, model4, type = "latex",
          se = list(se1, se2, se3, se4), omit = c("Constant", "z", "b"),
          omit.stat = c("f", "ser", "adj.rsq", "rsq"))
```

If you use `type = "text"` in the `stargazer` command, you will see this table:

================================================
                     Dependent variable:        
             -----------------------------------
                              y                 
               (1)      (2)      (3)      (4)   
------------------------------------------------
x            2.071*** 2.079*** 2.123*** 2.135***
             (0.103)  (0.124)  (0.149)  (0.111) 
                                                
------------------------------------------------
Observations   100      100      100      100   
================================================
Note:                *p<0.1; **p<0.05; ***p<0.01
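
One way to check whether the `sandwich` route gives the same numbers as `lm_robust(..., se_type = "stata")` is to compute both on the same data and compare them. The following is only a sketch under assumptions and is not part of the original answer: the data frame name `dat`, the object names, and the simplified data-generating line are made up for illustration.

```R
library(estimatr)
library(sandwich)

set.seed(44)
# Simulated data with the same columns as above (`dat` is an illustrative name)
dat <- data.frame(
  x = rnorm(100),
  z = rnorm(100),
  b = rbinom(100, 1, 0.5),
  n = rpois(100, 10)
)
dat$y <- 0.5 + 2 * dat$x + 0.1 * dat$z + 0.5 * dat$b + rnorm(100)

# Same specification, two routes to cluster-robust standard errors
fit_lm     <- lm(y ~ x + z + b, data = dat)
fit_robust <- lm_robust(y ~ x + z + b, data = dat, clusters = n, se_type = "stata")

# sandwich: clustered covariance with the HC1-type small-sample correction
se_sandwich <- sqrt(diag(vcovCL(fit_lm, cluster = ~n, type = "HC1")))
# estimatr: standard errors stored on the fitted object
se_estimatr <- fit_robust$std.error

# Side-by-side comparison; the diff column shows any discrepancy
se_compare <- cbind(sandwich = se_sandwich,
                    estimatr = se_estimatr,
                    diff     = se_sandwich - se_estimatr)
print(round(se_compare, 6))
```

If the `diff` column is not essentially zero for your own models, the two approaches are applying different small-sample corrections and the table will not match Stata.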


# Answer 2
**Score**: 1


I figured out that it was a typo in my `texreg` call (facepalm): I had written `inlcude.rmse` instead of `include.rmse`. This is the corrected call (I have switched to `screenreg` just to make the answer clearer):

screenreg(
  list(model1, model2, model3, model4), table = FALSE,
  custom.model.names = c("Model 1", "Model 2", "Model 3", "Model 4"),
  custom.coef.map =  list("x" = "only coeff"),
  include.ci = FALSE, include.adjr = FALSE, include.rsquared = FALSE, include.rmse = FALSE,
  include.nobs = TRUE,
  digits = 3
)

This is what R displays in the console.

============================================================
            Model 1     Model 2     Model 3      Model 4    
------------------------------------------------------------
only coeff    2.118 **    2.132 **    2.151 ***    2.164 ***
             (0.023)     (0.006)     (0.101)      (0.148)   
------------------------------------------------------------
Num. obs.   100         100         100          100        
N Clusters    2           2         100           15        
============================================================
*** p < 0.001; ** p < 0.01; * p < 0.05

You can drop the `N Clusters` row by setting `include.nclusts = FALSE`.
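
For completeness, here is a minimal sketch of a call that keeps only the coefficient on `x` and the number of observations. It assumes the `include.*` argument names documented for the `lm_robust` extract method (`include.adjrs`, `include.nclusts`, and so on); the data frame `dat` and the model names `m1` and `m2` are illustrative, and only two models are fitted to keep the example short.

```R
library(estimatr)
library(texreg)

set.seed(44)
# Simulated data with the same columns as in the question (`dat` is an illustrative name)
dat <- data.frame(
  x = rnorm(100),
  z = rnorm(100),
  b = rbinom(100, 1, 0.5),
  n = rpois(100, 10)
)
dat$y <- 0.5 + 2 * dat$x + 0.1 * dat$z + 0.5 * dat$b + rnorm(100)

# Two clustered models with Stata-type standard errors
m1 <- lm_robust(y ~ x,         data = dat, clusters = b, se_type = "stata")
m2 <- lm_robust(y ~ x + z + b, data = dat, clusters = n, se_type = "stata")

# Switch off every goodness-of-fit row except the number of observations
tab <- screenreg(
  list(m1, m2),
  custom.model.names = c("Model 1", "Model 2"),
  custom.coef.map = list("x" = "only coeff"),
  include.ci = FALSE, include.rsquared = FALSE, include.adjrs = FALSE,
  include.rmse = FALSE, include.nobs = TRUE, include.nclusts = FALSE,
  digits = 3
)
print(tab)
```

Swapping `screenreg` for `texreg` (and adding `table = FALSE` if you only want the `tabular` environment) should give the LaTeX version of the same table.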
