January 6, 2023, 12:59:21

Diff in Diff with panel dataset on R

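Question

I have a panel dataset on which I'd like to run a difference-in-differences analysis. Right now this is my regression:

fit3 <- glm(df$empstat ~ factor(year) + factor(stateicp) + migrant_category + treated*post + treated*migrant_category
           + post*migrant_category + treated*post*migrant_category + race + educ + age +
             marst, data = df, weights = perwt, family = 'gaussian'
)

Will this make R assume that the observations are independent of each other? If so, what should I do to make R recognize that this is panel data?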

Answer 1

Score: 0

If you are interested in fixed effects models and difference-in-differences, use the `plm` package. Here is an example from Christopher Zorn:

# Load the packages used below
library(readr)      # read_csv()
library(dplyr)      # %>%, group_by(), filter()
library(plm)        # pdata.frame(), plm()
library(stargazer)  # stargazer()

# Panel data
WDI <- read_csv("https://github.com/PrisonRodeo/GSERM-Ljubljana-APD-git/raw/main/Data/WDI3.csv")
# Add a "Cold War" variable:
WDI$ColdWar <- with(WDI, ifelse(Year < 1990, 1, 0))
# Keep a numeric year variable (for -panelAR-):
WDI$YearNumeric <- WDI$Year
# Make the data a panel data frame, indexed by country and year:
WDI <- pdata.frame(WDI, index = c("ISO3", "Year"))
# Pull out *only* those countries that, at some point during the
# observed periods, instituted a paid parental leave policy:
WDI <- WDI %>%
  group_by(ISO3) %>%
  filter(any(PaidParentalLeave == 1))
# Create a better trend variable:
WDI$Time <- WDI$YearNumeric - 1950
# FE models: fe1 and fe2 use unit (country) fixed effects,
# fe3 and fe4 use unit and time fixed effects.
fe1 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time, data = WDI,
           effect = "individual", model = "within")
fe2 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time + log(GDPPerCapita) +
           log(NetAidReceived) + GovtExpenditures,
           data = WDI, effect = "individual", model = "within")
fe3 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time, data = WDI,
           effect = "twoways", model = "within")
fe4 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time + log(GDPPerCapita) +
           log(NetAidReceived) + GovtExpenditures,
           data = WDI, effect = "twoways", model = "within")
# Table time
stargazer(fe1, fe2, fe3, fe4,
          title = "DiD Models of log(Child Mortality)",
          column.separate = c(1, 1, 1), align = TRUE,
          dep.var.labels.include = FALSE,
          dep.var.caption = "",
          covariate.labels = c("Paid Parental Leave", "Time (1950=0)",
                               "Paid Parental Leave x Time",
                               "ln(GDP Per Capita)",
                               "ln(Net Aid Received)",
                               "Government Expenditures"),
          header = FALSE, model.names = FALSE,
          model.numbers = FALSE, multicolumn = FALSE,
          object.names = TRUE, notes.label = "",
          column.sep.width = "-15pt",
          omit.stat = c("f", "ser"), type = "text")

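DiD Models of log(Child Mortality)
=====================================================================
                              fe1        fe2        fe3        fe4
---------------------------------------------------------------------
Paid Parental Leave        -15.500*** -26.200*** -12.500*** -17.300*
                            (2.420)    (7.220)    (2.960)    (9.360)

Time (1950=0)              -0.838***  -1.480***
                            (0.025)    (0.094)

Paid Parental Leave x Time            -7.110***              -4.910*
                                       (2.290)               (2.600)

ln(GDP Per Capita)                    -1.780***             -3.020***
                                       (0.471)               (0.552)

ln(Net Aid Received)                   0.873***             0.842***
                                       (0.139)               (0.146)

Government Expenditures     0.310***   0.524***   0.247***   0.319*
                            (0.044)    (0.128)    (0.056)    (0.169)

---------------------------------------------------------------------
Observations                 2,360       622       2,360       622
R2                           0.496      0.717      0.009      0.143
Adjusted R2                  0.485      0.701      -0.035     0.014
=====================================================================
                                          *p<0.1; **p<0.05; ***p<0.01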
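
On the independence part of the question: the default standard errors reported for `glm` and `plm` fits do not allow for correlation among repeated observations of the same unit. One common remedy is to cluster the standard errors by unit. The following is a minimal sketch on simulated data, not part of the example above. The data frame `panel` and its columns (`unit`, `year`, `treated`, `post`, `y`) are hypothetical, and the sketch assumes the `plm` and `lmtest` packages are installed.

```r
# Hypothetical simulated panel: 50 units observed over 6 years.
# All object and column names here are illustrative assumptions.
library(plm)     # plm(), vcovHC() method for plm objects
library(lmtest)  # coeftest()

set.seed(42)
n_units <- 50
panel <- expand.grid(unit = seq_len(n_units), year = 2001:2006)
panel$treated <- as.integer(panel$unit <= 25)    # first 25 units are treated
panel$post    <- as.integer(panel$year >= 2004)  # policy starts in 2004
unit_effect   <- rnorm(n_units)[panel$unit]      # time-invariant unit heterogeneity
panel$y <- 1 + unit_effect + 0.2 * (panel$year - 2001) +
  2 * panel$treated * panel$post + rnorm(nrow(panel))

# Two-way fixed effects DiD. `index` tells plm which columns identify
# the unit and the time period. The treated and post main effects are
# absorbed by the unit and time effects, so only the interaction enters.
did <- plm(y ~ I(treated * post), data = panel,
           index = c("unit", "year"),
           model = "within", effect = "twoways")

# Covariance matrix clustered by unit (the "group" dimension), which
# allows correlation among observations of the same unit over time.
vc_cluster <- vcovHC(did, method = "arellano", type = "HC1", cluster = "group")

# Coefficient table using the clustered standard errors
print(coeftest(did, vcov. = vc_cluster))
```

In the question's setting, the `index` argument would name whatever columns identify the unit and the period in `df`; which level to cluster on (individual, state, or something else) is a modelling decision this sketch does not settle.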


