Diff in Diff with panel dataset on R


Question

I have a panel dataset that I'd like to run a difference-in-differences (DiD) analysis on. Right now this is my regression:

fit3 <- glm(df$empstat ~ factor(year) + factor(stateicp) + migrant_category + treated*post + treated*migrant_category
           + post*migrant_category + treated*post*migrant_category + race + educ + age +
             marst, data = df, weights = perwt, family = 'gaussian'
)

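Will this make R assume that the observations are independent of one another? If so, what should I do to make R recognize that this is a panel dataset?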

Answer 1

Score: 0

If you are interested in fixed-effects models and difference-in-differences, use the `plm` package. Here is an example from Christopher Zorn:

# Packages used below
library(readr)      # read_csv()
library(dplyr)      # %>%, group_by(), filter()
library(plm)        # pdata.frame(), plm()
library(stargazer)  # regression tables

# Panel data
WDI <- read_csv("https://github.com/PrisonRodeo/GSERM-Ljubljana-APD-git/raw/main/Data/WDI3.csv")

# Add a "Cold War" variable:
WDI$ColdWar <- with(WDI, ifelse(Year < 1990, 1, 0))

# Keep a numeric year variable (for -panelAR-):
WDI$YearNumeric <- WDI$Year

# Make the data a panel data frame:
WDI <- pdata.frame(WDI, index = c("ISO3", "Year"))

# Pull out *only* those countries that, at some point during the
# observed periods, instituted a paid parental leave policy:
WDI <- WDI %>%
  group_by(ISO3) %>%
  filter(any(PaidParentalLeave == 1))

# Create a better trend variable:
WDI$Time <- WDI$YearNumeric - 1950

# FE models...

# fe1, fe2: country (individual) fixed effects only
fe1 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time, data = WDI,
           effect = "individual", model = "within")

fe2 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time + log(GDPPerCapita) +
           log(NetAidReceived) + GovtExpenditures,
           data = WDI, effect = "individual", model = "within")

# fe3, fe4: country and year (two-way) fixed effects
fe3 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time, data = WDI,
           effect = "twoways", model = "within")

fe4 <- plm(ChildMortality ~ PaidParentalLeave + Time +
           PaidParentalLeave * Time + log(GDPPerCapita) +
           log(NetAidReceived) + GovtExpenditures,
           data = WDI, effect = "twoways", model = "within")

# Table time
# R orders model terms by degree, so the interaction comes last among
# the coefficients; the labels below follow that order. The dependent
# variable enters the models unlogged, so the title says so.

stargazer(fe1, fe2, fe3, fe4,
          title = "DiD Models of Child Mortality",
          column.separate = c(1, 1, 1), align = TRUE,
          dep.var.labels.include = FALSE,
          dep.var.caption = "",
          covariate.labels = c("Paid Parental Leave", "Time (1950=0)",
                               "ln(GDP Per Capita)",
                               "ln(Net Aid Received)",
                               "Government Expenditures",
                               "Paid Parental Leave x Time"),
          header = FALSE, model.names = FALSE,
          model.numbers = FALSE, multicolumn = FALSE,
          object.names = TRUE, notes.label = "",
          column.sep.width = "-15pt",
          omit.stat = c("f", "ser"), type = "text")

DiD Models of Child Mortality
=====================================================================
                              fe1        fe2        fe3        fe4
---------------------------------------------------------------------
Paid Parental Leave        -15.500*** -26.200*** -12.500*** -17.300*
                            (2.420)    (7.220)    (2.960)    (9.360)
Time (1950=0)              -0.838***  -1.480***
                            (0.025)    (0.094)
ln(GDP Per Capita)                    -7.110***              -4.910*
                                       (2.290)               (2.600)
ln(Net Aid Received)                  -1.780***             -3.020***
                                       (0.471)               (0.552)
Government Expenditures                0.873***             0.842***
                                       (0.139)               (0.146)
Paid Parental Leave x Time  0.310***   0.524***   0.247***   0.319*
                            (0.044)    (0.128)    (0.056)    (0.169)
---------------------------------------------------------------------
Observations                 2,360       622       2,360       622
R2                           0.496      0.717      0.009      0.143
Adjusted R2                  0.485      0.701      -0.035     0.014
=====================================================================
                                          *p<0.1; **p<0.05; ***p<0.01
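
On the independence part of the question: the default standard errors that `plm` reports still treat the within-transformed errors as independent, so a common follow-up step is to cluster them by panel unit. The sketch below is only an illustration on simulated data. The data frame `sim`, its columns, and the true effect of 2 are all made up for the example and have nothing to do with the WDI data above.

```r
library(plm)
library(lmtest)

set.seed(42)

# Simulated balanced panel (hypothetical): 40 units observed over 10 years.
n_units <- 40
n_years <- 10
sim <- expand.grid(id = 1:n_units, year = 2001:2010)

sim$treated <- as.integer(sim$id <= 20)      # first 20 units are treated
sim$post    <- as.integer(sim$year >= 2006)  # policy starts in 2006
sim$did     <- sim$treated * sim$post        # DiD indicator

# Outcome = true effect of 2 + unit effect + year effect + noise
unit_fe <- rnorm(n_units)[sim$id]
year_fe <- rnorm(n_years)[sim$year - 2000]
sim$y   <- 2 * sim$did + unit_fe + year_fe + rnorm(nrow(sim))

# Declare the panel structure: unit index first, time index second
psim <- pdata.frame(sim, index = c("id", "year"))

# Two-way fixed effects absorb the treated and post main effects
fe_did <- plm(y ~ did, data = psim, effect = "twoways", model = "within")

# Cluster-robust (Arellano) covariance matrix, clustered by unit
vc <- vcovHC(fe_did, method = "arellano", type = "HC1", cluster = "group")
did_test <- coeftest(fe_did, vcov. = vc)
print(did_test)
```

Clustering by unit (`cluster = "group"`) allows the errors of the same unit to be correlated over time, which is the usual concern in a panel. Whether that is the right level of clustering for a given study is a design decision the sketch does not settle.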

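
If the data cannot be indexed by a unique unit and year pair (for example, individual-level survey rows nested in states, which the `perwt` and `stateicp` variables in the question hint at, though that is an assumption), `plm` may not apply directly. A possible alternative is to keep the weighted regression and replace only the covariance matrix with a cluster-robust one. Note that `glm(..., family = "gaussian")` fits the same model as `lm()`. The sketch below uses invented data, and `toy`, `state`, `wt` and the effect size of 0.5 are hypothetical stand-ins rather than the asker's variables.

```r
library(sandwich)
library(lmtest)

set.seed(1)

# Made-up repeated cross-section: individuals nested in 30 states, 6 years.
n <- 3000
toy <- data.frame(
  state = sample(1:30, n, replace = TRUE),
  year  = sample(2010:2015, n, replace = TRUE),
  wt    = runif(n, 0.5, 2)                   # stand-in for survey weights
)
toy$treated <- as.integer(toy$state <= 15)   # treatment assigned by state
toy$post    <- as.integer(toy$year >= 2013)

# Outcome = true effect of 0.5 + state-level shock + noise
state_shock <- rnorm(30)[toy$state]
toy$y <- 0.5 * toy$treated * toy$post + state_shock + rnorm(n)

# State and year dummies absorb the treated and post main effects
fit <- lm(y ~ treated:post + factor(state) + factor(year),
          data = toy, weights = wt)

# Covariance matrix that allows arbitrary correlation within a state
vc_state <- vcovCL(fit, cluster = ~ state)
res <- coeftest(fit, vcov. = vc_state)
res["treated:post", ]
```

Only the standard errors change here; the coefficient estimates are the same as from the original weighted fit.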

huangapple
  • Posted on January 6, 2023, 12:59:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75027086.html