Diff in Diff with panel dataset on R

Question

I have a panel dataset that I'd like to conduct diff in diff on. Right now this is my regression:

    fit3 <- glm(df$empstat ~ factor(year) + factor(stateicp) + migrant_category + treated*post + treated*migrant_category
                + post*migrant_category + treated*post*migrant_category + race + educ + age +
                marst, data = df, weights = perwt, family = 'gaussian'
    )

But will this make R assume that the observations are independent of each other? If so, what should I do to make R recognize that this is panel data?

Answer 1

Score: 0

If you are interested in fixed effects models and difference-in-differences, use the `plm` package. Here is an example from Christopher Zorn:

    library(readr)      # read_csv()
    library(dplyr)      # %>%, group_by(), filter()
    library(plm)        # pdata.frame(), plm()
    library(stargazer)  # stargazer()

    # Panel data
    WDI <- read_csv("https://github.com/PrisonRodeo/GSERM-Ljubljana-APD-git/raw/main/Data/WDI3.csv")

    # Add a "Cold War" variable:
    WDI$ColdWar <- with(WDI, ifelse(Year < 1990, 1, 0))

    # Keep a numeric year variable (for -panelAR-):
    WDI$YearNumeric <- WDI$Year

    # Make the data a panel dataframe:
    WDI <- pdata.frame(WDI, index = c("ISO3", "Year"))

    # Pull out *only* those countries that, at some
    # point during the observed periods, instituted
    # a paid parental leave policy:
    WDI <- WDI %>% group_by(ISO3) %>%
      filter(any(PaidParentalLeave == 1))

    # Create a better trend variable:
    WDI$Time <- WDI$YearNumeric - 1950

    # FE models...
    # One-way (unit) fixed effects, without and with controls:
    fe1 <- plm(ChildMortality ~ PaidParentalLeave + Time +
                 PaidParentalLeave * Time, data = WDI,
               effect = "individual", model = "within")
    fe2 <- plm(ChildMortality ~ PaidParentalLeave + Time +
                 PaidParentalLeave * Time + log(GDPPerCapita) +
                 log(NetAidReceived) + GovtExpenditures,
               data = WDI, effect = "individual", model = "within")
    # Two-way (unit and time) fixed effects, without and with controls:
    fe3 <- plm(ChildMortality ~ PaidParentalLeave + Time +
                 PaidParentalLeave * Time, data = WDI,
               effect = "twoway", model = "within")
    fe4 <- plm(ChildMortality ~ PaidParentalLeave + Time +
                 PaidParentalLeave * Time + log(GDPPerCapita) +
                 log(NetAidReceived) + GovtExpenditures,
               data = WDI, effect = "twoway", model = "within")

    # TABLE TIME
    stargazer(fe1, fe2, fe3, fe4,
              title = "DiD Models of log(Child Mortality)",
              column.separate = c(1, 1, 1), align = TRUE,
              dep.var.labels.include = FALSE,
              dep.var.caption = "",
              covariate.labels = c("Paid Parental Leave", "Time (1950=0)",
                                   "Paid Parental Leave x Time",
                                   "ln(GDP Per Capita)",
                                   "ln(Net Aid Received)",
                                   "Government Expenditures"),
              header = FALSE, model.names = FALSE,
              model.numbers = FALSE, multicolumn = FALSE,
              object.names = TRUE, notes.label = "",
              column.sep.width = "-15pt",
              omit.stat = c("f", "ser"), type = "text")

Output:

    DiD Models of log(Child Mortality)
    =====================================================================
    fe1 fe2 fe3 fe4
    ---------------------------------------------------------------------
    Paid Parental Leave -15.500*** -26.200*** -12.500*** -17.300*
    (2.420) (7.220) (2.960) (9.360)
    Time (1950=0) -0.838*** -1.480***
    (0.025) (0.094)
    Paid Parental Leave x Time -7.110*** -4.910*
    (2.290) (2.600)
    ln(GDP Per Capita) -1.780*** -3.020***
    (0.471) (0.552)
    ln(Net Aid Received) 0.873*** 0.842***
    (0.139) (0.146)
    Government Expenditures 0.310*** 0.524*** 0.247*** 0.319*
    (0.044) (0.128) (0.056) (0.169)
    ---------------------------------------------------------------------
    Observations 2,360 622 2,360 622
    R2 0.496 0.717 0.009 0.143
    Adjusted R2 0.485 0.701 -0.035 0.014
    =====================================================================
    *p<0.1; **p<0.05; ***p<0.01
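
On the independence question itself: the default standard errors from `lm()` or `glm()` do treat the observations as independent, and adding `factor()` dummies changes the point estimates, not that assumption. Below is a minimal base R sketch of two ideas behind the answer above. Everything in it is an assumption made for illustration: the data are simulated, the names `panel`, `state`, `did`, and `demean2` are hypothetical, the panel is balanced, and survey weights such as `perwt` are ignored. It shows (1) that a two-way "within" estimate in a balanced panel matches the dummy-variable regression, and (2) one common way to compute standard errors clustered by unit.

```r
# Simulated balanced panel: 40 states observed for 6 years (all values made up).
set.seed(1)
n_units <- 40
n_years <- 6
panel <- expand.grid(year = seq_len(n_years), state = seq_len(n_units))
panel$treated <- as.numeric(panel$state <= n_units / 2)  # first half is treated
panel$post    <- as.numeric(panel$year >= 4)             # policy starts in year 4
panel$did     <- panel$treated * panel$post              # DiD interaction

state_shock <- rnorm(n_units)  # time-invariant state effect
panel$y <- 1 + 2 * panel$did + state_shock[panel$state] +
  0.3 * panel$year + rnorm(nrow(panel))

# (1a) Two-way fixed effects via dummies, like the factor() terms in the question.
# `treated` and `post` are absorbed by the state and year dummies, so only `did` enters.
fit <- lm(y ~ did + factor(state) + factor(year), data = panel)

# (1b) The same estimate via the two-way within transformation (balanced panel only):
# subtract unit means and time means, then add back the grand mean.
demean2 <- function(x, unit, time) {
  x - ave(x, unit) - ave(x, time) + mean(x)
}
y_w   <- demean2(panel$y,   panel$state, panel$year)
did_w <- demean2(panel$did, panel$state, panel$year)
fit_within <- lm(y_w ~ did_w - 1)

all.equal(unname(coef(fit)["did"]), unname(coef(fit_within)["did_w"]))

# (2) Cluster-robust (by state) sandwich variance for the dummy-variable fit.
X <- model.matrix(fit)
u <- residuals(fit)
bread  <- solve(crossprod(X))
scores <- rowsum(X * u, group = panel$state)  # sum score contributions within each state
meat   <- crossprod(scores)

# One common small-sample correction; other conventions exist.
G <- n_units
N <- nrow(X)
K <- ncol(X)
adj <- (G / (G - 1)) * ((N - 1) / (N - K))
vcov_cl <- adj * bread %*% meat %*% bread

se_iid <- sqrt(vcov(fit)["did", "did"])  # assumes independent observations
se_cl  <- sqrt(vcov_cl["did", "did"])    # allows correlation within a state
c(iid = se_iid, clustered = se_cl)
```

The hand-rolled sandwich is only meant to show what clustering does. In practice, packaged implementations such as `vcovHC()` for `plm` objects or `vcovCL()` from the `sandwich` package are probably the better choice, and the clustering level (state, individual, or something else) depends on how the data were sampled.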

huangapple
  • Posted on January 6, 2023, 12:59:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75027086.html