Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

huangapple go评论67阅读模式
英文:

Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

问题

我正在R中复现一些刊物上发布的生存分析结果。原始结果是在Stata中生成的。以下是原始结果:Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?。以下是我在R中用于复现这些结果的代码。

# 载入所需包
library(dplyr)
library(foreign)
library(stargazer)

# 载入Svolik的原始数据
data = read_stata("leaders, institutions, covariates, updated tvc.dta")

# 为每行设置t0
data = mutate(data, t0 = lag(t, default = 0), .by = leadid)

# 原始政变生存对象
survobj_coup = Surv(data[["t0"]], data[["_t"]], data$c_coup)

# 原始政变模型
coups_original <- coxph(survobj_coup ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + communist + mil + cw + age, 
                        data = data, ties = "breslow")

# 原始叛乱生存对象
survobj_revolt = Surv(data[["t0"]], data[["_t"]], data$c_revolt)

# 原始叛乱模型
revolt_original <- coxph(survobj_revolt ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + mil + cw + age, 
                          data = data, ties = "breslow")

# 原始自然生存对象
survobj_natural = Surv(data[["t0"]], data[["_t"]], data$c_natural)

# 原始自然模型
natural_original <- coxph(survobj_natural ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + communist + mil + cw + age, 
                          data = data, ties = "breslow")

# 定义指数化系数的函数
exp_coef <- function(x) { round(exp(x), 3) }

# 使用stargazer创建表格
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef)

这是输出结果。请查看模型1、3和5(忽略其他模型)。注意,虽然系数是相同的,但星号都不正常。有人知道发生了什么吗?有趣的是,当我在不对系数进行指数化的情况下呈现结果(即,当我不以风险比例的形式呈现结果时),星号是正确的。但我需要以风险比例的形式呈现结果。如果有任何建议,将不胜感激!

Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

英文:

I am reproducing in R some survival analysis results published in a journal. The original results were produced in Stata. Here are the original results: Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?. Here is the code I use to reproduce these results in R.

# load packages
library(dplyr)
library(foreign)
library(stargazer)

# load Svolik&#39;s original data 
data = read_stata(&quot;leaders, institutions, covariates, updated tvc.dta&quot;)

# set a t0 for each row
data = mutate(data,t0 = lag(t,default=0), .by=leadid)

# coup survival object original
survobj_coup = Surv(data[[&quot;t0&quot;]], data[[&quot;_t&quot;]], data$c_coup)

# coup model original
coups_original &lt;- coxph(survobj_coup~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
      data=data, ties=&quot;breslow&quot;)

# revolt survival object original 
survobj_revolt = Surv(data[[&quot;t0&quot;]], data[[&quot;_t&quot;]], data$c_revolt)

# revolt model original 
revolt_original &lt;- coxph(survobj_revolt~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ mil+ cw+ age, 
                        data=data, ties=&quot;breslow&quot;)

# natural survival object original
survobj_natural = Surv(data[[&quot;t0&quot;]], data[[&quot;_t&quot;]], data$c_natural)

# natural model original
natural_original &lt;- coxph(survobj_natural~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
                        data=data, ties=&quot;breslow&quot;)

# Define a function to exponentiate coefficients
exp_coef &lt;- function(x) { round(exp(x), 3) }

# Create the table using stargazer
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef)

Here is the output. Look at Models 1, 3, and 5 (disregard the others). Notice that while the coefficients are the same, the stars are all wonky. Does anyone know what's going on? Interestingly, when I present the results without exponentiating the coefficients (i.e., when I don't present the results in terms of hazard ratios), the stars are correct. But I need to present the results in terms of hazard ratios. Any advice would be appreciated!

Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

答案1

得分: 1

stargazer()函数有一个名为p.auto的参数,默认值为TRUE。文档中说明:

一个逻辑值,指示如果用户提供了系数或标准误差(从参数coefse中)或者通过函数修改了它们(从参数apply.coefapply.se中),stargazer是否应该计算p值,使用标准正态分布。换句话说,如果你转换了系数,默认行为是基于标准正态分布重新计算转换后的系数和未转换的标准误差的p值。换句话说:毫无意义。

要保留coxph()计算的p值,您需要指定p.auto = FALSE。以下是一个可重现的示例:

library(survival)
library(stargazer) |&gt; suppressPackageStartupMessages()

fit &lt;- coxph(Surv(futime, fustat) ~ age + factor(ecog.ps), data = ovarian)
fit
#&gt; Call:
#&gt; coxph(formula = Surv(futime, fustat) ~ age + factor(ecog.ps), 
#&gt;     data = ovarian)
#&gt; 
#&gt;                     coef exp(coef) se(coef)     z       p
#&gt; age              0.16150   1.17527  0.04992 3.235 0.00122
#&gt; factor(ecog.ps)2 0.01866   1.01884  0.59908 0.031 0.97515
#&gt; 
#&gt; Likelihood ratio test=14.29  on 2 df, p=0.000787
#&gt; n= 26, number of events= 12

stargazer报告的p值指标不匹配:

stargazer(fit, apply.coef = exp, type = &quot;text&quot;)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2               1.019*           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

...除非您明确告诉它不要重新计算它们:

stargazer(fit, apply.coef = exp, type = &quot;text&quot;, p.auto = FALSE)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2                1.019           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01
英文:

The stargazer() function has an argument called p.auto which defaults to
TRUE. The documentation states:

> a logical value that indicates whether stargazer should calculate the p-values,
> using the standard normal distribution, if coefficients or standard errors are
> supplied by the user (from arguments coef and se) or modified by a function
> (from arguments apply.coef or apply.se).

That is, if you transform the coefficients the default behaviour is to recalculate
p-values from the transformed coefficients and untransformed standard errors
based on the standard normal distribution. In other words: nonsense.

You’ll need to specify p.auto = FALSE to keep using the p-values calculated by coxph().
Here’s a reproducible example:

library(survival)
library(stargazer) |&gt; suppressPackageStartupMessages()

fit &lt;- coxph(Surv(futime, fustat) ~ age + factor(ecog.ps), data = ovarian)
fit
#&gt; Call:
#&gt; coxph(formula = Surv(futime, fustat) ~ age + factor(ecog.ps), 
#&gt;     data = ovarian)
#&gt; 
#&gt;                     coef exp(coef) se(coef)     z       p
#&gt; age              0.16150   1.17527  0.04992 3.235 0.00122
#&gt; factor(ecog.ps)2 0.01866   1.01884  0.59908 0.031 0.97515
#&gt; 
#&gt; Likelihood ratio test=14.29  on 2 df, p=0.000787
#&gt; n= 26, number of events= 12

stargazer reported p-value indicators don’t match:

stargazer(fit, apply.coef = exp, type = &quot;text&quot;)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2               1.019*           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

… unless you specifically tell it not to recalculate them:

stargazer(fit, apply.coef = exp, type = &quot;text&quot;, p.auto = FALSE)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2                1.019           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

huangapple
  • 本文由 发表于 2023年6月9日 04:31:01
  • 转载请务必保留本文链接:https://go.coder-hub.com/76435497.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定