英文:
Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?
问题
我正在R
中复现一些刊物上发布的生存分析结果。原始结果是在Stata中生成的。以下是原始结果:。以下是我在R
中用于复现这些结果的代码。
# 载入所需包
library(dplyr)
library(foreign)
library(stargazer)
# 载入Svolik的原始数据
data = read_stata("leaders, institutions, covariates, updated tvc.dta")
# 为每行设置t0
data = mutate(data, t0 = lag(t, default = 0), .by = leadid)
# 原始政变生存对象
survobj_coup = Surv(data[["t0"]], data[["_t"]], data$c_coup)
# 原始政变模型
coups_original <- coxph(survobj_coup ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + communist + mil + cw + age,
data = data, ties = "breslow")
# 原始叛乱生存对象
survobj_revolt = Surv(data[["t0"]], data[["_t"]], data$c_revolt)
# 原始叛乱模型
revolt_original <- coxph(survobj_revolt ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + mil + cw + age,
data = data, ties = "breslow")
# 原始自然生存对象
survobj_natural = Surv(data[["t0"]], data[["_t"]], data$c_natural)
# 原始自然模型
natural_original <- coxph(survobj_natural ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + communist + mil + cw + age,
data = data, ties = "breslow")
# 定义指数化系数的函数
exp_coef <- function(x) { round(exp(x), 3) }
# 使用stargazer创建表格
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef)
这是输出结果。请查看模型1、3和5(忽略其他模型)。注意,虽然系数是相同的,但星号都不正常。有人知道发生了什么吗?有趣的是,当我在不对系数进行指数化的情况下呈现结果(即,当我不以风险比例的形式呈现结果时),星号是正确的。但我需要以风险比例的形式呈现结果。如果有任何建议,将不胜感激!
英文:
I am reproducing in R
some survival analysis results published in a journal. The original results were produced in Stata. Here are the original results: . Here is the code I use to reproduce these results in R
.
# load packages
library(dplyr)
library(foreign)
library(stargazer)
# load Svolik's original data
data = read_stata("leaders, institutions, covariates, updated tvc.dta")
# set a t0 for each row
data = mutate(data,t0 = lag(t,default=0), .by=leadid)
# coup survival object original
survobj_coup = Surv(data[["t0"]], data[["_t"]], data$c_coup)
# coup model original
coups_original <- coxph(survobj_coup~legislature + lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age,
data=data, ties="breslow")
# revolt survival object original
survobj_revolt = Surv(data[["t0"]], data[["_t"]], data$c_revolt)
# revolt model original
revolt_original <- coxph(survobj_revolt~legislature + lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ mil+ cw+ age,
data=data, ties="breslow")
# natural survival object original
survobj_natural = Surv(data[["t0"]], data[["_t"]], data$c_natural)
# natural model original
natural_original <- coxph(survobj_natural~legislature + lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age,
data=data, ties="breslow")
# Define a function to exponentiate coefficients
exp_coef <- function(x) { round(exp(x), 3) }
# Create the table using stargazer
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef)
Here is the output. Look at Models 1, 3, and 5 (disregard the others). Notice that while the coefficients are the same, the stars are all wonky. Does anyone know what's going on? Interestingly, when I present the results without exponentiating the coefficients (i.e., when I don't present the results in terms of hazard ratios), the stars are correct. But I need to present the results in terms of hazard ratios. Any advice would be appreciated!
答案1
得分: 1
stargazer()
函数有一个名为p.auto
的参数,默认值为TRUE
。文档中说明:
一个逻辑值,指示如果用户提供了系数或标准误差(从参数
coef
和se
中)或者通过函数修改了它们(从参数apply.coef
或apply.se
中),stargazer是否应该计算p值,使用标准正态分布。换句话说,如果你转换了系数,默认行为是基于标准正态分布重新计算转换后的系数和未转换的标准误差的p值。换句话说:毫无意义。
要保留coxph()
计算的p值,您需要指定p.auto = FALSE
。以下是一个可重现的示例:
library(survival)
library(stargazer) |> suppressPackageStartupMessages()
fit <- coxph(Surv(futime, fustat) ~ age + factor(ecog.ps), data = ovarian)
fit
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ age + factor(ecog.ps),
#> data = ovarian)
#>
#> coef exp(coef) se(coef) z p
#> age 0.16150 1.17527 0.04992 3.235 0.00122
#> factor(ecog.ps)2 0.01866 1.01884 0.59908 0.031 0.97515
#>
#> Likelihood ratio test=14.29 on 2 df, p=0.000787
#> n= 26, number of events= 12
stargazer报告的p值指标不匹配:
stargazer(fit, apply.coef = exp, type = "text")
#>
#> ================================================
#> Dependent variable:
#> ---------------------------
#> futime
#> ------------------------------------------------
#> age 1.175***
#> (0.050)
#>
#> factor(ecog.ps)2 1.019*
#> (0.599)
#>
#> ------------------------------------------------
#> Observations 26
#> R2 0.423
#> Max. Possible R2 0.932
#> Log Likelihood -27.838
#> Wald Test 10.540*** (df = 2)
#> LR Test 14.295*** (df = 2)
#> Score (Logrank) Test 12.261*** (df = 2)
#> ================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
...除非您明确告诉它不要重新计算它们:
stargazer(fit, apply.coef = exp, type = "text", p.auto = FALSE)
#>
#> ================================================
#> Dependent variable:
#> ---------------------------
#> futime
#> ------------------------------------------------
#> age 1.175***
#> (0.050)
#>
#> factor(ecog.ps)2 1.019
#> (0.599)
#>
#> ------------------------------------------------
#> Observations 26
#> R2 0.423
#> Max. Possible R2 0.932
#> Log Likelihood -27.838
#> Wald Test 10.540*** (df = 2)
#> LR Test 14.295*** (df = 2)
#> Score (Logrank) Test 12.261*** (df = 2)
#> ================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
英文:
The stargazer()
function has an argument called p.auto
which defaults to
TRUE
. The documentation states:
> a logical value that indicates whether stargazer should calculate the p-values,
> using the standard normal distribution, if coefficients or standard errors are
> supplied by the user (from arguments coef and se) or modified by a function
> (from arguments apply.coef or apply.se).
That is, if you transform the coefficients the default behaviour is to recalculate
p-values from the transformed coefficients and untransformed standard errors
based on the standard normal distribution. In other words: nonsense.
You’ll need to specify p.auto = FALSE
to keep using the p-values calculated by coxph()
.
Here’s a reproducible example:
library(survival)
library(stargazer) |> suppressPackageStartupMessages()
fit <- coxph(Surv(futime, fustat) ~ age + factor(ecog.ps), data = ovarian)
fit
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ age + factor(ecog.ps),
#> data = ovarian)
#>
#> coef exp(coef) se(coef) z p
#> age 0.16150 1.17527 0.04992 3.235 0.00122
#> factor(ecog.ps)2 0.01866 1.01884 0.59908 0.031 0.97515
#>
#> Likelihood ratio test=14.29 on 2 df, p=0.000787
#> n= 26, number of events= 12
stargazer reported p-value indicators don’t match:
stargazer(fit, apply.coef = exp, type = "text")
#>
#> ================================================
#> Dependent variable:
#> ---------------------------
#> futime
#> ------------------------------------------------
#> age 1.175***
#> (0.050)
#>
#> factor(ecog.ps)2 1.019*
#> (0.599)
#>
#> ------------------------------------------------
#> Observations 26
#> R2 0.423
#> Max. Possible R2 0.932
#> Log Likelihood -27.838
#> Wald Test 10.540*** (df = 2)
#> LR Test 14.295*** (df = 2)
#> Score (Logrank) Test 12.261*** (df = 2)
#> ================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
… unless you specifically tell it not to recalculate them:
stargazer(fit, apply.coef = exp, type = "text", p.auto = FALSE)
#>
#> ================================================
#> Dependent variable:
#> ---------------------------
#> futime
#> ------------------------------------------------
#> age 1.175***
#> (0.050)
#>
#> factor(ecog.ps)2 1.019
#> (0.599)
#>
#> ------------------------------------------------
#> Observations 26
#> R2 0.423
#> Max. Possible R2 0.932
#> Log Likelihood -27.838
#> Wald Test 10.540*** (df = 2)
#> LR Test 14.295*** (df = 2)
#> Score (Logrank) Test 12.261*** (df = 2)
#> ================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论