2023年6月9日 04:31:01go评论101阅读模式

英文:

Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

问题

我正在R中复现一些刊物上发布的生存分析结果。原始结果是在Stata中生成的。以下是原始结果：。以下是我在R中用于复现这些结果的代码。

# 载入所需包
library(dplyr)
library(foreign)
library(stargazer)
# 载入Svolik的原始数据
data = read_stata("leaders, institutions, covariates, updated tvc.dta")
# 为每行设置t0
data = mutate(data, t0 = lag(t, default = 0), .by = leadid)
# 原始政变生存对象
survobj_coup = Surv(data[["t0"]], data[["_t"]], data$c_coup)
# 原始政变模型
coups_original <- coxph(survobj_coup ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + communist + mil + cw + age, 
                        data = data, ties = "breslow")
# 原始叛乱生存对象
survobj_revolt = Surv(data[["t0"]], data[["_t"]], data$c_revolt)
# 原始叛乱模型
revolt_original <- coxph(survobj_revolt ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + mil + cw + age, 
                          data = data, ties = "breslow")
# 原始自然生存对象
survobj_natural = Surv(data[["t0"]], data[["_t"]], data$c_natural)
# 原始自然模型
natural_original <- coxph(survobj_natural ~ legislature + lgdp_1 + growth_1 + exportersoffuelsmainlyoil_EL2008 + ethfrac_FIXED + communist + mil + cw + age, 
                          data = data, ties = "breslow")
# 定义指数化系数的函数
exp_coef <- function(x) { round(exp(x), 3) }
# 使用stargazer创建表格
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef)

这是输出结果。请查看模型1、3和5（忽略其他模型）。注意，虽然系数是相同的，但星号都不正常。有人知道发生了什么吗？有趣的是，当我在不对系数进行指数化的情况下呈现结果（即，当我不以风险比例的形式呈现结果时），星号是正确的。但我需要以风险比例的形式呈现结果。如果有任何建议，将不胜感激！

英文:

I am reproducing in R some survival analysis results published in a journal. The original results were produced in Stata. Here are the original results: . Here is the code I use to reproduce these results in R.

# load packages
library(dplyr)
library(foreign)
library(stargazer)
# load Svolik&#39;s original data 
data = read_stata(&quot;leaders, institutions, covariates, updated tvc.dta&quot;)
# set a t0 for each row
data = mutate(data,t0 = lag(t,default=0), .by=leadid)
# coup survival object original
survobj_coup = Surv(data[[&quot;t0&quot;]], data[[&quot;_t&quot;]], data$c_coup)
# coup model original
coups_original &lt;- coxph(survobj_coup~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
      data=data, ties=&quot;breslow&quot;)
# revolt survival object original 
survobj_revolt = Surv(data[[&quot;t0&quot;]], data[[&quot;_t&quot;]], data$c_revolt)
# revolt model original 
revolt_original &lt;- coxph(survobj_revolt~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ mil+ cw+ age, 
                        data=data, ties=&quot;breslow&quot;)
# natural survival object original
survobj_natural = Surv(data[[&quot;t0&quot;]], data[[&quot;_t&quot;]], data$c_natural)
# natural model original
natural_original &lt;- coxph(survobj_natural~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
                        data=data, ties=&quot;breslow&quot;)
# Define a function to exponentiate coefficients
exp_coef &lt;- function(x) { round(exp(x), 3) }
# Create the table using stargazer
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef)

Here is the output. Look at Models 1, 3, and 5 (disregard the others). Notice that while the coefficients are the same, the stars are all wonky. Does anyone know what's going on? Interestingly, when I present the results without exponentiating the coefficients (i.e., when I don't present the results in terms of hazard ratios), the stars are correct. But I need to present the results in terms of hazard ratios. Any advice would be appreciated!

答案1

得分: 1

stargazer()函数有一个名为p.auto的参数，默认值为TRUE。文档中说明：

一个逻辑值，指示如果用户提供了系数或标准误差（从参数coef和se中）或者通过函数修改了它们（从参数apply.coef或apply.se中），stargazer是否应该计算p值，使用标准正态分布。换句话说，如果你转换了系数，默认行为是基于标准正态分布重新计算转换后的系数和未转换的标准误差的p值。换句话说：毫无意义。

要保留coxph()计算的p值，您需要指定p.auto = FALSE。以下是一个可重现的示例：

library(survival)
library(stargazer) |&gt; suppressPackageStartupMessages()
fit &lt;- coxph(Surv(futime, fustat) ~ age + factor(ecog.ps), data = ovarian)
fit
#&gt; Call:
#&gt; coxph(formula = Surv(futime, fustat) ~ age + factor(ecog.ps), 
#&gt;     data = ovarian)
#&gt; 
#&gt;                     coef exp(coef) se(coef)     z       p
#&gt; age              0.16150   1.17527  0.04992 3.235 0.00122
#&gt; factor(ecog.ps)2 0.01866   1.01884  0.59908 0.031 0.97515
#&gt; 
#&gt; Likelihood ratio test=14.29  on 2 df, p=0.000787
#&gt; n= 26, number of events= 12

stargazer报告的p值指标不匹配：

stargazer(fit, apply.coef = exp, type = &quot;text&quot;)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2               1.019*           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

...除非您明确告诉它不要重新计算它们：

stargazer(fit, apply.coef = exp, type = &quot;text&quot;, p.auto = FALSE)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2                1.019           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

英文:

The stargazer() function has an argument called p.auto which defaults to
TRUE. The documentation states:

> a logical value that indicates whether stargazer should calculate the p-values,
> using the standard normal distribution, if coefficients or standard errors are
> supplied by the user (from arguments coef and se) or modified by a function
> (from arguments apply.coef or apply.se).

That is, if you transform the coefficients the default behaviour is to recalculate
p-values from the transformed coefficients and untransformed standard errors
based on the standard normal distribution. In other words: nonsense.

You’ll need to specify p.auto = FALSE to keep using the p-values calculated by coxph().
Here’s a reproducible example:

library(survival)
library(stargazer) |&gt; suppressPackageStartupMessages()
fit &lt;- coxph(Surv(futime, fustat) ~ age + factor(ecog.ps), data = ovarian)
fit
#&gt; Call:
#&gt; coxph(formula = Surv(futime, fustat) ~ age + factor(ecog.ps), 
#&gt;     data = ovarian)
#&gt; 
#&gt;                     coef exp(coef) se(coef)     z       p
#&gt; age              0.16150   1.17527  0.04992 3.235 0.00122
#&gt; factor(ecog.ps)2 0.01866   1.01884  0.59908 0.031 0.97515
#&gt; 
#&gt; Likelihood ratio test=14.29  on 2 df, p=0.000787
#&gt; n= 26, number of events= 12

stargazer reported p-value indicators don’t match:

stargazer(fit, apply.coef = exp, type = &quot;text&quot;)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2               1.019*           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

… unless you specifically tell it not to recalculate them:

stargazer(fit, apply.coef = exp, type = &quot;text&quot;, p.auto = FALSE)
#&gt; 
#&gt; ================================================
#&gt;                          Dependent variable:    
#&gt;                      ---------------------------
#&gt;                                futime           
#&gt; ------------------------------------------------
#&gt; age                           1.175***          
#&gt;                                (0.050)          
#&gt;                                                 
#&gt; factor(ecog.ps)2                1.019           
#&gt;                                (0.599)          
#&gt;                                                 
#&gt; ------------------------------------------------
#&gt; Observations                     26             
#&gt; R2                              0.423           
#&gt; Max. Possible R2                0.932           
#&gt; Log Likelihood                 -27.838          
#&gt; Wald Test                10.540*** (df = 2)     
#&gt; LR Test                  14.295*** (df = 2)     
#&gt; Score (Logrank) Test     12.261*** (df = 2)     
#&gt; ================================================
#&gt; Note:                *p&lt;0.1; **p&lt;0.05; ***p&lt;0.01

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

问题

答案1

统计重复测量中值的发生次数

在R中，如何根据条件将特定行中的值替换为另一行中的值？

在R中创建条件列时出错。

提取多列中的第一个非NA值。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。