英文:
How do I change confidence interval calculation to log-log on R?
问题
如果我想使用coxph和confinf函数来计算置信区间,如何将置信区间计算更改为对数-对数(log-log)?我的理解是默认情况下是对数。
我尝试了conf.type="log-log"
,但它不起作用,只是出现了错误消息。
library(survival)
coxph(formula = Surv(futime, fustat) ~ tx, data = tki, conf.type="log-log")
fit <- coxph(formula = Surv(futime, fustat) ~ tx, data = tki)
summary(fit)
# 输出提供了风险比(HR)的置信区间
confint(fit)
# 系数的置信区间
exp(confint(fit))
结构化数据 tki
的示例:
structure(list(futime = c(9.26, 11.06, 2.35, 3.75, 12.4, 10.3, 8.11, 7.29, 6.75, 6.56, 0.26, 1.9, 0.34, 1.63, 1.55, 1.6, 4.78, 2.65, 1.72, 3.63),
fustat = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1),
tx = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
英文:
If I wanted to calculate confidence intervals using the coxph and confinf functions, how do I change the confidence interval calculation to log-log? My understanding is that log is the default.
I tried conf.type="log-log"
but it did not work, just got an error message
library(survival)
coxph(formula = Surv(futime, fustat) ~ tx, data = tki, conf.type="log-log")
fit <- coxph(formula = Surv(futime, fustat) ~ tx, data = tki)
summary(fit)
#output provides HR CIs
confint(fit)
#coefficient CIs
exp(confint(fit))
> dput(tki) structure(list(futime = c(9.26, 11.06, 2.35, 3.75, 12.4, 10.3, 8.11, 7.29, 6.75, 6.56, 0.26, 1.9, 0.34, 1.63, 1.55, 1.6, 4.78, 2.65, 1.72, 3.63), fustat = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1), tx = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
答案1
得分: 1
以下是您要翻译的内容:
"The log(-log()) method of producing confidence intervals applies to the estimates of survival probabilities (or related transformations). We need to use the survfit()
function to create those, confint()
will not work. The default method is log
as you note. Supplying conf.type = "log-log"
changes the method as desired.
Note however, that the use of survfit()
with a coxph()
model almost always requires you to correctly specify the newdata =
option. See the details in ?survfit.coxph
. Ignoring this will usually still give you some result, but it will be wrong in subtle ways.
library(survival)
tki = structure(list(
futime = c(9.26, 11.06, 2.35, 3.75, 12.4, 10.3, 8.11, 7.29, 6.75,
6.56, 0.26, 1.9, 0.34, 1.63, 1.55, 1.6, 4.78, 2.65, 1.72, 3.63),
fustat = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1),
tx = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
coxmod <- coxph(Surv(futime, fustat) ~ tx, data = tki)
summary(coxmod)
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ tx, data = tki)
#>
#> n= 20, number of events= 11
#>
#> coef exp(coef) se(coef) z Pr(>|z|)
#> tx 2.2783 9.7599 0.8335 2.733 0.00627 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> exp(coef) exp(-coef) lower .95 upper .95
#> tx 9.76 0.1025 1.905 50
#>
#> Concordance= 0.744 (se = 0.042 )
#> Likelihood ratio test= 9.41 on 1 df, p=0.002
#> Wald test = 7.47 on 1 df, p=0.006
#> Score (logrank) test = 10.34 on 1 df, p=0.001
confint(coxmod) #confint provides confidence interval for coefficient estimate
#> 2.5 % 97.5 %
#> tx 0.644588 3.911983
exp(confint(coxmod)) #confidence interval for hazard ratio
#> 2.5 % 97.5 %
#> tx 1.905202 49.99797
# We need to define 'newdata' to use in survfit, include all covariates.
# If you don't use newdata, the results are almost certainly not what you want.
newdat = data.frame(tx = c(1, 2))
# Fit model for survival function using survfit. Set confidence interval type
survmod <- survfit(coxmod, newdata = newdat, conf.type = "log-log")
# Upper and lower confidence intervals for survival probability, 1 column
# for each row of newdata.
survmod$upper
#> 1 2
#> [1,] 0.9992040 0.9873779
#> [2,] 0.9974256 0.9539678
#> [3,] 0.9974256 0.9539678
#> ...
survmod$lower
#> 1 2
#> [1,] 0.8972081 5.232360e-01
#> [2,] 0.8626622 4.631231e-01
#> [3,] 0.8626622 4.631231e-01
#> ...
plot(survmod, conf.int = T, col = c("red","blue"))
创建于2023-02-21,使用reprex v2.0.2
回复评论:
有没有简单生成使用对数-对数法计算的危险比和95%置信区间的方法?
我认为您对如何计算置信区间以及不同方法的估计存在一些误解。
默认情况下,coxph()
使用信息矩阵的倒数来估计参数的方差。另一种方法是在 coxph()
调用中输入 robust = TRUE
以使用“稳健”的方差估计方法(稳健估计使用了一个无限小的jackknife方法,详见包文档vignette的2.7节,https://cran.r-project.org/web/packages/survival/vignettes/survival.pdf)。然后,confint()
函数使用估计的方差构建参数估计的“wald-type”置信区间。最后,将参数估计的置信区间的exp()
值给出了危险比的置信区间。(问题中包含的代码标签中似乎有错误。但是,您可以看到结果与 summary(fit)
匹配。)
在上述计算的任何阶段都没有使用或提供了对数或对数-对数法的置信区间。如果尝试在 coxph()
调用中包含 conf.type =
选项,则会导致错误并
英文:
The log(-log()) method of producing confidence intervals applies to the estimates of survival probabilities (or related transformations). We need to use the survfit()
function to create those, confint()
will not work. The default method is log
as you note. Supplying conf.type = "log-log"
changes the method as desired.
Note however, that the use of survfit()
with a coxph()
model almost always requires you to correctly specify the newdata =
option. See the details in ?survfit.coxph
. Ignoring this will usually still give you some result, but it will be wrong in subtle ways.
library(survival)
tki = structure(list(
futime = c(9.26, 11.06, 2.35, 3.75, 12.4, 10.3, 8.11, 7.29, 6.75,
6.56, 0.26, 1.9, 0.34, 1.63, 1.55, 1.6, 4.78, 2.65, 1.72, 3.63),
fustat = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1),
tx = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
coxmod <- coxph(Surv(futime, fustat) ~ tx, data = tki)
summary(coxmod)
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ tx, data = tki)
#>
#> n= 20, number of events= 11
#>
#> coef exp(coef) se(coef) z Pr(>|z|)
#> tx 2.2783 9.7599 0.8335 2.733 0.00627 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> exp(coef) exp(-coef) lower .95 upper .95
#> tx 9.76 0.1025 1.905 50
#>
#> Concordance= 0.744 (se = 0.042 )
#> Likelihood ratio test= 9.41 on 1 df, p=0.002
#> Wald test = 7.47 on 1 df, p=0.006
#> Score (logrank) test = 10.34 on 1 df, p=0.001
confint(coxmod) #confint provides confidence interval for coefficient estimate
#> 2.5 % 97.5 %
#> tx 0.644588 3.911983
exp(confint(coxmod)) #confidence interval for hazard ratio
#> 2.5 % 97.5 %
#> tx 1.905202 49.99797
# We need to define 'newdata' to use in survfit, include all covariates.
# If you don't use newdata, the results are almost certainly not what you want.
newdat = data.frame(tx = c(1, 2))
# Fit model for survival function using survfit. Set confidence interval type
survmod <- survfit(coxmod, newdata = newdat, conf.type = "log-log")
# Upper and lower confidence intervals for survival probability, 1 column
# for each row of newdata.
survmod$upper
#> 1 2
#> [1,] 0.9992040 0.9873779
#> [2,] 0.9974256 0.9539678
#> [3,] 0.9974256 0.9539678
#> ...
survmod$lower
#> 1 2
#> [1,] 0.8972081 5.232360e-01
#> [2,] 0.8626622 4.631231e-01
#> [3,] 0.8626622 4.631231e-01
#> ...
plot(survmod, conf.int = T, col = c("red","blue"))
<sup>Created on 2023-02-21 with reprex v2.0.2</sup>
Reply to comment:
> is there a way to simply generate a hazard ratio and a 95% confidence interval calculated with the log-log method?
I think you have misunderstood something about how confidence intervals are calculated and what the different methods are estimating.
By default, coxph()
estimates the variance of the parameters using the inverse of the information matrix. An alternative is to use a "robust" variance estimate by entering robust = TRUE
in the coxph()
call. (The robust estimate uses an infintisimal jackknife, see 2.7 of the package documentation vignette, https://cran.r-project.org/web/packages/survival/vignettes/survival.pdf .) The confint()
function then uses the estimated variance to construct a "wald-type" confidence interval for the parameter estimates. Finally, taking the exp()
of the confidence interval for the parameter estimate gives a confidence interval for the hazard ratio. (The code included in the question seems to have a mistake in labeling the confidence intervals. But, you can see the result matches summary(fit)
.)
At no point in the above calculations was a log or a log-log confidence interval used or available. If you try to include a conf.type =
option in the coxph()
call then it gives an error and refuses to run.
The log method of calculating confidence intervals is the default for the survfit()
function. By specifying conf.type =
option in the survfit()
function you can change the method used from "log" to any of the supported options: "log", "log-log", "plain", "none", "logit", or "arcsin". survfit()
uses these methods to calculate a confidence interval for the survival probability. survfit()
does not estimate the hazard ratio, it bases the estimate of the survival probability off of the hazard ratio that was estimated by coxph()
.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论