Using package::function inside formula in R gives different results (specifically survival::strata inside coxph)

huangapple go评论146阅读模式
英文:

Using package::function inside formula in R gives different results (specifically survival::strata inside coxph)

问题

I believed the following two formula calls should be equivalent.

install.packages("survival")
library(survival)

Model 1

coxph(Surv(futime, fustat) ~ age + strata(rx), data=ovarian)

Call:
coxph(formula = Surv(futime, fustat) ~ age + strata(rx), data = ovarian)

       coef exp(coef) se(coef)     z       p
age 0.13735   1.14723  0.04741 2.897 0.00376

Likelihood ratio test=12.69  on 1 df, p=0.0003678
n= 26, number of events= 12 

Model 2

coxph(Surv(futime, fustat) ~ age + survival::strata(rx), data=ovarian)

Call:
coxph(formula = Surv(futime, fustat) ~ age + survival::strata(rx), 
    data = ovarian)

                             coef exp(coef) se(coef)      z       p
age                       0.14733   1.15873  0.04615  3.193 0.00141
survival::strata(rx)rx=2 -0.80397   0.44755  0.63205 -1.272 0.20337

Likelihood ratio test=15.89  on 2 df, p=0.0003551
n= 26, number of events= 12 

However, they are not. The second model, which references 'strata' specifically using the "::" operator, treats it like a fixed effect variable in the model, rather than fitting the model separately within levels of 'strata'. I do not know if this is:

A) An issue with using 'package::function' type calls within a formula in R, or
B) An issue with the survival package and namespaces/dependencies

I want to use the call to Model 2, as I am building a package and don't want to have to attach the entire survival package (which is large) when loading mine.

Any help on this would be much appreciated.

英文:

I believed the following two formula calls should be equivalent.

install.packages("survival")
library(survival)

Model 1

coxph(Surv(futime, fustat) ~ age + strata(rx), data=ovarian)

Call:
coxph(formula = Surv(futime, fustat) ~ age + strata(rx), data = ovarian)

       coef exp(coef) se(coef)     z       p
age 0.13735   1.14723  0.04741 2.897 0.00376

Likelihood ratio test=12.69  on 1 df, p=0.0003678
n= 26, number of events= 12 

Model 2

coxph(Surv(futime, fustat) ~ age + survival::strata(rx), data=ovarian)

Call:
coxph(formula = Surv(futime, fustat) ~ age + survival::strata(rx), 
    data = ovarian)

                             coef exp(coef) se(coef)      z       p
age                       0.14733   1.15873  0.04615  3.193 0.00141
survival::strata(rx)rx=2 -0.80397   0.44755  0.63205 -1.272 0.20337

Likelihood ratio test=15.89  on 2 df, p=0.0003551
n= 26, number of events= 12 

However they are not. The second model, which references 'strata' specifically using the "::" operator, treats it like a fixed effect variable in the model, rather than fitting the model separately within levels of 'strata'. I do not know if this is:

A) An issue with using 'package::function' type calls within a formula in R, or
B) An issue with the survival package and namespaces/dependencies

I want to use the call to Model 2, as I am building a package and don't want to have to attach the entire survival package (which is large) when loading mine.

Any help on this would be much appreciated.

答案1

得分: 3

In coxph(),这里有这些行:

special <- c("strata", "tt", "frailty", "ridge", "pspline")
tform$formula <- if (missing(data)) 
  terms(formula, special)
else terms(formula, special, data = data)

因此,在第一种情况下,coxph 正确检测到了 strata,但在第二种情况下没有。我们可以使用您的示例来验证这一点:

library(survival)

formula1 <- Surv(futime, fustat) ~ age + strata(rx)
formula2 <- Surv(futime, fustat) ~ age + survival::strata(rx)
special <- c("strata", "tt", "frailty", "ridge", "pspline")
data <- ovarian

t1 <- terms(formula1, special, data = data)
t2 <- terms(formula2, special, data = data)

attr(t1, "specials")$strata
#> [1] 3
attr(t2, "specials")$strata
#> NULL

虽然我不了解 coxph() 的具体细节,但我会说这种差异后来导致了结果的不同。

因此,您应该避免在公式中使用前缀 survival::。如果由于某种原因,您必须使用这个前缀,可以使用以下技巧:

library(survival)

# 在这里
strata <- survival::strata
coxph(Surv(futime, fustat) ~ age + strata(rx), data=ovarian)
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ age + strata(rx), data = ovarian)
#> 
#>        coef exp(coef) se(coef)     z       p
#> age 0.13735   1.14723  0.04741 2.897 0.00376
#> 
#> Likelihood ratio test=12.69  on 1 df, p=0.0003678
#> n= 26, number of events= 12
英文:

In coxph(), there are these lines:

special <- c("strata", "tt", "frailty", "ridge", "pspline")
tform$formula <- if (missing(data)) 
  terms(formula, special)
else terms(formula, special, data = data)

Therefore, in the first case, coxph correctly detects the strata but not in the second case. We can see this using your example:

library(survival)

formula1 <- Surv(futime, fustat) ~ age + strata(rx)
formula2 <- Surv(futime, fustat) ~ age + survival::strata(rx)
special <- c("strata", "tt", "frailty", "ridge", "pspline")
data <- ovarian

t1 <- terms(formula1, special, data = data)
t2 <- terms(formula2, special, data = data)

attr(t1, "specials")$strata
#> [1] 3
attr(t2, "specials")$strata
#> NULL

While I don't know the specifics of coxph(), I would say that this difference later leads to the differences in results.


Therefore, you should avoid using the prefix survival:: in the formula. If for some reason, you have to use this prefix, you can use the following trick:

library(survival)

# here
strata <- survival::strata
coxph(Surv(futime, fustat) ~ age + strata(rx), data=ovarian)
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ age + strata(rx), data = ovarian)
#> 
#>        coef exp(coef) se(coef)     z       p
#> age 0.13735   1.14723  0.04741 2.897 0.00376
#> 
#> Likelihood ratio test=12.69  on 1 df, p=0.0003678
#> n= 26, number of events= 12

huangapple
  • 本文由 发表于 2023年5月11日 17:01:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76225865.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定