英文:
Using margins() with geeglm() in R
问题
我已经创建了一系列使用geeglm()函数的GEE模型。我想在exposure_1为正和负时计算主要结果的边际几率比。
我想绘制总体几率和两个边际几率比的森林图,以便进行比较。然而,margins()不接受geeglm()模型。
我一直没有找到另一种在margins中使用geeglm()输出的方法。我将不胜感激任何帮助。
以下是一个创建geeglm模型并将其转化为森林图的代码的通用版本:
library(ggplot2)
library(dplyr)
# 为了可重现性,设置随机种子
set.seed(13)
# 数据框中的行数
n <- 100
# 生成暴露和结果的虚构数据
exposure_1 <- sample(0:1, n, replace = TRUE)
exposure_2 <- sample(0:1, n, replace = TRUE)
exposure_3 <- sample(0:1, n, replace = TRUE)
# 生成主要结果的虚构数据(假设是二元结果)
# 如果需要,可以基于某种逻辑或模型生成结果
primary_outcome <- rbinom(n, 1, 0.3 + 0.2 * exposure_1 + 0.1 * exposure_2 - 0.1 * exposure_3)
# 生成社区/分组变量的虚构数据
enr_community <- sample(letters[1:5], n, replace = TRUE)
# 合并生成的数据以创建数据框
df <- data.frame(
exposure_1 = exposure_1,
exposure_2 = exposure_2,
exposure_3 = exposure_3,
primary_outcome = primary_outcome,
enr_community = enr_community
)
logit_model_1 <- geeglm(formula = primary_outcome ~ exposure_1*exposure_2 + exposure_3 , family = binomial, id = enr_community, corstr = "independence", data = df)
lcf <- logit_model_1 %>%
tidy(exponentiate = TRUE, conf.int = TRUE, conf.level = 0.95) %>%
mutate(model = "Main Result") %>%
mutate(exposure = "Exposure 1") %>%
filter(term == "exposure_1")
lcftable <- lcf |>
# 对估计值和95%置信区间四舍五入到小数点后2位,以满足期刊要求
mutate(across(
c(estimate, conf.low, conf.high, p.value),
~ str_pad(
round(.x,2),
width = 4,
pad = "0",
side = "right"
))) %>%
mutate(estimate = case_when(
estimate == "1000" ~ "1.00",
TRUE ~ estimate
)) %>%
mutate(conf.low = case_when(
conf.low == "1000" ~ "1.00",
TRUE ~ conf.low
)) %>%
mutate(conf.high = case_when(
conf.high == "2000" ~ "2.00",
TRUE ~ conf.high
)) %>%
mutate(model = fct_rev(fct_relevel(model, "Model")))
lcftable <- lcftable %>%
# 转换回字符
mutate(estimate = format(estimate, n.small = 3),
conf.low = format(conf.low, n.small = 3),
conf.high = format(conf.high, n.small = 3),
estimate_lab = paste0(estimate, " (", as.character(conf.low), " -", conf.high, ")")) |>
mutate(p.value = as.numeric(p.value)) %>%
# 四舍五入p值到两位小数,除非p < 0.001
mutate(p.value = case_when(
p.value < .001 ~ "<0.001",
round(p.value, 2) == .05 ~ as.character(round(p.value,3)),
p.value < .01 ~ str_pad( # 如果小于0.01,则再多保留一位小数
as.character(round(p.value, 3)),
width = 4,
pad = "0",
side = "right"
),
p.value > 0.995 ~ "1.00",
TRUE ~ str_pad( # 否则只四舍五入到2位小数,并填充字符串以使0.20读作0.20
as.character(round(p.value, 2)),
width = 4,
pad = "0",
side = "right"
))) %>%
mutate(estimate = as.numeric(estimate),
conf.low = as.numeric(conf.low),
conf.high = as.numeric(conf.high))
lcfplot1 <- lcftable %>%
ggplot(mapping = aes(x = estimate, xmin = conf.low, xmax = conf.high, y = model)) + #y = model rather than term
geom_pointrange() +
# scale_x_continuous(breaks = c(0, 1.0, 1.5, 2.0)) + #trans = scales::pseudo_log_trans(base = 10)
geom_vline(xintercept = 1) +
geom_point(size = 1.0) +
# 调整分面网格
facet_wrap(exposure~., ncol = 1, scale = "free_y") +
xlab("Incident Odds Ratio, IOR (95% Confidence Interval)") +
ggtitle("Associations between exposures and outcome") +
geom_text(aes(x = 3.5, label = estimate_lab), hjust = -0.5, size = 3) +
geom_text(aes(x = 2.5, label = p.value), hjust = -3.5, size = 3) +
coord_cartesian(xlim = c(-0.1, 7.0)) +
scale_x_continuous(trans = "pseudo_log", breaks = c(-1.0, 0, 1.0, 2.0, 5.0)) +
theme(panel.background = element_blank(),
panel.spacing = unit(1, "lines"),
panel.border = element_rect(fill = NA, color = "white"),
strip.background = element_rect(colour="white", fill="white"),
strip.placement = "outside",
text = element_text(size = 12),
strip.text = element_text(face = "bold.italic", size = 11, hjust = 0.33), #element_blank(),
plot.title = element_text(face = "bold", size = 13, hjust =
<details>
<summary>英文:</summary>
I have created a series of GEE models using geeglm(). I would like to calculate the marginal odds ratios for primary_outcome when exposure_1 is positive and negative.
I would like to plot the overall odds and the two marginal odds ratios sequentially in a forest plot to allow for comparison. However, margins() does not accept a geeglm() model.
I have not been able to figure out another way to use geeglm() output with margins. I would appreciate any help.
Here is a generic version of the code which creates a geeglm model and shows how I am translating it into a forest plot
library(ggplot2)
library(dplyr)
# Set a random seed for reproducibility
set.seed(13)
# Number of rows in the dataframe
n <- 100
# Generate made-up data for exposures and outcome
exposure_1 <- sample(0:1, n, replace = TRUE)
exposure_2 <- sample(0:1, n, replace = TRUE)
exposure_3 <- sample(0:1, n, replace = TRUE)
# Generate made-up data for the primary outcome (assuming a binary outcome)
# The outcome could be generated based on some logic or model if needed
primary_outcome <- rbinom(n, 1, 0.3 + 0.2 * exposure_1 + 0.1 * exposure_2 - 0.1 * exposure_3)
# Generate made-up data for the community/grouping variable
enr_community <- sample(letters[1:5], n, replace = TRUE)
# Combine the generated data to create the dataframe
df <- data.frame(
exposure_1 = exposure_1,
exposure_2 = exposure_2,
exposure_3 = exposure_3,
primary_outcome = primary_outcome,
enr_community = enr_community
)
logit_model_1 <- geeglm(formula = primary_outcome ~ exposure_1*exposure_2 + exposure_3 , family = binomial, id = enr_community, corstr = "independence", data = df)
lcf <- logit_model_1 %>%
tidy(exponentiate = TRUE, conf.int = TRUE, conf.level = 0.95) %>%
mutate(model = "Main Result") %>%
mutate(exposure = "Exposure 1") %>%
filter(term == "exposure_1")
lcftable <- lcf |>
# round estimates and 95% CIs to 2 decimal places for journal specifications
mutate(across(
c(estimate, conf.low, conf.high, p.value),
~ str_pad(
round(.x,2),
width = 4,
pad = "0",
side = "right"
))) %>%
mutate(estimate = case_when(
estimate == "1000" ~ "1.00",
TRUE ~ estimate
)) %>%
mutate(conf.low = case_when(
conf.low == "1000" ~ "1.00",
TRUE ~ conf.low
)) %>%
mutate(conf.high = case_when(
conf.high == "2000" ~ "2.00",
TRUE ~ conf.high
)) %>%
mutate(model = fct_rev(fct_relevel(model, "Model")))
lcftable <- lcftable %>%
#convert back to characters
mutate(estimate = format(estimate, n.small = 3),
conf.low = format(conf.low, n.small = 3),
conf.high = format(conf.high, n.small = 3),
estimate_lab = paste0(estimate, " (", as.character(conf.low), " -", conf.high, ")")) |>
mutate(p.value = as.numeric(p.value)) %>%
# round p-values to two decimal places, except in cases where p < .001
mutate(p.value = case_when(
p.value < .001 ~ "<0.001",
round(p.value, 2) == .05 ~ as.character(round(p.value,3)),
p.value < .01 ~ str_pad( # if less than .01, go one more decimal place
as.character(round(p.value, 3)),
width = 4,
pad = "0",
side = "right"
),
p.value > 0.995 ~ "1.00",
TRUE ~ str_pad( # otherwise just round to 2 decimal places and pad string so that .2 reads as 0.20
as.character(round(p.value, 2)),
width = 4,
pad = "0",
side = "right"
))) %>%
mutate(estimate = as.numeric(estimate),
conf.low = as.numeric(conf.low),
conf.high = as.numeric(conf.high))
lcfplot1 <- lcftable %>%
ggplot(mapping = aes(x = estimate, xmin = conf.low, xmax = conf.high, y = model)) + #y = model rather than term
geom_pointrange() +
# scale_x_continuous(breaks = c(0, 1.0, 1.5, 2.0)) + #trans = scales::pseudo_log_trans(base = 10)
geom_vline(xintercept = 1) +
geom_point(size = 1.0) +
#adjust facet grid
facet_wrap(exposure~., ncol = 1, scale = "free_y") +
xlab("Incident Odds Ratio, IOR (95% Confidence Interval)") +
ggtitle("Associations between exposures and outcome") +
geom_text(aes(x = 3.5, label = estimate_lab), hjust = -0.5, size = 3) +
geom_text(aes(x = 2.5, label = p.value), hjust = -3.5, size = 3) +
coord_cartesian(xlim = c(-0.1, 7.0)) +
scale_x_continuous(trans = "pseudo_log", breaks = c(-1.0, 0, 1.0, 2.0, 5.0)) +
theme(panel.background = element_blank(),
panel.spacing = unit(1, "lines"),
panel.border = element_rect(fill = NA, color = "white"),
strip.background = element_rect(colour="white", fill="white"),
strip.placement = "outside",
text = element_text(size = 12),
strip.text = element_text(face = "bold.italic", size = 11, hjust = 0.33), #element_blank(),
plot.title = element_text(face = "bold", size = 13, hjust = -11, vjust = 4),
axis.title.x = element_text(size = 10, vjust = -1, hjust = 0.05),
axis.title.y = element_blank(),
panel.grid.major.x = element_line(color = "#D3D3D3",
size = 0.3, linetype = 2),
plot.margin = unit(c(1, 3, 0.5, 0.5), "inches")) +
geom_errorbarh(height = 0)
</details>
# 答案1
**得分**: 2
`margins`软件包已经有一段时间没有更新,基本上处于维护模式。我建议转向更新的、通常更好的`marginaleffects`软件包。Vincent编写了一本出色的指南书:https://vincentarelbundock.github.io/marginaleffects/
我没有尝试过,但我可以看到`geeglm`模型是受支持的对象类别。
<details>
<summary>英文:</summary>
The `margins` package has not been updated for some time and is basically in maintenance mode. I would suggest moving over to the newer and generally much better `marginaleffects` package. Vincent has put together a phenomenal guidebook: https://vincentarelbundock.github.io/marginaleffects/
I have not tried it, but I can see that geeglm models are a supported object class.
</details>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论