How to access the current by-category in a tbl_custom_summary from gtsummary?
Question

I would like to create a customized gtsummary table in which I need access to the number of cases in the group defined by `by`.

For example, I would like to calculate the proportion of patients with a response per stage, grouped by treatment.

Currently I use this workaround to access the treatment group:

    temp_drugname = pull(data %>% select(trt) %>% distinct())

This only works if at least one patient in the treatment group has the current stage. If a stage has no patients assigned to it, I can't find a way to calculate the correct total.

In the example below, I deleted the cases in group A with stage 1 cancer (T1). The total should be 21, not 54, so that confidence intervals can later be calculated correctly.

[![example with wrong total](https://i.stack.imgur.com/EUcHX.jpg)](https://i.stack.imgur.com/EUcHX.jpg)

    library(gtsummary)
    library(tidyverse)

    my_ratio_summary = function(numerator, na.rm = TRUE, conf.level = 0.95) {
      function(data, full_data, ...) {
        # Workaround: read the treatment group from the rows of the current cell
        temp_drugname = pull(data %>% select(trt) %>% distinct())
        if (length(temp_drugname) > 0) {
          druggroup_data = full_data %>% filter(trt == temp_drugname)
        } else {
          # Empty cell: no treatment name is available, so all rows are used
          druggroup_data = full_data
        }
        num <- sum(data[[numerator]], na.rm = na.rm)
        denom <- sum(druggroup_data[[numerator]], na.rm = na.rm)
        ratio <- num / denom
        dplyr::tibble(num = num, denom = denom, ratio = ratio)
      }
    }

    trial %>%
      filter(!(trt == 'Drug A' & stage == 'T1')) %>%
      tbl_custom_summary(
        include = c("stage"),
        by = "trt",
        stat_fns = ~ my_ratio_summary("response"),
        statistic = ~ "{ratio}%, {num}/{denom}",
        digits = ~ c(style_percent, 0, 0)
      )
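
To make the failure mode concrete, here is a minimal sketch on toy data (not the `trial` dataset). It assumes, as the question describes, that `data` arrives with zero rows for a cell that has no patients. The column values are made up for illustration.

```r
library(dplyr)

# Toy stand-in for `full_data`: Drug A has no T1 patients at all.
full_data <- tibble(
  trt      = c("Drug A", "Drug A", "Drug A", "Drug B", "Drug B", "Drug B"),
  stage    = c("T2", "T2", "T3", "T1", "T2", "T3"),
  response = c(1, 0, 1, 1, 1, 0)
)

# Assumed content of `data` for the Drug A / T1 cell: no rows.
data <- full_data %>% filter(trt == "Drug A", stage == "T1")

# With zero rows there is no treatment name to recover.
temp_drugname <- data %>% distinct(trt) %>% pull(trt)

# Same branching as the workaround: the empty case falls back to all rows.
druggroup_data <- if (length(temp_drugname) > 0) {
  full_data %>% filter(trt == temp_drugname)
} else {
  full_data
}

# Pooled over both treatments instead of Drug A only.
denom <- sum(druggroup_data$response, na.rm = TRUE)
```

In this toy case the denominator pools both treatment groups, which is the same kind of inflated total (54 instead of 21) reported in the question.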
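# Answer 1

**Score**: 0

One option is to use the additional arguments that are passed to the custom function via `...`. The first element is a `tibble` that describes the current group; it can be used to filter `full_data`, for which I use a `semi_join`. This works even if `data` does not contain any observations.

In the code below I added a `print` statement to show the contents of `...`: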
```r
library(gtsummary)
library(tidyverse)

my_ratio_summary <- function(numerator, na.rm = TRUE, conf.level = 0.95) {
  function(data, full_data, ...) {
    dots <- list(...)
    # Only added to print the content of `dots`
    print(dots)
    # Keep the rows of `full_data` that match the current `by` group
    druggroup_data <- full_data %>% semi_join(dots[[1]], by = dots$by)
    num <- sum(data[[numerator]], na.rm = na.rm)
    denom <- sum(druggroup_data[[numerator]], na.rm = na.rm)
    ratio <- num / denom
    dplyr::tibble(num = num, denom = denom, ratio = ratio)
  }
}

trial %>%
  filter(!(trt == "Drug A" & stage == "T1")) %>%
  tbl_custom_summary(
    include = c("stage"),
    by = "trt",
    stat_fns = ~ my_ratio_summary("response"),
    statistic = ~"{ratio}%, {num}/{denom}",
    digits = ~ c(style_percent, 0, 0)
  )
#> [[1]]
#> # A tibble: 1 × 2
#>   trt    stage
#>   <chr>  <fct>
#> 1 Drug A T1
#>
#> $variable
#> [1] "stage"
#>
#> $by
#> [1] "trt"
#>
#> $type
#> [1] "categorical"
#>
#> $stat_display
#> [1] "{ratio}%, {num}/{denom}"
...
```

[1]: https://i.stack.imgur.com/yzP0t.png
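
The `semi_join` step can also be tried in isolation. The sketch below uses toy data and a hand-built one-row tibble, `current_group`, as a stand-in for the first element of `...` shown in the printed output. Both are assumptions for illustration, not objects produced by gtsummary.

```r
library(dplyr)

# Toy stand-in for `full_data`: Drug A has no T1 patients at all.
full_data <- tibble(
  trt      = c("Drug A", "Drug A", "Drug A", "Drug B", "Drug B", "Drug B"),
  stage    = c("T2", "T2", "T3", "T1", "T2", "T3"),
  response = c(1, 0, 1, 1, 1, 0)
)

# Hand-built stand-in for the group descriptor: one row naming the
# current `by` level and the current level of the summarized variable.
current_group <- tibble(trt = "Drug A", stage = "T1")

# Rows of the current cell: none, because Drug A has no T1 patients.
cell_data <- full_data %>% semi_join(current_group, by = c("trt", "stage"))

# Rows of the whole treatment group: joining on `trt` only ignores `stage`,
# so the group is found even though the cell itself is empty.
druggroup_data <- full_data %>% semi_join(current_group, by = "trt")

num   <- sum(cell_data$response, na.rm = TRUE)
denom <- sum(druggroup_data$response, na.rm = TRUE)
```

Because the group descriptor exists independently of the rows in the cell, the denominator stays restricted to Drug A even when the numerator is computed from zero rows.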
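
The question mentions computing confidence intervals later, which is why the denominator matters. As a rough sketch of that follow-up step, the helper below (`ratio_ci`, a hypothetical name that is not part of gtsummary) uses base R's `binom.test()` to get an exact interval for `num / denom`. The numerator of 5 is an illustrative value, not a result from the `trial` data.

```r
# Hypothetical helper, not part of gtsummary: exact (Clopper-Pearson)
# interval for num / denom via base R's binom.test().
ratio_ci <- function(num, denom, conf.level = 0.95) {
  ci <- stats::binom.test(x = num, n = denom, conf.level = conf.level)$conf.int
  data.frame(
    ratio     = num / denom,
    conf.low  = ci[1],
    conf.high = ci[2]
  )
}

# Illustrative numerator with the two denominators named in the question.
ci_group  <- ratio_ci(num = 5, denom = 21)  # treatment-group total
ci_pooled <- ratio_ci(num = 5, denom = 54)  # pooled total from the fallback
```

Comparing `ci_group` with `ci_pooled` shows that both the point estimate and the interval shift when the pooled total is used, so the denominator has to be fixed before any interval is reported.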