How to mask certain cells' info in a gtsummary table

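Question

I have some sensitive information in my data that I need to mask below a certain threshold (to comply with a data use agreement (DUA) and prevent re-identification). I'm using tbl_svysummary() from gtsummary. In my example, I would like to mask cells with size ≤ 100: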

library(gtsummary)
library(survey)

# named tbl_titanic so the object does not shadow the tbl_svysummary() function
tbl_titanic <-
  svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) %>%
  tbl_svysummary(by = Survived, percent = "row", include = c(Class, Age))

tbl_titanic

[Image: the table produced by the code above]

I want to show the Child row as:

[Image: the desired table]

EDIT: Reproducible example for multiple stat_ columns and 0/1 variables:

library(gtsummary)
library(survey)
library(dplyr)  # for mutate()

supp_outcomes <-
  as.data.frame(Titanic) %>%
  mutate(Female = ifelse(Sex == "Female", 1, 0)) %>%
  svydesign(~1, data = ., weights = ~Freq) %>%
  tbl_svysummary(by = Survived, percent = "row", 
                 include = c(Age, Female, Class)) %>%
  add_overall() %>%
  add_p()

supp_outcomes

The edited code from @Marco's suggested solution:

library(dplyr)
library(tidyr)  # for separate()

supp_outcomes$table_body <- supp_outcomes$table_body %>%
  mutate(extra1 = stat_0,
         extra2 = stat_1,
         extra3 = stat_2) %>%
  # each stat_ column gets its own count column; extra = "drop" discards the "(xx%)" part
  separate(extra1, c("number0"), sep = ' \\(', extra = "drop") %>%
  separate(extra2, c("number1"), sep = ' \\(', extra = "drop") %>%
  separate(extra3, c("number2"), sep = ' \\(', extra = "drop") %>%
  # remove thousands separators (e.g. "1,490") before converting to numeric
  mutate(across(c(number0, number1, number2), ~ as.numeric(gsub(",", "", .x)))) %>%
  mutate(stat_0 = case_when(
    number0 < 200 & number0 > 0 & var_type %in% c("dichotomous", "categorical") ~ "TOO FEW",
    TRUE ~ stat_0),
    stat_1 = case_when(
      number1 < 200 & number1 > 0 & var_type %in% c("dichotomous", "categorical") ~ "TOO FEW",
      TRUE ~ stat_1),
    stat_2 = case_when(
      number2 < 200 & number2 > 0 & var_type %in% c("dichotomous", "categorical") ~ "TOO FEW",
      TRUE ~ stat_2)) %>%
  select(!c(number0, number1, number2))

supp_outcomes

Answer 1

Score: 1

You can do further tidyverse manipulation on the output object via its table_body, like this:

library(tidyverse)
data(mtcars)

library(gtsummary)
output <- mtcars[,1:2] %>% tbl_summary() 

output$table_body

# A tibble: 5 × 6
  variable var_type    var_label row_type label stat_0           
  <chr>    <chr>       <chr>     <chr>    <chr> <chr>            
1 mpg      continuous  mpg       label    mpg   19.2 (15.4, 22.8)
2 cyl      categorical cyl       label    cyl   NA               
3 cyl      categorical cyl       level    4     11 (34%)         
4 cyl      categorical cyl       level    6     7 (22%)          
5 cyl      categorical cyl       level    8     14 (44%)  

# Don't show cell information when N of cyl is less than 10
output$table_body <- output$table_body %>% 
  mutate(extra = stat_0) %>% 
  # keep only the text before " ("; extra = "drop" discards the rest without a warning
  separate(extra, c("number"), sep = ' \\(', extra = "drop") %>% 
  mutate(number = as.numeric(number)) %>% 
  mutate(stat_0 = case_when(number < 10 & var_type == "categorical" ~ "TOO FEW",
                            TRUE ~ stat_0)) %>% 
  select(!number)

output

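The masking condition in case_when targets categorical variables: a cell is replaced when its count falls below the threshold. You can adapt it to any other variable or condition.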
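
As one possible way to adapt this to several stat_ columns at once, here is a minimal base R sketch. The helper mask_small_cells() and the toy data frame are hypothetical and not part of gtsummary; the sketch assumes each cell has the form "n (p%)" with the count first, and it strips thousands separators such as "1,490" before comparing.

```r
# Hypothetical helper (not part of gtsummary): mask "n (p%)" cells whose count n
# is positive but below `threshold`. Assumes the count is the first thing in the cell.
mask_small_cells <- function(stat, var_type, threshold = 10, label = "TOO FEW") {
  # text before " (" with thousands separators removed, e.g. "1,490 (44%)" -> 1490
  n <- suppressWarnings(as.numeric(gsub(",", "", sub(" \\(.*$", "", stat))))
  small <- !is.na(n) & n > 0 & n < threshold &
    var_type %in% c("categorical", "dichotomous")
  ifelse(small, label, stat)
}

# Toy stand-in for the table_body of a gtsummary object (made-up values)
body <- data.frame(
  var_type = c("continuous", "categorical", "categorical", "categorical"),
  stat_0   = c("19.2 (15.4, 22.8)", NA, "18 (56%)", "1,490 (44%)"),
  stat_1   = c("17.3 (15.0, 21.0)", NA, "7 (22%)", "12 (38%)"),
  stringsAsFactors = FALSE
)

# Apply the helper to every column whose name starts with "stat_"
stat_cols <- grep("^stat_", names(body), value = TRUE)
body[stat_cols] <- lapply(body[stat_cols], mask_small_cells,
                          var_type = body$var_type, threshold = 10)
body
```

With a real table, the same lapply() call would presumably be pointed at output$table_body instead of the toy data frame, with the threshold set to whatever the data use agreement requires.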


  • Posted by huangapple on February 14, 2023, 08:16:13
  • Please keep this link when reposting: https://go.coder-hub.com/75442357.html