February 14, 2023, 08:16:13

How to mask certain cell's info from a gtsummary table

Question

I have some sensitive information in my data that I need to hide when it falls below a certain threshold (to comply with our DUA and prevent re-identification of the data). I'm using tbl_svysummary() from gtsummary. In my example, I'd like to mask cells with "cell size ≤ 100":

library(gtsummary)
library(survey)
tbl_svysummary <-
  svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) %>%
  tbl_svysummary(by = Survived, percent = "row", include = c(Class, Age))
tbl_svysummary

I want to show child info as:

EDIT: Reproducible example for multiple stat_ columns and 0/1 variables:

library(dplyr)      # needed for mutate()
library(gtsummary)
library(survey)
supp_outcomes <-
  as.data.frame(Titanic) %>%
  mutate(Female = ifelse(Sex == "Female", 1, 0)) %>%
  svydesign(~1, data = ., weights = ~Freq) %>%
  tbl_svysummary(by = Survived, percent = "row",
                 include = c(Age, Female, Class)) %>%
  add_overall() %>%
  add_p()
supp_outcomes

The edited code from @Marco's suggested solution:

supp_outcomes$table_body <- supp_outcomes$table_body %>%
  mutate(extra1 = stat_0,
         extra2 = stat_1,
         extra3 = stat_2) %>%
  separate(extra1, c("number"), sep = ' \\(') %>%
  separate(extra2, c("number"), sep = ' \\(') %>%
  separate(extra3, c("number"), sep = ' \\(') %>%
  mutate(number = as.numeric(number)) %>%
  mutate(stat_0 = case_when(
    number < 200 & number > 0 & var_type %in% c("dichotomous", "categorical") ~ "TOO FEW",
    TRUE ~ stat_0),
    stat_1 = case_when(
      number < 200 & number > 0 & var_type %in% c("dichotomous", "categorical") ~ "TOO FEW",
      TRUE ~ stat_1),
    stat_2 = case_when(
      number < 200 & number > 0 & var_type %in% c("dichotomous", "categorical") ~ "TOO FEW",
      TRUE ~ stat_2)) %>%
  select(!number)
supp_outcomes

Answer 1

Score: 1

You can do more tidyverse manipulation on the output object like this (using the table_body):

library(tidyverse)
data(mtcars)
library(gtsummary)
output <- mtcars[, 1:2] %>% tbl_summary()
output$table_body
#> # A tibble: 5 × 6
#>   variable var_type    var_label row_type label stat_0
#>   <chr>    <chr>       <chr>     <chr>    <chr> <chr>
#> 1 mpg      continuous  mpg       label    mpg   19.2 (15.4, 22.8)
#> 2 cyl      categorical cyl       label    cyl   NA
#> 3 cyl      categorical cyl       level    4     11 (34%)
#> 4 cyl      categorical cyl       level    6     7 (22%)
#> 5 cyl      categorical cyl       level    8     14 (44%)

# Don't show cell information when N of cyl is less than 10
output$table_body <- output$table_body %>%
  mutate(extra = stat_0) %>%
  # keep only the text before " ("; extra = "drop" silences the discarded-pieces warning
  separate(extra, c("number"), sep = ' \\(', extra = "drop") %>%
  mutate(number = as.numeric(number)) %>%
  mutate(stat_0 = case_when(number < 10 & var_type == "categorical" ~ "TOO FEW",
                            TRUE ~ stat_0)) %>%
  select(!number)
output

The condition in case_when applies only to categorical variables and replaces the cell text when the count falls below the threshold value. You can adapt it to any other variable type or condition.
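For tables with several stat_ columns (as in the edited question), each column needs its own count parsed, otherwise one column's count ends up deciding the mask for all of them. Below is a minimal base-R sketch of that idea. The helper name mask_small_cells() is hypothetical (it is not part of gtsummary), the data frame is a made-up stand-in for table_body, and the parsing assumes cells formatted like "11 (34%)" or "1,490 (68%)".

```r
# Hypothetical helper (not a gtsummary function): mask cells whose leading
# count is positive but below `threshold`.
mask_small_cells <- function(x, threshold, label = "TOO FEW") {
  # Keep the text before the first " (" and drop thousands separators
  count <- suppressWarnings(as.numeric(gsub(",", "", sub(" \\(.*$", "", x))))
  # NA cells (label rows) and zero counts are left untouched
  small <- !is.na(count) & count > 0 & count < threshold
  x[small] <- label
  x
}

# Made-up stand-in for the table_body of a gtsummary table
body <- data.frame(
  var_type = c("continuous", "categorical", "categorical", "categorical"),
  stat_0   = c("19.2 (15.4, 22.8)", NA, "2,092 (95%)", "109 (5.0%)"),
  stat_1   = c("18.0 (14.0, 21.0)", NA, "1,438 (69%)", "52 (48%)"),
  stat_2   = c("20.1 (16.0, 23.0)", NA, "654 (31%)", "57 (52%)"),
  stringsAsFactors = FALSE
)

# Apply the helper to every stat_ column, only on categorical/dichotomous rows
stat_cols <- grep("^stat_", names(body), value = TRUE)
rows <- body$var_type %in% c("categorical", "dichotomous")
for (col in stat_cols) {
  body[rows, col] <- mask_small_cells(body[rows, col], threshold = 100)
}
body
```

On a real table the same loop could be run on output$table_body, or wrapped in gtsummary's modify_table_body(). Keep in mind that the counts shown by tbl_svysummary() are weighted, so a rule defined on unweighted cell sizes would need the unweighted n instead of the displayed value.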
