Why does kable in R undo a summarytools frequency option to not display cumulative percentage?
Question

I found an odd quirk, or maybe an intentional feature, and am trying to find a more elegant solution. Any ideas are appreciated. A reproducible code example is below, but I wasn't able to recreate the tables very well here with markdown.

Situation: When I use the freq() function from the summarytools package, I remove the cumulative columns with the cumul=FALSE argument for a cleaner output. This works as expected. However, when I pipe the result into kable() (from knitr, styled with kableExtra), the cumulative columns reappear in the kable table. Not sure why.

According to summarytools creator Dominic Comtois' page:

> summarytools objects are not always compatible with packages focused on table formatting, such as formattable or kableExtra. However, tb() can be used as a “bridge”, an intermediary step turning freq() and descr() objects into simple tables that any package can work with.

I assume my problem is part of the compatibility issue mentioned there. When I add tb() as a bridge between freq() and kable(), the table contents appear as they should, without the cumulative columns. However, the table no longer keeps the original column headings and falls back to the tibble's column names. I also lose the Total row at the bottom.

So far, I have managed to rename the tibble columns to be closer to the original freq() output, but I haven't yet restored the Total row. I assume there is a better approach than what I've done so far. Thanks in advance!

# set packages and data
library(dplyr)
library(knitr)
library(summarytools)
library(kableExtra)
set.seed(99)
d <- data.frame(group=sample(LETTERS[1:3], size=100, replace=TRUE))

# initial frequency output
d |> freq(group, cumul=FALSE)

# piping the freq results into kable brings the cumulative columns back
d |> freq(group, cumul=FALSE) |> 
     kable() |> kable_classic(full_width=FALSE)

# changing to a tibble before kable drops the Total row and reverts the column headings
d |> freq(group, cumul=FALSE) |> tb() |> 
     kable() |> kable_classic(full_width=FALSE)

# minor fix on the column headings, but still no Total row
d |> freq(group, cumul=FALSE) |> tb() |> 
     rename("Group"="group", "Freq"="freq", "% Valid"="pct_valid", "% Total"="pct_tot") |> 
     kable() |> kable_classic(full_width=FALSE)

Answer 1

Score: 1

Here is a possible but verbose solution:

library(dplyr)
library(knitr)
library(summarytools)
library(kableExtra)

set.seed(99)
d <- data.frame(group=sample(LETTERS[1:3], size=100, replace=TRUE))

d |> 
  freq(group, cumul=FALSE) |> 
  tb() |> 
  (function(df){
    bind_rows(df, data.frame(group = "Total", freq = sum(df$freq), pct_valid = 100, pct_tot = 100))
  })() |>
  rename(Group = group, Freq = freq, `% Valid` = pct_valid, `% Total` = pct_tot) |> 
  kable() |> 
  kable_classic(full_width=FALSE)
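As a variation on the same idea, the anonymous function could be pulled out into a small named helper. This is only a sketch in base R: add_total_row() is a hypothetical name (not part of summarytools or dplyr), the demo data frame stands in for the output of freq() |> tb() with made-up numbers, and the helper assumes the first column holds the labels and every other column is numeric.

```r
# Hypothetical helper: append a "Total" row to a tb()-style frequency table.
# Assumes column 1 holds the category labels and all other columns are
# numeric counts or percentages that should simply be summed.
add_total_row <- function(df, label = "Total") {
  df <- as.data.frame(df, stringsAsFactors = FALSE)
  df[[1]] <- as.character(df[[1]])             # avoid factor-level issues in rbind()
  totals <- lapply(df[-1], sum, na.rm = TRUE)  # one sum per numeric column
  total_row <- data.frame(label, totals, stringsAsFactors = FALSE, check.names = FALSE)
  names(total_row) <- names(df)
  rbind(df, total_row)
}

# Stand-in for the result of freq(...) |> tb(); the numbers are made up.
demo <- data.frame(group = c("A", "B", "C"),
                   freq = c(30, 45, 25),
                   pct_valid = c(30, 45, 25),
                   pct_tot = c(30, 45, 25))
result <- add_total_row(demo)
print(result)
```

In the pipeline above, this would replace the (function(df){...})() step with add_total_row(). Note that summing percentage columns only makes sense when they are not cumulative.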

Answer 2

Score: 0

Here's a more straightforward solution:

library(knitr)
library(kableExtra)
library(summarytools)
set.seed(99)
d <- data.frame(group=sample(c(LETTERS[1:3], NA), size=500, replace=TRUE))

# drop columns 3 and 5 (the two cumulative columns) from the freq() matrix
freq(d$group)[,-c(3,5)] |>
  kable(digits = 1) |>
  kable_classic(full_width = FALSE)

To have the "Group" column title:

as.data.frame(freq(d$group)[,-c(3,5)]) |> 
  tibble::rownames_to_column("Group") |>
  kable(digits = 1) |>
  kable_classic(full_width = FALSE)

The reason kable() ends up displaying the cumulative columns is that freq() returns a matrix that always contains them; it is summarytools' print() method that decides which columns and heading elements to show or mask, based on the freq object's attributes. (Check attributes(freq(d$group)) and you'll see what I mean.)
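To make that mechanism concrete, here is a toy analogy in base R. It is not summarytools code: the toy_freq class, make_toy_freq() and the "cumul" attribute are all made up for illustration. It only shows how an object can store every column while its own print method hides some of them, so that any other function reading the matrix still sees everything.

```r
# Toy S3 class (hypothetical, not summarytools internals): the matrix always
# stores the cumulative column; only print() consults the "cumul" attribute.
make_toy_freq <- function(x, cumul = TRUE) {
  counts <- table(x)
  pct <- 100 * as.vector(counts) / sum(counts)
  m <- cbind(Freq = as.vector(counts), Pct = pct, `Pct Cum.` = cumsum(pct))
  rownames(m) <- names(counts)
  structure(m, cumul = cumul, class = "toy_freq")
}

print.toy_freq <- function(x, ...) {
  m <- unclass(x)
  show_cumul <- attr(m, "cumul")
  attr(m, "cumul") <- NULL
  if (!isTRUE(show_cumul)) {
    # mask the cumulative column at display time only
    m <- m[, !grepl("Cum\\.", colnames(m)), drop = FALSE]
  }
  print(round(m, 1))
  invisible(x)
}

toy <- make_toy_freq(c("A", "B", "B", "C"), cumul = FALSE)
print(toy)            # the class's own print method hides "Pct Cum."
print(colnames(toy))  # but the stored matrix still has all three columns
```

A function such as kable() never goes through that print method; it works from the stored matrix, which is why an option that only affects printing has no effect there.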

An extra tip for displaying frequency tables with a "valid" column: use the following knitr/kable option:

options(knitr.kable.NA = '')

This way, a blank cell is shown in lieu of "NA".
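A minimal sketch of what that option does, using a plain data frame instead of a freq() object. The table values are made up, and format = "pipe" is chosen here only so the result can be printed in the console.

```r
library(knitr)

# Made-up frequency table; the NA marks a row with no valid percentage.
tab <- data.frame(Group = c("A", "B", "Missing"),
                  Freq = c(3, 2, 1),
                  `% Valid` = c(60, 40, NA),
                  check.names = FALSE)

old <- options(knitr.kable.NA = "")   # blank out NA cells in kable() output
with_blank <- kable(tab, format = "pipe")
options(old)                          # restore the previous setting
with_na <- kable(tab, format = "pipe")

print(with_blank)   # the "% Valid" cell of the last row is empty
print(with_na)      # the same cell shows NA
```

The option is global, so saving and restoring the old value as above keeps it from leaking into other tables in the same session.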

huangapple
  • Posted on August 9, 2023, 00:38:02
  • Please keep this link when reposting: https://go.coder-hub.com/76861581.html