Why does kable in R undo a summarytools frequency option to not display cumulative percentages?
Question
I found an odd quirk, or maybe an intentional feature, and I am looking for a more elegant solution. Any ideas are appreciated. A reproducible code example is below, although I wasn't able to recreate the tables very well here with markdown.
Situation: When I use the `freq()` function from the summarytools package, I remove the cumulative results with the `cumul=FALSE` argument for a cleaner output. This works as expected. However, when I pipe the results into a `kable()` table (from knitr, styled with kableExtra), the cumulative columns reappear in the kable table. I am not sure why.
According to summarytools creator Dominic Comtois' page here:
> summarytools objects are not always compatible with packages focused on table formatting, such as formattable or kableExtra. However, `tb()` can be used as a “bridge”, an intermediary step turning `freq()` and `descr()` objects into simple tables that any package can work with.

I assume my problem is part of the compatibility issue mentioned there. When I add `tb()` as a bridge between `freq()` and `kable()`, the table contents appear as they should, without the cumulative columns. However, the table headings no longer keep their original names and revert to the tibble's column names. I also lose the Total row at the bottom.
So far, I have managed to rename the tibble columns to be closer to the original `freq()` output, but I haven't yet restored the Total row at the bottom. I assume there is a better approach than what I've done so far. Thanks in advance!
# set packages and data
library(dplyr)
library(knitr)
library(summarytools)
library(kableExtra)
set.seed(99)
d <- data.frame(group=sample(LETTERS[1:3], size=100, replace=TRUE))
# initial frequency output
d |> freq(group, cumul=FALSE)
# piping the freq results into kable brings the cumulative columns back
d |> freq(group, cumul=FALSE) |>
kable() |> kable_classic(full_width=FALSE)
# changing to a tibble before kable drops the Totals row and reverts the column headings
d |> freq(group, cumul=FALSE) |> tb() |>
kable() |> kable_classic(full_width=FALSE)
# minor fix on the column headings, but still no Total row
d |> freq(group, cumul=FALSE) |> tb() |>
rename("Group"="group", "Freq"="freq", "% Valid"="pct_valid", "% Total"="pct_tot") |>
kable() |> kable_classic(full_width=FALSE)
Answer 1
Score: 1
Here is a possible but verbose solution:
library(dplyr)
library(knitr)
library(summarytools)
library(kableExtra)
set.seed(99)
d <- data.frame(group=sample(LETTERS[1:3], size=100, replace=TRUE))
d |>
freq(group, cumul=FALSE) |>
tb() |>
(function(df){
# append a Total row; tb() drops it, so it is rebuilt by hand here
bind_rows(df, data.frame(group = "Total", freq = sum(df$freq), pct_valid = 100, pct_tot = 100))
})() |>
rename(Group = group, Freq = freq, `% Valid` = pct_valid , `% Total` = pct_tot ) |>
kable() |>
kable_classic(full_width=FALSE)
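The Total row above hardcodes both percentages as 100. As a hedged variation, the totals could instead be computed from the columns themselves. The sketch below uses base R only; `freq_tbl` is a made-up stand-in for the tibble that `tb()` returns (its values are illustrative, not real output), and `add_total_row()` is a hypothetical helper, not part of summarytools or dplyr.

```r
# Made-up stand-in for the result of freq(...) |> tb(); values are illustrative
freq_tbl <- data.frame(
  group     = c("A", "B", "C"),
  freq      = c(30, 45, 25),
  pct_valid = c(30, 45, 25),
  pct_tot   = c(30, 45, 25)
)

# Hypothetical helper: sum every column except the label column
# and append the result as a final "Total" row
add_total_row <- function(df, label_col, label = "Total") {
  num_cols <- setdiff(names(df), label_col)
  total <- as.data.frame(lapply(df[num_cols], sum, na.rm = TRUE))
  total[[label_col]] <- label
  rbind(df, total[names(df)])  # reorder columns to match df before binding
}

with_total <- add_total_row(freq_tbl, "group")
print(with_total)
```

The result can then be renamed and passed to `kable()` as in the pipeline above. Using `na.rm = TRUE` keeps the sums usable if the table contains a missing-values row with an `NA` percentage.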
Answer 2
Score: 0
Here's a more straightforward solution:
set.seed(99)
d <- data.frame(group=sample(c(LETTERS[1:3], NA), size=500, replace=TRUE))
freq(d$group)[,-c(3,5)] |>
kable(digits = 1) |>
kable_classic(full_width = FALSE)
To add a "Group" column title:
as.data.frame(freq(d$group)[,-c(3,5)]) |>
tibble::rownames_to_column("Group") |>
kable(digits = 1) |>
kable_classic(full_width = FALSE)
The reason `kable()` ends up displaying the cumulative columns is that `freq()` returns a matrix that always contains the cumulative values; it is summarytools' `print()` method that decides which columns and heading elements to hide or show, based on the freq object's attributes. (Check `attributes(freq(d$group))` and you'll see what I mean.)
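To make that mechanism concrete, here is a toy analog in base R. It is not summarytools' actual code: `make_toy_freq()`, the `toy_freq` class, and the `show_cumul` attribute are all invented for illustration. The point is only that the stored matrix keeps every column, while the custom print method consults an attribute to decide what to show.

```r
# Toy analog (NOT summarytools' implementation): the matrix always stores
# the cumulative column; an attribute records whether print() should show it
make_toy_freq <- function(x, cumul = TRUE) {
  counts <- table(x)
  pct <- 100 * as.numeric(counts) / sum(counts)
  m <- cbind(Freq = as.numeric(counts), Pct = pct, `Pct Cum.` = cumsum(pct))
  rownames(m) <- names(counts)
  structure(m, class = c("toy_freq", "matrix"), show_cumul = cumul)
}

# The print method hides the cumulative column when the attribute says so
print.toy_freq <- function(x, ...) {
  m <- unclass(x)
  if (!isTRUE(attr(x, "show_cumul"))) {
    m <- m[, colnames(m) != "Pct Cum.", drop = FALSE]
  }
  attr(m, "show_cumul") <- NULL
  print(round(m, 1))
  invisible(x)
}

tf <- make_toy_freq(c("A", "B", "B", "C"), cumul = FALSE)
print(tf)                              # printed view: no cumulative column
stored_cols <- colnames(unclass(tf))   # underlying data: all three columns
stored_cols
```

Any function that reads the matrix directly, rather than going through the custom print method, sees all the stored columns. That is the same situation `kable()` is in with a freq object.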
An extra tip for displaying frequency tables with a "valid" column: use the following knitr/kable options:
options(knitr.kable.NA = '')
This way, we get a blank cell in lieu of "NA".