April 19, 2023, 22:17:43

Produce a descriptive statistics table separated by "±"

Question

I'm new. I need to produce a tibble in which each variable is grouped by a factor and described by its mean and standard deviation, separated by "±".

Let's use the iris dataset.

iris %>%
  group_by(Species) %>%
  summarise(across(everything(), list(Mean=mean,dev.st=sd))) %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"), names_sep = "_")

How can I continue?
Thank you in advance

Answer 1

Score: 2

You could use `dplyr::reframe()` (added in dplyr 1.1.0; `dplyr::summarise()` also works here, because every summary returns one value per group) and add a combined summary statistic (`comb`) to your list of functions:

library(dplyr)
library(tidyr)

iris %>%
  group_by(Species) %>%
  reframe(across(everything(),
                 list(Mean = ~ as.character(mean(.x)),
                      dev.sd = ~ as.character(sd(.x)),
                      comb = ~ paste(mean(.x), sd(.x), sep = " ± ")))) %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"),
               names_sep = "_")

(From a comment) If you only want the combined column, with values shown to two decimal places, you could adjust:

iris %>%
  group_by(Species) %>%
  reframe(across(everything(),
                 list(comb = ~ paste(sprintf("%.2f", mean(.x)),
                                     sprintf("%.2f", sd(.x)), sep = " ± ")))) %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"),
               names_sep = "_")

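#' In this case you get exactly the same result if you replace reframe() with
#' summarise(). summarise() is not going away: dplyr 1.1.0 only deprecates
#' summarise() calls that return more or fewer than one row per group,
#' which is the case reframe() is meant for.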
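
The comment above says that `summarise()` and `reframe()` give the same result for this task. A minimal sketch to check that, assuming dplyr 1.1.0 or later is installed (`reframe()` does not exist before that). The helper `mean_sd()` is a hypothetical name introduced here for illustration; it is not part of dplyr.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: format "mean ± sd" with a fixed number of decimals.
mean_sd <- function(x, digits = 2) {
  fmt <- paste0("%.", digits, "f")
  paste(sprintf(fmt, mean(x)), sprintf(fmt, sd(x)), sep = " ± ")
}

# Same pipeline twice, once with each verb.
with_summarise <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), list(comb = mean_sd)))

with_reframe <- iris %>%
  group_by(Species) %>%
  reframe(across(everything(), list(comb = mean_sd)))

# Compare the two results as plain data frames.
same <- isTRUE(all.equal(as.data.frame(with_summarise),
                         as.data.frame(with_reframe)))
print(same)

# Either result can then be reshaped so each variable is a column.
long_stats <- with_summarise %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"),
               names_sep = "_")
print(long_stats)
```

Passing a named function such as `mean_sd` to `across()` keeps the formatting in one place, so changing `digits` changes every column at once.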

Note that to combine the columns with `pivot_longer`, all values must be of the same class, so I converted them to character. If you keep the data wide, you don't need the `as.character()` calls in the summary statistics.
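
As a sketch of the wide layout mentioned in the note, the version below skips `pivot_longer` and therefore leaves the mean and standard deviation numeric. It uses `summarise()` rather than `reframe()` and rounds the combined column to two decimals; both are choices made for this illustration, and the name `wide_stats` is made up here.

```r
library(dplyr)

# Wide layout: one row per species, one column per variable/statistic pair.
# Mean and dev.sd stay numeric; only the combined column is character.
wide_stats <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(),
                   list(Mean = mean,
                        dev.sd = sd,
                        comb = ~ paste(sprintf("%.2f", mean(.x)),
                                       sprintf("%.2f", sd(.x)),
                                       sep = " ± "))))

# Inspect the column types: <dbl> for Mean/dev.sd, <chr> for comb.
glimpse(wide_stats)
```

Keeping the numeric columns numeric means they can still be used in further calculations, while the `comb` columns are ready for display.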

Output:

  Species    variable Sepal.Length              Sepal.Width               Petal.Length              Petal.Width
  <fct>      <chr>    <chr>                     <chr>                     <chr>                     <chr>
1 setosa     Mean     5.006                     3.428                     1.462                     0.246
2 setosa     dev.sd   0.352489687213451         0.379064369096289         0.173663996480184         0.105385589380046
3 setosa     comb     5.006 ± 0.352489687213451 3.428 ± 0.379064369096289 1.462 ± 0.173663996480184 0.246 ± 0.105385589380046
4 versicolor Mean     5.936                     2.77                      4.26                      1.326
5 versicolor dev.sd   0.516171147063863         0.313798323378411         0.469910977239958         0.197752680004544
6 versicolor comb     5.936 ± 0.516171147063863 2.77 ± 0.313798323378411  4.26 ± 0.469910977239958  1.326 ± 0.197752680004544
7 virginica  Mean     6.588                     2.974                     5.552                     2.026
8 virginica  dev.sd   0.635879593274432         0.322496638172637         0.551894695663983         0.274650055636667
9 virginica  comb     6.588 ± 0.635879593274432 2.974 ± 0.322496638172637 5.552 ± 0.551894695663983 2.026 ± 0.274650055636667

