June 26, 2023 17:06:27

dplyr summarise multiple variables based on condition

Question

I would like to summarise variables as mean and sd, or as median with lower and upper quartiles, depending on whether each variable is normally distributed.

Using mtcars as an example, this is how I am doing one variable at a time:

sum = mtcars %>%
  group_by(am) %>%
  summarise(QSEC = paste0(mean(qsec), " (", sd(qsec), ")"))

I'd like to do something like this (which does not work as written):

norm = c("qsec", "drat", "hp", "mpg")
sum= mtcars %>%
group_by(am) %>%
summarise(across(where(. %in% norm), . = paste0(mean(., na.rm = T), " (", sd(., na.rm = T) , ")"))
)

and add the corresponding line for median and quartiles.
I would also be happy with a for-loop solution followed by rbind.

Answer 1

Score: 3

I suppose you want to do something like this:

library("dplyr")
norm <- c("qsec", "drat", "hp", "mpg")
my_summary <- mtcars %>%
  group_by(am) %>%
  summarise(
    across(
      all_of(norm),
      ~ paste0(mean(.x, na.rm = TRUE), "(sd=", sd(.x, na.rm = TRUE), ")")
    ),
    across(
      !all_of(norm),
      ~ paste0(median(.x, na.rm = TRUE), "(", quantile(.x, 1/4, na.rm = TRUE), " - ", quantile(.x, 3/4, na.rm = TRUE), ")")
    )
  )

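You can simply use all_of to select the columns named in norm, or negate it with ! to select all the others. The grouping column am is excluded from across() automatically.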
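
The pasted strings carry full floating-point precision, which can be hard to read. As a minimal sketch of one way to tidy this up, the version below moves the formatting into two small helper functions and passes them to across(). The helpers fmt, mean_sd and median_iqr are hypothetical names introduced here for illustration; they are not part of dplyr or of the original answer, and the choice of one decimal place is an assumption.

```r
library(dplyr)

# Hypothetical helper: fixed number of decimals (one by default, an assumption).
fmt <- function(x, digits = 1) formatC(x, format = "f", digits = digits)

# Hypothetical helper: "mean (sd)" for variables treated as normal.
mean_sd <- function(x) {
  paste0(fmt(mean(x, na.rm = TRUE)), " (", fmt(sd(x, na.rm = TRUE)), ")")
}

# Hypothetical helper: "median (Q1 - Q3)" for all other variables.
median_iqr <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  paste0(fmt(q[2]), " (", fmt(q[1]), " - ", fmt(q[3]), ")")
}

# Columns assumed to be normally distributed (taken from the question).
norm <- c("qsec", "drat", "hp", "mpg")

my_summary <- mtcars %>%
  group_by(am) %>%
  summarise(
    across(all_of(norm), mean_sd),      # mean (sd) for the columns in norm
    across(!all_of(norm), median_iqr)   # median (Q1 - Q3) for the rest
  )

print(my_summary)
```

The result has one row per value of am, with every summarised column stored as a character string, so it is suited to reporting rather than further computation.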
