dplyr: summarise multiple variables based on a condition

Question

I would like to summarise data as mean and sd, or as median with lower and upper quartiles, depending on whether each variable is normally distributed or not.

Using mtcars as an example, this is how I am doing one variable at a time:

library(dplyr)

sum = mtcars %>%
  group_by(am) %>%
  summarise(QSEC = paste0(mean(qsec), " (", sd(qsec), ")"))

I'd like to do something like this:

norm = c("qsec", "drat", "hp", "mpg")

sum = mtcars %>%
  group_by(am) %>%
  summarise(
    across(where(. %in% norm), . = paste0(mean(., na.rm = T), " (", sd(., na.rm = T), ")"))
  )

and add the corresponding line for the median and quartiles.
I would also be happy with a for-loop solution followed by rbind.

Answer 1

Score: 3

I suppose you want to do something like this:

library("dplyr")

norm <- c("qsec", "drat", "hp", "mpg")

my_summary <- mtcars %>%
  group_by(am) %>%
  summarise(
    across(
      all_of(norm),
      ~ paste0(mean(.x, na.rm = TRUE), "(sd=", sd(.x, na.rm = TRUE), ")")
    ),
    across(
      !all_of(norm),
      ~ paste0(median(.x, na.rm = TRUE), "(", quantile(.x, 1/4, na.rm = TRUE), " - ", quantile(.x, 3/4, na.rm = TRUE), ")")
    )
  )

You can simply use all_of(norm) to select the columns listed in norm, or negate it with !all_of(norm) to select every other column.
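As a possible refinement, the formatting could be moved into two small helper functions so that the numbers are rounded and the across() calls stay short. This is only a sketch: mean_sd and median_iqr are hypothetical names (not dplyr functions), and digits = 2 is an arbitrary choice.

```r
library(dplyr)

norm <- c("qsec", "drat", "hp", "mpg")

# Hypothetical helper: "mean (sd)" as a single string, rounded to `digits`.
mean_sd <- function(x, digits = 2) {
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  paste0(fmt(mean(x, na.rm = TRUE)), " (", fmt(sd(x, na.rm = TRUE)), ")")
}

# Hypothetical helper: "median (lower quartile - upper quartile)".
median_iqr <- function(x, digits = 2) {
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  paste0(fmt(median(x, na.rm = TRUE)), " (", fmt(q[1]), " - ", fmt(q[2]), ")")
}

my_summary <- mtcars %>%
  group_by(am) %>%
  summarise(
    across(all_of(norm), mean_sd),      # columns treated as normally distributed
    across(!all_of(norm), median_iqr)   # every other non-grouping column
  )
```

The result has one row per value of am, with every summarised column stored as character. The grouping column am is skipped by across() automatically, so it does not need to be excluded by hand.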

huangapple
  • Published on 2023-06-26 17:06:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76555183.html