Adding median and quartile range to a multi-group violin plot in R

Question

I use the following R code to draw a violin plot of 5 variables (CAP1 to CAP5) across 3 groups (BP, BPoff, HC). This code works:

    ggplot(data, aes(x = CAP, y = Value, fill = GROUP)) +
      geom_violin(scale = "width", trim = FALSE) +
      scale_fill_manual(values = c("BP" = "red", "BPoff" = "grey", "HC" = "white")) +
      xlab("CAP") +
      ylab("Value") +
      theme_minimal() +
      facet_wrap(~ CAP, scales = "free_x", nrow = 1)

(Output attached.)

I would like to add the median and quartiles to every violin. With the two layers below I only manage to add them to the middle violin of each variable. How can I add them to all of the violins?

    geom_point(data = summary_data, aes(x = CAP, y = median), shape = 23, size = 3, fill = "white") +
    geom_errorbar(data = summary_data, aes(x = CAP, ymin = lower, ymax = upper), width = 0.2, color = "black")

Thank you so much!
Best

Answer 1

Score: 0

There are a few ways to add quartiles to your plot.

The first is to use the draw_quantiles argument of geom_violin. The code below also builds summ_df, a long-format table of the per-group median and quartiles that the second option uses.

    library(tidyverse)

    # Toy data: 3 groups x 5 variables, 20 values per combination
    df <- data.frame(group = rep(LETTERS[1:3], each = 100),
                     var = rep(c("CAP1", "CAP2", "CAP3", "CAP4", "CAP5"), times = 60),
                     value = rnorm(300))

    # Median and quartiles per group/variable, reshaped to long format
    summ_df <- df %>%
      group_by(group, var) %>%
      summarize(median = median(value),
                Lower = quantile(value, probs = 0.25),
                Upper = quantile(value, probs = 0.75),
                .groups = "drop") %>%
      pivot_longer(cols = median:Upper, names_to = "quantile", values_to = "estimate")

    # Horizontal lines at the quartiles inside each violin
    df %>% ggplot(aes(x = var, y = value, fill = group)) +
      geom_violin(scale = "width", trim = FALSE, draw_quantiles = c(0.25, 0.5, 0.75))

The second is to use geom_point with a data frame containing the summary estimates, plus position_dodge so the points line up with the violins.

    df %>% ggplot(aes(x = var, y = value, fill = group)) +
      geom_violin(scale = "width", trim = FALSE) +
      # group = group makes the points dodge by group, matching the violins
      geom_point(data = summ_df, aes(x = var, y = estimate, group = group),
                 position = position_dodge(width = 0.9))
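
The same dodging idea can be applied to the error bars from the question. The following is a minimal sketch, not the asker's actual code: it assumes a wide summary table (here called summ_wide, a name made up for this example) with one row per group and variable, and it reuses the toy data layout from this answer. The key assumptions are that every summary layer maps group to the grouping column and uses the same dodge width as the violins.

```r
library(dplyr)
library(ggplot2)

set.seed(42)  # reproducible toy data, same layout as in the answer
df <- data.frame(
  group = rep(LETTERS[1:3], each = 100),
  var   = rep(paste0("CAP", 1:5), times = 60),
  value = rnorm(300)
)

# Hypothetical wide summary: one row per group/var, quartiles in columns
summ_wide <- df %>%
  group_by(group, var) %>%
  summarize(
    median = median(value),
    lower  = quantile(value, probs = 0.25),
    upper  = quantile(value, probs = 0.75),
    .groups = "drop"
  )

# One shared dodge object keeps violins, bars and points aligned
dodge <- position_dodge(width = 0.9)

p <- ggplot(df, aes(x = var, y = value, fill = group)) +
  geom_violin(scale = "width", trim = FALSE, position = dodge) +
  geom_errorbar(
    data = summ_wide,
    aes(x = var, ymin = lower, ymax = upper, group = group),
    inherit.aes = FALSE, width = 0.2, position = dodge
  ) +
  geom_point(
    data = summ_wide,
    aes(x = var, y = median, group = group),
    inherit.aes = FALSE, shape = 23, size = 3, fill = "white",
    position = dodge
  )
p
```

Without a group mapping in the summary layers, ggplot2 has nothing to dodge by, so each marker is drawn at the centre of its x position. That would be consistent with markers appearing only on the middle violin.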

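The third is to use stat_summary with median_hilow, which requires the Hmisc package to be installed. By default median_hilow spans the central 95% of the data, so pass conf.int = 0.5 to get the 25th and 75th percentiles.

    # Small wrapper around stat_summary; extra arguments are passed through
    stat_sum_df <- function(fun, geom = "crossbar", ...) {
      stat_summary(fun.data = fun, colour = "red", geom = geom, width = 0.2, ...)
    }

    df %>% ggplot(aes(x = var, y = value, fill = group)) +
      geom_violin(scale = "width", trim = FALSE) +
      stat_sum_df("median_hilow", fun.args = list(conf.int = 0.5),
                  mapping = aes(group = group), position = position_dodge(width = 0.9))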
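
If installing Hmisc is not an option, stat_summary also accepts any function that returns a data frame with ymin, y and ymax columns. The sketch below assumes a small helper, median_iqr, which is a hypothetical name introduced for this example and not part of ggplot2 or Hmisc.

```r
library(ggplot2)

set.seed(42)  # reproducible toy data, same layout as in the answer
df <- data.frame(
  group = rep(LETTERS[1:3], each = 100),
  var   = rep(paste0("CAP", 1:5), times = 60),
  value = rnorm(300)
)

# Hypothetical helper: median with the 25th and 75th percentiles,
# in the ymin / y / ymax layout that stat_summary(fun.data = ...) expects
median_iqr <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(ymin = q[1], y = q[2], ymax = q[3])
}

p <- ggplot(df, aes(x = var, y = value, fill = group)) +
  geom_violin(scale = "width", trim = FALSE) +
  stat_summary(
    aes(group = group),
    fun.data = median_iqr,
    geom = "pointrange",
    position = position_dodge(width = 0.9),
    shape = 23, size = 0.4
  )
p
```

Because the summary is computed from df itself, there is no separate summary table to keep in sync with the data.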

huangapple
  • Published on July 10, 2023, 22:15:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76654636.html