Adding median and quartile range to multiple grouped violin plot in R

Question

I use the following code in R to produce a violin plot showing the distribution of 5 variables ("CAP1-5") across 3 groups (BP, BPoff, HC). This code works:

ggplot(data, aes(x = CAP, y = Value, fill = GROUP)) +
  geom_violin(scale = "width", trim = FALSE) +
  scale_fill_manual(values = c("BP" = "red", "BPoff" = "grey", "HC" = "white")) +
  xlab("CAP") +
  ylab("Value") +
  theme_minimal() +
  facet_wrap(~ CAP, scales = "free_x", nrow = 1)

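(Output attached.)

I would like to add the median and quartiles to every violin, but with the lines of code below I only manage to add them to the middle violin of each variable. How can I do this for all of them?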

geom_point(data = summary_data, aes(x = CAP, y = median), shape = 23, size = 3, fill = "white") +
  geom_errorbar(data = summary_data, aes(x = CAP, ymin = lower, ymax = upper), width = 0.2, color = "black")

Thank you so much!
Best


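Answer 1

Score: 0

There are a few ways to add quartiles to your plot.

The first is to use the draw_quantiles parameter of geom_violin. (The summ_df summary built in this block is used by the second option.)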

library(tidyverse)

df <- data.frame(group=rep(LETTERS[1:3], each=100),
                 var=rep(c("CAP1","CAP2","CAP3","CAP4","CAP5"), times=60),
                 value=rnorm(300))

# Median and quartiles per group and variable, in long format (one row per estimate)
summ_df <- df %>%
  group_by(group, var) %>%
  summarize(median = median(value),
            Lower = quantile(value, probs = 0.25),
            Upper = quantile(value, probs = 0.75),
            .groups = "drop") %>%
  pivot_longer(cols = median:Upper, names_to = "quantile", values_to = "estimate")

df %>% ggplot(aes(x=var, y=value, fill=group)) + 
  geom_violin(scale="width", trim=FALSE, draw_quantiles = c(0.25, 0.5, 0.75))

The second is to use geom_point with a data frame containing the summary estimates, together with position_dodge so the points line up with their violins.

df %>% ggplot(aes(x=var, y=value, fill=group)) + 
  geom_violin(scale="width", trim=FALSE) + 
  geom_point(data = summ_df, aes(x=var, y=estimate, group=group), 
             position = position_dodge(width=0.9))
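Applied to the layers from the question, the same idea might look like the sketch below. It assumes that summary_data has one row per CAP and GROUP with columns median, lower and upper (the question does not show how it was built, so the summarize step and the simulated data here are illustrative). The key changes are mapping group = GROUP and dodging both layers by the same width as the violins.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # simulated stand-in for the question's data frame
data <- data.frame(
  GROUP = rep(c("BP", "BPoff", "HC"), each = 100),
  CAP   = rep(paste0("CAP", 1:5), times = 60),
  Value = rnorm(300)
)

# Assumed structure: one row per CAP x GROUP with the median and quartiles
summary_data <- data %>%
  group_by(CAP, GROUP) %>%
  summarize(median = median(Value),
            lower  = quantile(Value, 0.25),
            upper  = quantile(Value, 0.75),
            .groups = "drop")

# 0.9 is the width the violins are dodged by when no width is set
dodge <- position_dodge(width = 0.9)

p <- ggplot(data, aes(x = CAP, y = Value, fill = GROUP)) +
  geom_violin(scale = "width", trim = FALSE) +
  geom_errorbar(data = summary_data,
                aes(x = CAP, ymin = lower, ymax = upper, group = GROUP),
                inherit.aes = FALSE, width = 0.2, color = "black",
                position = dodge) +
  geom_point(data = summary_data,
             aes(x = CAP, y = median, group = GROUP),
             inherit.aes = FALSE, shape = 23, size = 3, fill = "white",
             position = dodge) +
  scale_fill_manual(values = c("BP" = "red", "BPoff" = "grey", "HC" = "white")) +
  theme_minimal() +
  facet_wrap(~ CAP, scales = "free_x", nrow = 1)

print(p)
```

Without the group aesthetic, the summary layers have nothing to dodge by, so every marker is drawn at the centre of the x position, which is why only the middle violin appeared to be annotated.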

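The third is to use stat_summary with median_hilow, which requires the Hmisc package. By default median_hilow spans the central 95% of the data; passing fun.args = list(conf.int = 0.5) gives the quartiles instead.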

stat_sum_df <- function(fun, geom="crossbar", ...) {
  stat_summary(fun.data = fun, colour = "red", geom = geom, width = 0.2, ...)
}

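# fun.args = list(conf.int = 0.5) makes median_hilow return the 25th and 75th percentiles
df %>% ggplot(aes(x=var, y=value, fill=group)) + 
  geom_violin(scale="width", trim=FALSE) + 
  stat_sum_df("median_hilow", fun.args = list(conf.int = 0.5),
              mapping = aes(group = group), position=position_dodge(width=0.9))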
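If installing Hmisc is not an option, stat_summary can also take a small user-written function. The sketch below is one possible version: median_iqr is a hypothetical helper (not part of ggplot2) that returns the y, ymin and ymax columns that fun.data expects, and the data frame is simulated in the same way as above.

```r
library(ggplot2)

set.seed(1)  # simulated data with the same layout as the answer's df
df <- data.frame(group = rep(LETTERS[1:3], each = 100),
                 var   = rep(paste0("CAP", 1:5), times = 60),
                 value = rnorm(300))

# Hypothetical helper: median as y, quartiles as ymin/ymax
median_iqr <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(y = q[2], ymin = q[1], ymax = q[3])
}

p <- ggplot(df, aes(x = var, y = value, fill = group)) +
  geom_violin(scale = "width", trim = FALSE) +
  stat_summary(aes(group = group), fun.data = median_iqr,
               geom = "pointrange",
               position = position_dodge(width = 0.9))

print(p)
```

Because the summary is computed inside the plot, there is no separate summary data frame to keep in sync with the raw data.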

Posted by huangapple on July 10, 2023 at 22:15:53
Link to this post: https://go.coder-hub.com/76654636.html