Creating a grouped boxplot with different numbers of rows for each grouped column?


Question

I have data that I would like to compare in a grouped boxplot, that is, I want to compare the before/after response to each treatment. The problem is that the number of trials differs between treatment types, so I cannot put the vectors into a single data frame (the `data.frame()` call fails with an error):

```r
QXpre <- c(3,4,2,1,4,5,4,2,8)
QXpost <- c(0,4,0,0,0,7,0,1,6)
lidopre <- c(5,3,4,5,6)
lidopost <- c(0,0,0,1,2)
vehipre <- c(3,3,5,3,4,3,4)
vehipost <- c(4,3,3,12,6,4,10)

# Fails: the vectors have 9, 5, and 7 values, which data.frame() cannot combine
DF1D <- data.frame(QXpre, QXpost, lidopre, lidopost, vehipre, vehipost)
```

To clarify: within each group I would like to compare the pre and post values, but have all groups appear on the same plot so that I can compare statistics across groups.

Thank you!

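Answer 1

Score: 3

Instead of putting all vectors into one data frame, create a list of data frames, one per treatment. Then reshape each one to long (tidy) format using e.g. `tidyr::pivot_longer` and bind them by rows; I use `purrr::imap_dfr` for convenience: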

```r
library(tidyverse)

dat <- list(
  QX = data.frame(QXpre, QXpost),
  lido = data.frame(lidopre, lidopost),
  vehi = data.frame(vehipre, vehipost)
) |> 
  purrr::imap_dfr(~ tidyr::pivot_longer(.x, everything(), names_prefix = .y), .id = "treatment")

head(dat)
#> # A tibble: 6 × 3
#>   treatment name  value
#>   <chr>     <chr> <dbl>
#> 1 QX        pre       3
#> 2 QX        post      0
#> 3 QX        pre       4
#> 4 QX        post      4
#> 5 QX        pre       2
#> 6 QX        post      0

dat$name <- factor(dat$name, levels = c("pre", "post"))

ggplot(dat, aes(treatment, value, fill = name)) +
  geom_boxplot()
```

[![enter image description here][1]][1]

  [1]: https://i.stack.imgur.com/lgZVp.png
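
A side note on `purrr::imap_dfr`: as far as I know, the `*_dfr` helpers have been marked as superseded since purrr 1.0.0, although they still work. Below is a minimal sketch of the same reshaping with `purrr::imap()` followed by `purrr::list_rbind()`. It assumes purrr >= 1.0.0, tidyr, and R >= 4.1 (for `|>` and the `\(x)` lambda syntax); the name `dat2` is illustrative only.

```r
library(purrr)
library(tidyr)

# Data from the question
QXpre <- c(3,4,2,1,4,5,4,2,8)
QXpost <- c(0,4,0,0,0,7,0,1,6)
lidopre <- c(5,3,4,5,6)
lidopost <- c(0,0,0,1,2)
vehipre <- c(3,3,5,3,4,3,4)
vehipost <- c(4,3,3,12,6,4,10)

dat2 <- list(
  QX = data.frame(QXpre, QXpost),
  lido = data.frame(lidopre, lidopost),
  vehi = data.frame(vehipre, vehipost)
) |>
  # Reshape each data frame to long format, stripping the treatment prefix
  # from the column names so that only "pre" / "post" remain
  imap(\(d, nm) pivot_longer(d, everything(), names_prefix = nm)) |>
  # Bind the pieces by rows; the list names end up in a "treatment" column
  list_rbind(names_to = "treatment")

# One row per observation (18 + 10 + 14 values in total)
nrow(dat2)
```

`dat2` should contain the same three columns as `dat` (`treatment`, `name`, `value`), so the `factor()` and `ggplot()` calls from the answer apply to it unchanged.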

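Answer 2

Score: 1

Just to offer another solution: you can create a named list of all your vectors and then use `stack()` to create a data frame in long format. Afterwards you can use `strsplit()` to create two variables for your groups and time points. The rest is the same as in stefan's answer.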

```r
library(ggplot2)

vector.list <- list(
  QXpre = c(3,4,2,1,4,5,4,2,8),
  QXpost = c(0,4,0,0,0,7,0,1,6),
  lidopre = c(5,3,4,5,6),
  lidopost = c(0,0,0,1,2),
  vehipre = c(3,3,5,3,4,3,4),
  vehipost = c(4,3,3,12,6,4,10)
)

# stack() turns the named list into a long data frame with columns `values` and `ind`
df <- stack(vector.list)

# Split each name just before "pre" or "post" into a group and a time variable
df[, c("group", "time")] <- do.call(
  rbind,
  strsplit(as.character(df$ind), "(?<=.)(?=pre|post)", perl = TRUE)
)

# Set the order of pre and post
df$time <- factor(df$time, levels = c("pre", "post"))

ggplot(df, aes(group, values, fill = time)) +
  geom_boxplot()
```

Created on 2023-02-16 by the reprex package (v2.0.1)
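
Since the stated goal is to compare statistics across groups, it may help to look at the numbers behind the boxes as well. The following is a minimal base-R sketch, assuming the same vectors as in the question. It uses `sub()` instead of the `strsplit()` lookaround to separate group and time, and the names `long`, `n_obs`, and `med` are illustrative only.

```r
vector.list <- list(
  QXpre = c(3,4,2,1,4,5,4,2,8),
  QXpost = c(0,4,0,0,0,7,0,1,6),
  lidopre = c(5,3,4,5,6),
  lidopost = c(0,0,0,1,2),
  vehipre = c(3,3,5,3,4,3,4),
  vehipost = c(4,3,3,12,6,4,10)
)

# Long format: one row per observation, with the vector name in `ind`
long <- stack(vector.list)
nm <- as.character(long$ind)

# Drop the trailing "pre"/"post" to get the group, keep it to get the time
long$group <- sub("(pre|post)$", "", nm)
long$time <- factor(sub("^.*(pre|post)$", "\\1", nm), levels = c("pre", "post"))

# Number of observations per group and time point (unequal across groups)
n_obs <- table(long$group, long$time)

# Median per group and time point, i.e. the centre line of each box
med <- tapply(long$values, list(long$group, long$time), median)

n_obs
med
```

Both results are small group-by-time tables, so the unequal sample sizes and the pre/post shift of each treatment can be read off side by side.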


huangapple
  • Posted on 2023-02-16 16:16:03
  • When reposting, please keep the link to this post: https://go.coder-hub.com/75469455.html