Creating a grouped boxplot with different numbers of rows for each grouped column?
Question
I have data that I would like to compare in a grouped boxplot, i.e. comparing the before/after response to each treatment. The issue is that the number of trials differs between treatments, so I cannot combine the vectors into a single data frame (the `data.frame()` call below throws an error):

```r
QXpre <- c(3, 4, 2, 1, 4, 5, 4, 2, 8)
QXpost <- c(0, 4, 0, 0, 0, 7, 0, 1, 6)
lidopre <- c(5, 3, 4, 5, 6)
lidopost <- c(0, 0, 0, 1, 2)
vehipre <- c(3, 3, 5, 3, 4, 3, 4)
vehipost <- c(4, 3, 3, 12, 6, 4, 10)

# Fails because the vectors have different lengths (9, 5 and 7)
DF1D <- data.frame(QXpre, QXpost, lidopre, lidopost, vehipre, vehipost)
```

To clarify: within each group I would like to compare the pre and post values, but have all groups show up on the same plot so I can compare statistics across groups.

Thank you!
Answer 1
Score: 3
Instead of putting all vectors in one data frame, create a list of data frames, one per treatment. Then reshape each one to long (tidy) format, e.g. with `tidyr::pivot_longer`, and bind them by rows; I use `purrr::imap_dfr` for convenience:

```r
library(tidyverse)

dat <- list(
  QX = data.frame(QXpre, QXpost),
  lido = data.frame(lidopre, lidopost),
  vehi = data.frame(vehipre, vehipost)
) |>
  # .x is each data frame and .y its list name; names_prefix strips the
  # treatment name from the column names, leaving "pre" / "post"
  purrr::imap_dfr(
    ~ tidyr::pivot_longer(.x, everything(), names_prefix = .y),
    .id = "treatment"
  )

head(dat)
#> # A tibble: 6 × 3
#>   treatment name  value
#>   <chr>     <chr> <dbl>
#> 1 QX        pre       3
#> 2 QX        post      0
#> 3 QX        pre       4
#> 4 QX        post      4
#> 5 QX        pre       2
#> 6 QX        post      0

# Put "pre" before "post" in the plot
dat$name <- factor(dat$name, levels = c("pre", "post"))

ggplot(dat, aes(treatment, value, fill = name)) +
  geom_boxplot()
```

[![Resulting grouped boxplot][1]][1]

[1]: https://i.stack.imgur.com/lgZVp.png
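To see what the reshaping step does in isolation, here is a minimal sketch on a single treatment. It reuses the lidocaine values from the question; the object names `lido` and `lido_long` are only illustrative.

```r
library(tidyr)

# One treatment only, using the lidocaine values from the question
lido <- data.frame(
  lidopre  = c(5, 3, 4, 5, 6),
  lidopost = c(0, 0, 0, 1, 2)
)

# names_prefix removes the leading "lido" from the column names,
# so the new `name` column holds just "pre" / "post"
lido_long <- pivot_longer(lido, everything(), names_prefix = "lido")

nrow(lido_long)         # 5 trials x 2 time points = 10 rows
unique(lido_long$name)  # the two time points
```

Because every treatment ends up with the same two columns (`name`, `value`), the long data frames can be stacked by rows even though the treatments have different numbers of trials.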
Answer 2
Score: 1
Just to offer another solution: you can create a named list of all your vectors and then use `stack()` to create a data.frame in long format. Afterwards you can use `strsplit()` to create two variables for your groups and time points. The rest is the same as in stefan's answer.

```r
library(ggplot2)

vector.list <- list(
  QXpre = c(3, 4, 2, 1, 4, 5, 4, 2, 8),
  QXpost = c(0, 4, 0, 0, 0, 7, 0, 1, 6),
  lidopre = c(5, 3, 4, 5, 6),
  lidopost = c(0, 0, 0, 1, 2),
  vehipre = c(3, 3, 5, 3, 4, 3, 4),
  vehipost = c(4, 3, 3, 12, 6, 4, 10)
)

# Create a data.frame in long format (columns: values, ind)
df <- stack(vector.list)

# Split each name just before "pre"/"post" into group and time
parts <- strsplit(as.character(df$ind), "(?<=.)(?=pre|post)", perl = TRUE)
df[, c("group", "time")] <- do.call(rbind, parts)

# Set the order of pre and post
df$time <- factor(df$time, levels = c("pre", "post"))

ggplot(df, aes(group, values, fill = time)) +
  geom_boxplot()
```

<sup>Created on 2023-02-16 by the reprex package (v2.0.1)</sup>
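As an alternative to the lookaround-based `strsplit()` call, the group and time point can also be extracted from the names with two `sub()` calls. This is only a sketch and assumes, as in the question, that every name ends in `pre` or `post`; `nms`, `group` and `time` are illustrative names.

```r
nms <- c("QXpre", "QXpost", "lidopre", "lidopost", "vehipre", "vehipost")

# Drop the trailing "pre"/"post" to get the treatment group
group <- sub("(pre|post)$", "", nms)

# Keep only the trailing "pre"/"post" to get the time point
time <- sub(".*(pre|post)$", "\\1", nms)

data.frame(nms, group, time)
```

Applied to the stacked data, the same two calls would take `as.character(df$ind)` in place of `nms`.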