Posted on February 16, 2023, 16:16:03

Creating a grouped boxplot with different numbers of rows for each grouped column?

Question

I have data that I would like to compare in a grouped boxplot, that is, comparing the before/after response to each treatment. The issue is that the number of trials differs between treatment types, so I cannot create a data frame (I get an error when building the data frame):

QXpre <- c(3,4,2,1,4,5,4,2,8)
QXpost <- c(0,4,0,0,0,7,0,1,6)
lidopre <- c(5,3,4,5,6)
lidopost <- c(0,0,0,1,2)
vehipre <- c(3,3,5,3,4,3,4)
vehipost <- c(4,3,3,12,6,4,10)
DF1D <- data.frame(QXpre, QXpost, lidopre, lidopost, vehipre, vehipost)

To clarify: within each group I would like to compare the pre and post values, but have every group appear on the same plot so I can compare statistics across groups.

Thank you!

Answer 1

Score: 3

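Instead of putting all vectors in one data frame, create a list of data frames, one per treatment. Then reshape each one to long (tidy) format with, e.g., `tidyr::pivot_longer` and bind them by rows; `purrr::imap_dfr` is used here for convenience: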
```r
library(tidyverse)
dat <- list(
  QX = data.frame(QXpre, QXpost),
  lido = data.frame(lidopre, lidopost),
  vehi = data.frame(vehipre, vehipost)
) |> 
  purrr::imap_dfr(~ tidyr::pivot_longer(.x, everything(), names_prefix = .y), .id = "treatment")
head(dat)
#> # A tibble: 6 × 3
#>   treatment name  value
#>   <chr>     <chr> <dbl>
#> 1 QX        pre       3
#> 2 QX        post      0
#> 3 QX        pre       4
#> 4 QX        post      4
#> 5 QX        pre       2
#> 6 QX        post      0
dat$name <- factor(dat$name, levels = c("pre", "post"))
ggplot(dat, aes(treatment, value, fill = name)) +
  geom_boxplot()
```

[![Grouped boxplot of pre/post values for each treatment][1]][1]
  [1]: https://i.stack.imgur.com/lgZVp.png
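As a side note, the `map_dfr()` family has been superseded since purrr 1.0.0 in favour of `list_rbind()`. The following is only a sketch of the same reshaping under that assumption (purrr >= 1.0.0 and R >= 4.1 for `|>` and `\()`); the name `dat2` is made up here for illustration and the data are copied from the question.

```r
library(tidyr)
library(purrr)

# Data copied from the question
QXpre    <- c(3, 4, 2, 1, 4, 5, 4, 2, 8)
QXpost   <- c(0, 4, 0, 0, 0, 7, 0, 1, 6)
lidopre  <- c(5, 3, 4, 5, 6)
lidopost <- c(0, 0, 0, 1, 2)
vehipre  <- c(3, 3, 5, 3, 4, 3, 4)
vehipost <- c(4, 3, 3, 12, 6, 4, 10)

# `dat2` is a hypothetical name; one data frame per treatment, as in the answer
dat2 <- list(
  QX   = data.frame(QXpre, QXpost),
  lido = data.frame(lidopre, lidopost),
  vehi = data.frame(vehipre, vehipost)
) |>
  # Reshape each data frame to long format, stripping the treatment prefix
  imap(\(df, nm) pivot_longer(df, everything(), names_prefix = nm)) |>
  # Bind the rows; the list names become the "treatment" column
  list_rbind(names_to = "treatment")

# Fix the order of the boxes within each treatment
dat2$name <- factor(dat2$name, levels = c("pre", "post"))
```

The result should be usable in the same `ggplot()` call as `dat`.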


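Answer 2

Score: 1

Just to offer another solution: you can create a named list of all your vectors and then use `stack()` to create a data.frame in long format. Afterwards you can use `strsplit()` to create two variables for your groups and time points. The rest is the same as in stefan's answer.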

library(ggplot2)
vector.list <- list(
  QXpre = c(3,4,2,1,4,5,4,2,8),
  QXpost = c(0,4,0,0,0,7,0,1,6),
  lidopre = c(5,3,4,5,6),
  lidopost = c(0,0,0,1,2),
  vehipre = c(3,3,5,3,4,3,4),
  vehipost = c(4,3,3,12,6,4,10)
)
df <- stack(vector.list) # creates a data.frame in long format
df[, c("group", "time")] <- do.call(rbind, strsplit(as.character(df$ind), "(?<=.)(?=pre|post)", perl = TRUE)) # splits the names into two variables
df$time <- factor(df$time, levels = c("pre", "post")) # set the order of pre and post
ggplot(df, aes(group, values, fill = time)) +
  geom_boxplot()

<sup>Created on 2023-02-16 by the reprex package (v2.0.1)</sup>
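Since the question also mentions comparing statistics across groups, a quick numeric check of the long-format data can sit alongside the plot. The base R sketch below is only an illustration, not part of either answer: it rebuilds the stacked data frame, derives `group` and `time` with `sub()` and `grepl()` instead of `strsplit()`, and the names `counts` and `medians` are made up here.

```r
# Data copied from the question, as a named list of unequal-length vectors
vector.list <- list(
  QXpre    = c(3, 4, 2, 1, 4, 5, 4, 2, 8),
  QXpost   = c(0, 4, 0, 0, 0, 7, 0, 1, 6),
  lidopre  = c(5, 3, 4, 5, 6),
  lidopost = c(0, 0, 0, 1, 2),
  vehipre  = c(3, 3, 5, 3, 4, 3, 4),
  vehipost = c(4, 3, 3, 12, 6, 4, 10)
)

# Long format: one row per observation, `ind` holds the original vector name
df <- stack(vector.list)
ind <- as.character(df$ind)

# Split the vector name into treatment group and time point
df$group <- sub("(pre|post)$", "", ind)
df$time  <- factor(ifelse(grepl("post$", ind), "post", "pre"),
                   levels = c("pre", "post"))

# Hypothetical helper objects: trials per cell and median per cell
counts  <- table(df$group, df$time)
medians <- aggregate(values ~ group + time, data = df, FUN = median)
```

Here `counts` shows the differing number of trials per treatment, which is why the wide `data.frame()` call in the question fails while the long format works.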

