R: Reordering geom_errorbarh put into ggplot
Question
<details>
<summary>English:</summary>
I have the following CSV file:
```
group;response
stepOne;107
stepOne;946
stepTwo;184
stepTwo;456
...
```
I read it into a data frame, calculate summary statistics, and show those statistics on a plot as horizontal error bars (`geom_errorbarh`). The code is as follows:
```
# Load required libraries
library(ggplot2)
library(dplyr)
library(viridis)
# Read the input CSV file
data <- read.csv("~/Documents/r/input_file.csv", sep=";", header=TRUE)
x_axis_name <- names(data)[2]
y_axis_name <- names(data)[1]
title <- sprintf("Confidence interval for %s", names(data)[1])
# Calculate mean and confidence intervals for each group
group_stats <- data %>%
  group_by(group) %>%
  summarise(
    mean = mean(response),
    lower_ci_90 = mean - qt(0.90, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    upper_ci_90 = mean + qt(0.90, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    lower_ci_95 = mean - qt(0.95, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    upper_ci_95 = mean + qt(0.95, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    lower_ci_99 = mean - qt(0.99, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    upper_ci_99 = mean + qt(0.99, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    lower_ci_999 = mean - qt(0.999, df = n() - 1) * (sd(response) / sqrt(n()-1)),
    upper_ci_999 = mean + qt(0.999, df = n() - 1) * (sd(response) / sqrt(n()-1))
  ) %>%
  arrange(desc(group))
# Find absolute maximum from group_stats
limit_x_left = 0
limit_x_right <- 1.15*max(unlist(group_stats[, sapply(group_stats, is.numeric)]))
# Generate the graph
graph <- ggplot(group_stats, aes(y = group[order(group, decreasing=TRUE)], x = mean, color = group)) +
  geom_errorbarh(aes(xmin = lower_ci_90, xmax = upper_ci_90), height = 0.0, linewidth = 2.5, alpha = 0.25) +
  geom_errorbarh(aes(xmin = lower_ci_95, xmax = upper_ci_95), height = 0.0, linewidth = 2.5, alpha = 0.25) +
  geom_errorbarh(aes(xmin = lower_ci_99, xmax = upper_ci_99), height = 0.0, linewidth = 2.5, alpha = 0.25) +
  geom_point(size = 3) +
  labs(x = x_axis_name, y = y_axis_name, title = title, color = "Legend") +
  xlim(limit_x_left, limit_x_right) +
  scale_color_manual(values = group_colors)
# Display the graph
print(graph)
```
No matter how I initially arrange the `group_stats` data frame, ggplot orders the bars alphabetically, descending.
Trying to reorder them inside `ggplot()` has no effect: both
`y = group[order(group, decreasing=FALSE)]`
and
`y = group[order(group, decreasing=TRUE)]`
are ignored.
[![enter image description here][1]][1]
However, if I use
`y = group[order(mean, decreasing=TRUE)]`, the values are sorted by mean perfectly.
[![enter image description here][2]][2]
So, is there any way to change the order in which the error bars are placed on the plot? I want to order them alphabetically, ascending.
[1]: https://i.stack.imgur.com/nc69Q.png
[2]: https://i.stack.imgur.com/0fM9B.png
</details>
# Answer 1
**Score**: 1
<details>
<summary>English:</summary>
Per @jared_mamrot, using `y = fct_rev(group)` instead of `y = group[order(group, decreasing=TRUE)]` solves the issue. This requires installing forcats:
```install.packages("forcats")```
(or ```install.packages("tidyverse")```)
and adding
```library(forcats)```
to the code.
Thanks, @jared_mamrot.
</details>
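For context, here is a minimal sketch of why `fct_rev()` helps. A discrete axis in ggplot2 is laid out by factor level order, with the first level at the bottom, so sorting rows with `arrange()` does not move the bars. Indexing `group` inside `aes()` only changes which label is paired with which row, so it can also attach labels to the wrong means. The data frame below is made up for illustration (the column names `lower` and `upper` and all values are assumptions, not the asker's data), and the base R variant is an alternative that may be useful if adding forcats is not an option.

```r
library(ggplot2)
library(forcats)

# Hypothetical summary table standing in for group_stats (values are made up)
group_stats <- data.frame(
  group = c("stepOne", "stepTwo", "stepThree"),
  mean  = c(520, 320, 410),
  lower = c(400, 250, 330),
  upper = c(640, 390, 490)
)

# fct_rev() treats the character column as a factor (levels sorted
# alphabetically) and reverses the level order, so the alphabetically
# first group is drawn at the top of the y axis.
rev_levels <- levels(fct_rev(group_stats$group))

# Base R equivalent: build the factor with explicitly reversed levels
base_levels <- levels(factor(group_stats$group,
                             levels = rev(sort(unique(group_stats$group)))))

# Map the reversed factor to y; rows keep their own means and intervals
p <- ggplot(group_stats, aes(y = fct_rev(group), x = mean, color = group)) +
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0) +
  geom_point(size = 3) +
  labs(y = "group")
```

Calling `print(p)` should draw the groups in ascending alphabetical order from top to bottom. Any other order can be set the same way by passing the desired `levels` to `factor()`.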