ggplot2 split color histograms according to data: facet_grid
Question
Building on this question:
Is there a way to create a grid of histograms where the bins are different colors above vs. below arbitrary values (without overlapping bins), without needing to refer to the environment outside of ggplot()? I can do this with a single histogram, like this (using median for illustration purposes):
set.seed(123)
value = stats::rnorm(100, mean = 0, sd = 1)
df = data.frame(value)
df %>%
  {
    ggplot(data = ., aes(x = value, fill = ifelse(value > median(value), "0", "1"))) +
      geom_histogram(boundary = median(.$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  }
Can this be done for faceted plots, where each plot uses a different value, according to a grouping variable? E.g. this doesn't work:
set.seed(456)
value = stats::rnorm(200, mean = 0, sd = 1)
group = c(rep(1,100), rep(2,100))
df = data.frame(value, group)
df %>%
  dplyr::mutate(value = ifelse(group == 2, value + 1, value)) %>%
  dplyr::group_by(group) %>%
  dplyr::mutate(above_median = value > median(value)) %>%
  {
    ggplot(data = ., aes(x = value, fill = above_median)) +
      facet_grid(rows = group) +
      geom_histogram(boundary = median(.$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  }
Answer 1
Score: 2
One option would be to add your histograms using multiple geom_histogram layers, i.e. split your data by group, then use lapply to add a geom_histogram for each group:
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
df %>%
  dplyr::mutate(value = ifelse(group == 2, value + 1, value)) %>%
  dplyr::group_by(group) %>%
  dplyr::mutate(above_median = value > median(value)) %>%
  {
    ggplot(data = ., aes(x = value, fill = above_median)) +
      facet_grid(rows = vars(group)) +
      lapply(split(., .$group), function(x) {
        geom_histogram(data = x, boundary = median(x$value), alpha = 0.5, position = "identity")
      }) +
      theme(legend.position = "none")
  }
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
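The two `stat_bin()` messages appear because each layer falls back to the default of 30 bins, so the two facets can end up with different bin widths. A minimal sketch of one way to avoid that, assuming a fixed `binwidth` of 0.25 (an arbitrary value picked for illustration, not taken from the answer) and a hypothetical helper `median_hist_layers()` that wraps the `lapply` call:

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

set.seed(456)
df <- data.frame(value = rnorm(200), group = rep(1:2, each = 100))

# Hypothetical helper: one histogram layer per group, each with its bin
# boundary at that group's median. `binwidth = 0.25` is an assumed value.
median_hist_layers <- function(data, binwidth = 0.25) {
  lapply(split(data, data$group), function(x) {
    geom_histogram(
      data = x,
      binwidth = binwidth,
      boundary = median(x$value),
      alpha = 0.5,
      position = "identity"
    )
  })
}

p <- df %>%
  mutate(value = ifelse(group == 2, value + 1, value)) %>%
  group_by(group) %>%
  mutate(above_median = value > median(value)) %>%
  ungroup() %>%
  {
    ggplot(data = ., aes(x = value, fill = above_median)) +
      facet_grid(rows = vars(group)) +
      median_hist_layers(.) +   # `+` accepts a list of layers
      theme(legend.position = "none")
  }

p
```

With an explicit `binwidth`, both facets use bins of the same width while each layer still aligns a bin edge with its own group's median.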
Answer 2
Score: 2
This is how I would tackle the problem, but @stefan's answer is better (+1).
library(tidyverse)
set.seed(456)
value = stats::rnorm(200, mean = 0, sd = 1)
group = c(rep(1,100), rep(2,100))
df = data.frame(value, group)
df %>%
  dplyr::mutate(value = ifelse(group == 2, value + 1, value)) %>%
  dplyr::group_by(group) %>%
  dplyr::mutate(above_median = value > median(value)) %>%
  ungroup() %>%
  group_split(group) %>%
  map(~{
    ggplot(data = .x, aes(x = value, fill = above_median)) +
      facet_grid(rows = .x$group) +
      geom_histogram(boundary = median(.x$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  })
#> [[1]]
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#>
#> [[2]]
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Created on 2023-03-16 with reprex v2.0.2
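Because `map()` returns a list of separate plots, they print one at a time instead of as a single grid. A rough sketch of stacking them into one figure, assuming the patchwork package is installed (it is not used in the answer above) and storing the list in a hypothetical variable `plots`:

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
library(purrr)
library(patchwork)  # assumption: extra package, not part of the original answer

set.seed(456)
df <- data.frame(value = rnorm(200), group = rep(1:2, each = 100))

# Build one plot per group, each with its bin boundary at the group median.
plots <- df %>%
  mutate(value = ifelse(group == 2, value + 1, value)) %>%
  group_by(group) %>%
  mutate(above_median = value > median(value)) %>%
  ungroup() %>%
  group_split(group) %>%
  map(~ {
    ggplot(data = .x, aes(x = value, fill = above_median)) +
      facet_grid(rows = vars(group)) +
      geom_histogram(boundary = median(.x$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  })

# Stack the per-group plots vertically in a single figure.
wrap_plots(plots, ncol = 1)
```

Each plot keeps its own x-axis scale here, so the panels are not guaranteed to share limits the way `facet_grid()` panels do.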