2023-03-15 21:27:14

ggplot2 split color histograms according to data: facet_grid

Question

Building on this question:

Is there a way to create a grid of histograms where the bins are different colors above vs. below arbitrary values (without overlapping bins), without needing to refer to the environment outside of ggplot()? I can do this with a single histogram, like this (using median for illustration purposes):

library(dplyr)    # provides %>%
library(ggplot2)

set.seed(123)
value = stats::rnorm(100, mean = 0, sd = 1)
df = data.frame(value)
df %>%
  {
    # Fill by side of the median; `boundary` puts a bin edge exactly on it
    ggplot(data = ., aes(x = value, fill = ifelse(value > median(value), "0", "1"))) +
      geom_histogram(boundary = median(.$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  }

Can this be done for faceted plots, where each plot uses a different value, according to a grouping variable? E.g. this doesn't work:

set.seed(456)
value = stats::rnorm(200, mean = 0, sd = 1)
group = c(rep(1,100), rep(2,100))

df = data.frame(value, group)
df %>%
  dplyr::mutate(value = ifelse(group == 2, value + 1, value)) %>%
  dplyr::group_by(group) %>%
  dplyr::mutate(above_median = value > median(value)) %>%
  {
    ggplot(data = ., aes(x = value, fill = above_median)) +
      facet_grid(rows = group) +
      geom_histogram(boundary = median(.$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  }
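
One plausible reading of why the attempt above falls short: `boundary` is a single number for the whole layer, and `median(.$value)` is the median of the pooled data, so it need not coincide with either group's own median. The sketch below only checks that numerically. The names `per_group` and `pooled` are illustrative and do not appear in the original post.

```r
library(dplyr, warn.conflicts = FALSE)

set.seed(456)
value <- stats::rnorm(200, mean = 0, sd = 1)
group <- c(rep(1, 100), rep(2, 100))

# Same data preparation as in the question: shift group 2 by one unit.
df <- data.frame(value, group) %>%
  mutate(value = ifelse(group == 2, value + 1, value))

# Median within each group: the cut points used for the fill color.
per_group <- df %>%
  group_by(group) %>%
  summarise(group_median = median(value))

# Median of the pooled data: the single value passed as `boundary`.
pooled <- median(df$value)

print(per_group)
print(pooled)
```

If the pooled median differs from a group's median, the bin containing that group's median can hold values of both colors, which is the overlap the question wants to avoid.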

Answer 1

Score: 2

One option would be to add your histograms using multiple geom_histogram layers, i.e. split your data by group, then use lapply to add one geom_histogram per group, each with its own boundary:

library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
df %>%
  dplyr::mutate(value = ifelse(group == 2, value + 1, value)) %>%
  dplyr::group_by(group) %>%
  dplyr::mutate(above_median = value > median(value)) %>%
  {
    ggplot(data = ., aes(x = value, fill = above_median)) +
      facet_grid(rows = vars(group)) +
      # One layer per group, so each layer gets that group's median as boundary
      lapply(split(., .$group), function(x) {
        geom_histogram(data = x, boundary = median(x$value), alpha = 0.5, position = "identity")
      }) +
      theme(legend.position = "none")
  }
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
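
The same layer-per-group idea should extend to cutoffs other than the median, which is what the question means by "arbitrary values". The following is a minimal sketch under assumptions, not part of the original answer: the `cutoff` column, its values (-0.5 and 1.5), and `binwidth = 0.25` are made up for illustration. Storing the cutoff in the data means each layer can read its boundary from its own subset.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

set.seed(456)
value <- stats::rnorm(200, mean = 0, sd = 1)
group <- c(rep(1, 100), rep(2, 100))

plot_data <- data.frame(value, group) %>%
  mutate(
    value = ifelse(group == 2, value + 1, value),
    # Hypothetical per-group cutoffs; any numeric values would do.
    cutoff = ifelse(group == 1, -0.5, 1.5),
    above_cutoff = value > cutoff
  )

p <- ggplot(plot_data, aes(x = value, fill = above_cutoff)) +
  facet_grid(rows = vars(group)) +
  # One histogram layer per group; each reads its boundary from its own rows.
  lapply(split(plot_data, plot_data$group), function(x) {
    geom_histogram(
      data = x,
      boundary = x$cutoff[1],
      binwidth = 0.25,   # explicit width, chosen arbitrarily for this sketch
      alpha = 0.5,
      position = "identity"
    )
  }) +
  theme(legend.position = "none")

print(p)
```

Setting `binwidth` explicitly also avoids the `stat_bin()` message about the default of 30 bins.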

Answer 2

Score: 2

This is how I would tackle the problem, but @stefan's answer is better (+1):

library(tidyverse)
set.seed(456)
value = stats::rnorm(200, mean = 0, sd = 1)
group = c(rep(1,100), rep(2,100))
df = data.frame(value, group)
df %>%
  dplyr::mutate(value = ifelse(group == 2, value + 1, value)) %>%
  dplyr::group_by(group) %>%
  dplyr::mutate(above_median = value > median(value)) %>%
  ungroup() %>%
  # One data frame per group, then one separate plot per data frame
  group_split(group) %>%
  map(~{
    ggplot(data = .x, aes(x = value, fill = above_median)) +
      # vars() is the documented way to name a faceting variable
      facet_grid(rows = vars(group)) +
      geom_histogram(boundary = median(.x$value), alpha = 0.5, position = "identity") +
      theme(legend.position = "none")
  })
#> [[1]]
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#>
#> [[2]]
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2023-03-16 with reprex v2.0.2
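
Because this approach returns a list of separate plots rather than one faceted figure, the plots may need to be stacked afterwards. The sketch below is one possible way to do that and is not from the original answer: it assumes the patchwork package is installed, and `x_range`, `plots`, `combined`, and `binwidth = 0.25` are illustrative choices. A shared `xlim` is set so the two panels line up, since separate plots do not share scales the way facets do.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
library(purrr)
library(patchwork)  # assumption: not used anywhere in the original post

set.seed(456)
value <- stats::rnorm(200, mean = 0, sd = 1)
group <- c(rep(1, 100), rep(2, 100))

df <- data.frame(value, group) %>%
  mutate(value = ifelse(group == 2, value + 1, value))

# Common x-axis limits so the stacked panels are comparable.
x_range <- range(df$value)

plots <- df %>%
  group_by(group) %>%
  mutate(above_median = value > median(value)) %>%
  ungroup() %>%
  group_split(group) %>%
  map(~ ggplot(.x, aes(x = value, fill = above_median)) +
        geom_histogram(boundary = median(.x$value), binwidth = 0.25,
                       alpha = 0.5, position = "identity") +
        coord_cartesian(xlim = x_range) +
        ggtitle(paste("group", .x$group[1])) +
        theme(legend.position = "none"))

# Stack the per-group plots in a single column.
combined <- wrap_plots(plots, ncol = 1)
print(combined)
```

Unlike the layer-per-group answer, this computes `x_range` outside the plotting call, so it does not meet the question's preference for staying inside `ggplot()`.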
