April 17, 2023, 21:43:11

White gaps/blank space with stacked area graph (ggplot)

Question

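I followed the guidance here to generate a stacked area graph for four groups within a categorical variable. The code runs without errors, but the graph contains white gaps, as shown below. I have checked the affected periods, such as October and November, and the data are complete for both months, so the graph should not display white gaps. I am also open to other graph recommendations for the same data.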

# Load packages:

library(tidyverse)
library(stringr)
library(ggplot2)
library(zoo)
library(ggthemes)
library(writexl)
library(viridis)
library(hrbrthemes)
library(textclean)
library(lubridate)
library(scales)  # provides percent_format(), used in the plotting code below

Here is a data example:

dput(stackedgraph[1:20,c(1,2,4)])

Output:

structure(list(month_year = structure(c(2011.91666666667, 2011.91666666667, 
2011.91666666667, 2011.91666666667, 2012, 2012, 2012, 2012, 2012.08333333333, 
2012.08333333333, 2012.08333333333, 2012.08333333333, 2012.16666666667, 
2012.16666666667, 2012.16666666667, 2012.25, 2012.25, 2012.25, 
2012.25, 2012.33333333333), class = "yearmon"), directed_to_whom = c("MoE", 
"MoL", "Non-critical", "Private employers", "MoE", "MoL", "Non-critical", 
"Private employers", "MoE", "MoL", "Non-critical", "Private employers", 
"MoE", "Non-critical", "Private employers", "MoE", "MoL", "Non-critical", 
"Private employers", "MoE"), directed_to_whom_percentage = c(0.0923076923076923, 
0.107692307692308, 0.430769230769231, 0.369230769230769, 0.0666666666666667, 
0.0833333333333333, 0.45, 0.4, 0.0606060606060606, 0.121212121212121, 
0.287878787878788, 0.53030303030303, 0.184210526315789, 0.342105263157895, 
0.473684210526316, 0.131578947368421, 0.105263157894737, 0.210526315789474, 
0.552631578947368, 0.108108108108108)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L), groups = structure(list(
    month_year = structure(c(2011.91666666667, 2012, 2012.08333333333, 
    2012.16666666667, 2012.25, 2012.33333333333), class = "yearmon"), 
    .rows = structure(list(1:4, 5:8, 9:12, 13:15, 16:19, 20L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list")))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), .drop = TRUE))

Here is the code to create the graph:

ggplot(stackedgraph, aes(x = as.Date(month_year), y = directed_to_whom_percentage)) +
  geom_area(aes(fill = directed_to_whom, group = directed_to_whom), position = 'stack') +
  scale_fill_manual(values = c("MoL" = "light green",
                               "MoE" = "red",
                               "Private employers" = "light blue",
                               "Non-critical" = "black")) +
  scale_x_date(date_breaks = "months", date_labels = "%b-%y") +
  theme_economist_white() +
  theme(plot.title = element_text(size = 5, face = "bold"),
        axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  theme(axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_y_continuous(labels = percent_format(accuracy = 1))

Output:

I really appreciate the advice from Andy. I ran the code with a small change:

stackedgraph <- stackedgraph %>%
  ungroup() %>% # I used ungroup() to avoid an error I was receiving
  complete(month_year, directed_to_whom, fill = list(directed_to_whom_percentage = 0))

Then the graph:

ggplot(stackedgraph, aes(x = as.Date(month_year), y = directed_to_whom_percentage)) +
  geom_area(aes(fill = directed_to_whom, group = directed_to_whom),
            position = 'stack') +
  scale_fill_manual(
    values = c(
      "MoL" = "light green",
      "MoE" = "red",
      "Private employers" = "light blue",
      "Non-critical" = "black"
    )
  ) +
  scale_x_date(date_breaks = "months" , date_labels = "%b-%y") +
  theme(
    plot.title = element_text(size = 5, face = "bold"),
    axis.text.x = element_text(angle = 90, vjust = 0.5)
  ) +
  theme(axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_y_continuous(labels = percent_format(accuracy = 1))

Output:


Answer 1

Score: 1

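Ideally, you need a row with a value of 0 for every category in each month where that category has no data. The area chart interpolates across the missing months, which pushes the stacked total above 100% in some cases. A quick fix is a pivot_wider/pivot_longer combination that fills in the missing 0s: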

library(tidyverse)
library(zoo)
library(scales)  # provides percent_format()
# Create sample data to test
stackedgraph <- tibble::tribble(
                  ~month_year,   ~directed_to_whom, ~directed_to_whom_percentage,
                   "Dec 2011",               "MoE",           0.0923076923076923,
                   "Dec 2011",               "MoL",            0.107692307692308,
                   "Dec 2011",      "Non-critical",            0.430769230769231,
                   "Dec 2011", "Private employers",            0.369230769230769,
                   "Jan 2012",               "MoE",           0.0666666666666667,
                   "Jan 2012",               "MoL",           0.0833333333333333,
                   "Jan 2012",      "Non-critical",                         0.45,
                   "Jan 2012", "Private employers",                          0.4,
                   "Feb 2012",               "MoE",           0.0606060606060606,
                   "Feb 2012",               "MoL",            0.121212121212121,
                   "Feb 2012",      "Non-critical",            0.287878787878788,
                   "Feb 2012", "Private employers",             0.53030303030303,
                   "Mar 2012",               "MoE",            0.184210526315789,
                   "Mar 2012",      "Non-critical",            0.342105263157895,
                   "Mar 2012", "Private employers",            0.473684210526316,
                   "Apr 2012",               "MoE",            0.131578947368421,
                   "Apr 2012",               "MoL",            0.105263157894737,
                   "Apr 2012",      "Non-critical",            0.210526315789474,
                   "Apr 2012", "Private employers",            0.552631578947368
                  ) |>
  mutate(month_year = as.yearmon(month_year))
# Run this code on your data
stackedgraph |>
  pivot_wider(names_from = directed_to_whom,
              values_from = directed_to_whom_percentage,
              values_fill = 0) |>
  pivot_longer(-month_year, 
               names_to = "directed_to_whom",
               values_to = "directed_to_whom_percentage") |>
  ggplot(aes(x = as.Date(month_year), y = directed_to_whom_percentage)) +
  geom_area(aes(fill = directed_to_whom, group = directed_to_whom),
            position = 'stack') +
  scale_fill_manual(
    values = c(
      "MoL" = "light green",
      "MoE" = "red",
      "Private employers" = "light blue",
      "Non-critical" = "black"
    )
  ) +
  scale_x_date(date_breaks = "months" , date_labels = "%b-%y") +
  theme(
    plot.title = element_text(size = 5, face = "bold"),
    axis.text.x = element_text(angle = 90, vjust = 0.5)
  ) +
  theme(axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_y_continuous(labels = percent_format(accuracy = 1))
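
Before filling anything in, it can help to confirm which month/category pairs are actually absent. The following is a minimal sketch on hypothetical toy data; the names `toy`, `month`, `category`, and `share` are illustrative and do not come from the question. It lists every pair that could exist with tidyr's `expand()` and keeps those with no matching row using dplyr's `anti_join()`:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: category "B" has no row for month 2
toy <- tibble(
  month    = c(1, 1, 2, 3, 3),
  category = c("A", "B", "A", "A", "B"),
  share    = c(0.4, 0.6, 1.0, 0.5, 0.5)
)

# Every month/category pair built from the values present in the data
all_pairs <- expand(toy, month, category)

# Pairs that have no matching row in the data
missing_pairs <- anti_join(all_pairs, toy, by = c("month", "category"))
print(missing_pairs)
```

On the real data, the same idea would use `month_year` and `directed_to_whom` in place of the toy column names; any row returned marks a month where a category would otherwise be drawn without its own data point.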

Edit - a slightly tidier solution

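The tidyr::complete function can do these steps for you in a more natural and explicit way. It creates every month_year*directed_to_whom combination in the data frame and fills missing values with 0s: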

stackedgraph <- stackedgraph %>%
  complete(month_year,
           directed_to_whom,
           fill = list(directed_to_whom_percentage = 0))

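This does the same as the two pivots, but it is the nicer, proper way of doing it. It gives the tidied, expanded data frame, which you should be able to pass to your ggplot code to get the right results.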
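
As a rough check that the completion did what was intended, the sketch below runs `complete()` on hypothetical toy data (the names `toy`, `month`, `category`, and `share` are illustrative, not from the question). The toy data is grouped first to mimic the `grouped_df` shown in the question's `dput` output, and `ungroup()` is called before `complete()`, as the asker did:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, grouped by month like the question's grouped_df
toy <- tibble(
  month    = c(1, 1, 2, 3, 3),
  category = c("A", "B", "A", "A", "B"),
  share    = c(0.4, 0.6, 1.0, 0.5, 0.5)
) %>%
  group_by(month)

# Drop the grouping, then add the missing month/category rows with share = 0
completed <- toy %>%
  ungroup() %>%
  complete(month, category, fill = list(share = 0))

# Each month should now have one row per category
rows_per_month <- count(completed, month)
print(rows_per_month)

# Adding zero rows should not change the per-month totals
totals <- completed %>%
  group_by(month) %>%
  summarise(total = sum(share))
print(totals)
```

Note that `complete()` only combines values that already appear in the data, so a month that is missing for every category would not be added by this call.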

