How to map two continuous variables to the height and width of boxes in ggplot?

Question

I want to create a plot from two continuous variables (v1 and v2) and one categorical variable (e.g. with levels A, B, C, D). The plot should show a matrix of proportions. The categorical variable should be on the x-axis, and each column should contain two boxes (one for v1, one for v2) whose heights are the proportions of each continuous variable within that category (within A: v1/(v1+v2), then v2/(v1+v2)). The width of each column should represent the proportion of the grand total that falls in that category (for A: v1+v2 divided by the sum of all v1 and v2).

It should look like a heatmap but with the variable type (v1 or v2) mapped to color and the height and width of the boxes mapped as described above.

Using a stacked bar graph approach worked well and is close to what I want but there is horizontal space between the bars. Since I'm already using the width aesthetic to map the proportion within each category I wasn't able to eliminate this space.

Alternatively, I tried geom_tile, but it suffered from the same spacing issue and did not give every column a total height of 1.

The closest solution I have found is: https://stackoverflow.com/questions/66996598/ggplot2-heatmap-with-tile-height-and-width-as-aes

However, that example has a categorical variable on both the x and y axes, which is a little different from my case.

Reproducible example for reference:

    library(tidyverse)

    cat <- c("A","B","C","D")
    v1 <- c(1,3,6,2)
    v2 <- c(3,3,10,1)
    df <- data.frame(cat,v1,v2)

    df <- df %>%
      group_by(cat) %>%
      mutate(sum.cat = sum(v1,v2)) %>%           # total per category
      mutate(prop.v1 = v1/sum.cat) %>%           # share of v1 within the category
      ungroup() %>%
      mutate(prop.cat = sum.cat/sum(v1,v2)) %>%  # share of the category in the grand total
      mutate(sum.tot = sum(sum.cat)) %>%
      mutate(prop.v2 = 1-prop.v1) %>%
      # prop.v1 and prop.v2 are columns 5 and 8; selecting by name is less fragile
      pivot_longer(cols = c(prop.v1, prop.v2), names_to = "prop.v.type", values_to = "prop.v")

    # Attempt 1: stacked bars (leaves horizontal gaps between the bars)
    ggplot(df,aes(cat,prop.v, fill = prop.v.type))+
      geom_bar(position = "stack", stat = "identity",aes(width=prop.cat))

    # Attempt 2: tiles (same gaps, and the columns do not all reach a height of 1)
    ggplot(df,aes(x=cat, y=prop.v, fill = prop.v.type))+
      geom_tile(aes(width=prop.cat,height=prop.v))
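
For reference, the target geometry can be written out directly from the definitions above. The following is only a sketch in base R; the names `expected`, `width`, `height_v1`, and `height_v2` are not part of the original example and are introduced here purely for illustration.

``` r
# Hypothetical sanity check (illustrative names, not from the original example)
cat_levels <- c("A", "B", "C", "D")
v1 <- c(1, 3, 6, 2)
v2 <- c(3, 3, 10, 1)

sum_cat   <- v1 + v2                  # total per category
width     <- sum_cat / sum(sum_cat)   # column widths: share of the grand total
height_v1 <- v1 / sum_cat             # height of the v1 box within each column
height_v2 <- v2 / sum_cat             # height of the v2 box within each column

expected <- data.frame(cat = cat_levels, width, height_v1, height_v2)
print(expected)
```

The two heights in every column sum to 1 and the widths sum to 1, so a correct plot should tile the unit square with no gaps.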

Thanks in advance!

Answer 1

Score: 1

This can be done with a small hack on the x-axis values. I compute a numeric x position for each category from prop.cat (the cumulative sum of the widths minus half of the bar's own width, so that each bar is centered in its slot), and then assign the cat labels to the breaks at those positions. This makes the x-axis continuous, so the `width` aesthetic is now measured in the same units as the axis.

``` r
library(tidyverse)

cat <- c("A","B","C","D")
v1 <- c(1,3,6,2)
v2 <- c(3,3,10,1)
df <- data.frame(cat,v1,v2)

df <- df %>%
  group_by(cat) %>%
  mutate(sum.cat = sum(v1,v2)) %>%
  mutate(prop.v1 = v1/sum.cat) %>%
  ungroup() %>%
  mutate(prop.cat = sum.cat/sum(v1,v2)) %>%
  mutate(sum.tot = sum(sum.cat)) %>%
  mutate(prop.v2 = 1-prop.v1) %>%
  # prop.v1 and prop.v2 are columns 5 and 8
  pivot_longer(cols = c(prop.v1, prop.v2), names_to = "prop.v.type", values_to = "prop.v")

# Here I calculate the x-axis position for each cat
df_revised <- df |>
  group_by(cat) |>
  # keep prop.cat only on the first row of each cat so it is counted once
  mutate(prop.cat_cumsum = if_else(row_number() == 1, prop.cat, 0)) |>
  ungroup() |>
  # running total = right edge of each cat's bar
  mutate(prop.cat_cumsum = cumsum(prop.cat_cumsum)) |>
  # bar center = right edge minus half the bar width
  mutate(x_axis_value = prop.cat_cumsum - prop.cat / 2)

# The cats and the values are aligned in order, so I just extract them
x_axis_breaks <- unique(df_revised$x_axis_value)
x_axis_labels <- unique(df_revised$cat)

# Now I plot them to test whether they fit well
ggplot(df_revised,
       aes(x = x_axis_value, y = prop.v, fill = prop.v.type))+
  geom_bar(position = "stack", stat = "identity",
           aes(width = prop.cat)) +
  scale_x_continuous(breaks = x_axis_breaks, expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
#> Warning in geom_bar(position = "stack", stat = "identity", aes(width =
#> prop.cat)): Ignoring unknown aesthetics: width
```

OK, that works as expected. Now I just need to assign the proper cat labels to the x-axis and add a black border to the bars so that they are easier to tell apart.

``` r
ggplot(df_revised,
       aes(x = x_axis_value, y = prop.v, fill = prop.v.type))+
  geom_bar(position = "stack", stat = "identity",
           color = "black", aes(width = prop.cat)) +
  scale_x_continuous(breaks = x_axis_breaks, labels = x_axis_labels,
                     expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
#> Warning in geom_bar(position = "stack", stat = "identity", color = "black", :
#> Ignoring unknown aesthetics: width
```

<sup>Created on 2023-05-18 with reprex v2.0.2</sup>
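
A possible alternative that sidesteps the `width` warning is to compute the rectangle corners explicitly and draw them with `geom_rect()`. This is only a sketch on the same example data, not the approach used above; the names `raw`, `rects`, `label_pos`, and `type` are assumptions made for illustration.

``` r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same example data as in the question (object name `raw` is illustrative)
raw <- data.frame(
  cat = c("A", "B", "C", "D"),
  v1  = c(1, 3, 6, 2),
  v2  = c(3, 3, 10, 1)
)

rects <- raw %>%
  mutate(
    sum.cat  = v1 + v2,
    prop.cat = sum.cat / sum(sum.cat),  # column width
    xmax     = cumsum(prop.cat),        # right edge of the column
    xmin     = xmax - prop.cat          # left edge of the column
  ) %>%
  pivot_longer(c(v1, v2), names_to = "type", values_to = "value") %>%
  group_by(cat) %>%
  mutate(
    ymax = cumsum(value) / sum(value),  # top of each box within its column
    ymin = ymax - value / sum(value)    # bottom of each box
  ) %>%
  ungroup()

# One axis label per category, placed at the middle of its column
label_pos <- rects %>%
  distinct(cat, xmin, xmax) %>%
  mutate(mid = (xmin + xmax) / 2)

p <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = type),
            color = "black") +
  scale_x_continuous(breaks = label_pos$mid, labels = label_pos$cat,
                     expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))
print(p)
```

Because every edge is given in data units, the rectangles share borders by construction, so there is no gap to remove. In this sketch the v1 box sits at the bottom of each column.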

huangapple
  • Posted on 2023-05-18 08:24:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76276989.html