Annotating only unique duplicated key values on a diverging bar chart in ggplot2

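Question

I have a data frame (df) with 2 columns and 40 rows. The first column contains duplicated key/ID values, and the second contains 20 positive values followed by 20 negative ones.

Because of this, I decided to use a diverging bar chart. But whenever I plotted the chart, the x-axis text was displayed twice: one set (e.g. the first 20 positive values) overlapped the other (e.g. the last 20 negative values). My solution was to use `scale_x_discrete()`, partly because the chart looked better that way.

But I still needed to show the x-axis text. I thought about displaying it at the base of one set of bars (the positive ones), like this:

![enter image description here](https://i.stack.imgur.com/GAszI.png)
(But with the annotated text spaced further apart, so that each label sits at the center of its bar.)

But when I try this as shown in the sample code below, the key values (col1) still overlap! Or maybe they just look bold... Either way, I can't get this right.

What can I do?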

Data

```R
library(ggplot2)

# Sample df:
df <- structure(list(col1 = c("A", "B", "C", "D", "E", "F", "G", "H",
"I", "J", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "A",
"B", "C", "D", "E", "F", "G", "H", "I", "J", "A", "B", "C", "D",
"E", "F", "G", "H", "I", "J"), col2 = c(18.5817806317937, 28.1916172143538,
8.66620996058919, 12.0227236610372, 24.4170182822272, 29.3641960325185,
28.7800777778029, 23.1192238365766, 15.7798075131141, 2.86982706259005,
19.6636101899203, 27.5613576434553, 3.76174484286457, 9.56581128691323,
23.3280192685779, 8.42091225110926, 16.01897605462, 20.6576479838695,
5.26960676000454, 21.3152553031687, -1, -14.7368421052632, -10.1578947368421,
-2.52631578947368, -13.2105263157895, -25.4210526315789, -5.57894736842105,
-4.05263157894737, -26.9473684210526, -28.4736842105263, -22.3684210526316,
-7.10526315789474, -19.3157894736842, -23.8947368421053, -17.7894736842105,
-30, -11.6842105263158, -8.63157894736842, -20.8421052631579,
-16.2631578947368)), class = "data.frame", row.names = c(NA,
-40L))

# Sample plot (the attempt whose labels still overlap):
ggplot(df, aes(x = reorder(col1, col2), y = col2)) +
  geom_bar(stat = "identity", show.legend = FALSE) +
  geom_text(aes(x = 5, y = 0.07, label = paste(col1, collapse = " "), family = "Futura"), color = "black", size = 5) +
  xlab("Group") +
  ylab("Value") +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())
```
# Answer 1

**Score**: 1

I find it easier to work with different layers in ggplot2 if the variable order is prepared before the data reaches ggplot. Here I make `col1` a factor whose levels are ordered by `col2` (by default, `fct_reorder` uses the median value).

```R
library(ggplot2); library(dplyr)

df |>
  mutate(col1 = forcats::fct_reorder(col1, col2)) |>
  ggplot(aes(x = col1, y = col2)) +
  geom_bar(stat = "identity", show.legend = FALSE) +
  geom_text(aes(y = 0.07, label = col1), size = 5,
            data = distinct(df, col1)) +  # only need one obs per col1
  xlab("Group") +
  ylab("Value") +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())
```

![enter image description here](https://i.stack.imgur.com/MmJXv.png)

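To see why the attempt in the question looks "bold", it may help to inspect the label data without plotting anything. The sketch below is base R only and rebuilds just the key column from the question (`rep(LETTERS[1:10], times = 4)` gives the same A to J pattern repeated four times). The names `label_attempt` and `label_rows` are made up for illustration. As far as I can tell, `paste(col1, collapse = " ")` inside `aes()` yields a single string that ggplot2 recycles across all 40 rows, so the same text is drawn 40 times at one position.

```r
# Same key pattern as col1 in the question: A..J repeated four times (40 values).
col1 <- rep(LETTERS[1:10], times = 4)

# What the label in the question's geom_text() evaluates to:
# collapse = " " joins all 40 keys into ONE string.
label_attempt <- paste(col1, collapse = " ")
length(label_attempt)  # 1
nchar(label_attempt)   # 79 (40 letters + 39 spaces)

# What the text layer needs instead: one row per key.
# unique() on a data frame is a base R stand-in for dplyr::distinct(df, col1).
label_rows <- unique(data.frame(col1 = col1))
nrow(label_rows)       # 10
```

With one row per key, each label is drawn once, and mapping `x = col1` places it at the center of its own bar rather than at a fixed `x = 5`.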

huangapple
  • Published on June 1, 2023, 05:11:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76377329.html