Highlight particular hex bins with `geom_hex` and alter the linewidths of highlighted bins

Question

I previously asked a question here about how to highlight particular bins in a hexbin plot.
Now I want to adjust the linewidth of the bins being highlighted. However, I cannot get the two hexbin geoms to align. Previous answers suggested that setting `group = 1` in the `aes` would fix the problem, but this didn't work for me.

Here is some code showing the hexbin plot and the misaligned overlaid hexbins. I want the two geoms to align so that the highlighted hexbins from `stat_summary_hex` overlay those from `geom_hex`.

library(ggplot2)

n = 1000

df = data.frame(x = rnorm(n), 
                y = rnorm(n),
                group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE))
# Only group 1 rows get a value; `sums` stays NA for group 0
df$sums[df$group == 1] = runif(sum(df$group == 1), min = 0.5, max = 2)

# Base hexbin layer plus a gold outline layer whose linewidth encodes the summed `sums`
pp = ggplot(df, aes(x = x, y = y, group = group)) +
  geom_hex() +
  stat_summary_hex(aes(
    z = sums,
    linewidth = after_stat(value),
    group = 1
  ), fun = ~ sum(.x), col = "gold", fill = NA) +
  scale_linewidth(range = c(0.1, 1))
pp

Answer 1

Score: 2

The main issue is that your `sums` column is `NA` for every row in the `0` group. `stat_summary_hex` therefore drops these observations (you should get a warning about that), so it bins a different set of points than `geom_hex` and you end up with a different binning. To fix that, I recode the `sums` variable for the `0` group as `-999`. In `stat_summary_hex` I then compute the sum only over the positive values, i.e. the group 1 values.

Additionally, it was not clear to me why you map `group` to the `group` aesthetic. Doing so also results in overlapping bins, because each group gets treated separately. Hence I dropped it, and of course also the `group = 1`, which does not make sense here.

Note: As bins that contain only group 0 points now get a value of 0, I have set the minimum of the linewidth range to `0`.

library(ggplot2)

set.seed(123)

# `df` is the data frame from the question.
# Recode the missing `sums` of group 0 to a sentinel so these rows are not dropped
df$sums[is.na(df$sums)] <- -999

ggplot(df, aes(x = x, y = y)) +
  geom_hex() +
  stat_summary_hex(aes(
    z = sums,
    linewidth = after_stat(value)
  ), fun = ~ sum(.x[.x > 0]), col = "gold", fill = NA) +
  scale_linewidth(range = c(0, 1))

[![enter image description here][1]][1]

  [1]: https://i.stack.imgur.com/Bjp7V.jpg
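As a possible variation on the recoding above (a sketch, not part of the original answer): because adding zeros does not change a sum, the `0` group could be filled with `0` instead of a sentinel, which lets `fun = sum` be used directly. The snippet assumes the `hexbin` package is installed and that `sums` is only ever positive for group 1, as in the question. The names `p`, `base_bins`, `highlight_bins` and `same_centres` are illustrative. It also uses `layer_data()` to compare the hexagon centres that the two layers compute, which is one way to check that they share a grid.

```r
library(ggplot2)  # the hex stats also require the hexbin package

set.seed(123)
n <- 1000
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  group = sample(0:1, n, prob = c(0.9, 0.1), replace = TRUE)
)

# Assumption: 0 is a safe filler for group 0, since zeros leave a bin's sum unchanged.
df$sums <- 0
df$sums[df$group == 1] <- runif(sum(df$group == 1), min = 0.5, max = 2)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_hex() +
  stat_summary_hex(
    aes(z = sums, linewidth = after_stat(value)),
    fun = sum, colour = "gold", fill = NA
  ) +
  scale_linewidth(range = c(0, 1))

# Hexagon centres computed by each layer; no rows are dropped, so both
# layers bin the same points and the centres should coincide.
base_bins <- layer_data(p, 1)[, c("x", "y")]
highlight_bins <- layer_data(p, 2)[, c("x", "y")]
same_centres <- isTRUE(all.equal(base_bins, highlight_bins, check.attributes = FALSE))
same_centres
```

If `same_centres` comes back `FALSE` after changing the data or mappings, that suggests the layers are again binning different sets of points, for example because some rows still have a missing `z`.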



huangapple
  • Posted on July 20, 2023 at 09:14:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76726087.html