Computed error bars do not plot for factor data in ggplot2 R

Question


I am trying to plot error bars for factor data as shown in this reproducible example. That example runs fine. However, when I try the same thing with my original dataset, the error bars refuse to plot, and I am a little lost. The following are the relevant pieces of code, made as reproducible as possible. Any idea why I cannot get the error bars to work?

Dataframe df has two columns: yaxis.og and species (4 species) like this:

  yaxis.og species
     <dbl> <fct>
1   2.56   A
2   3.81   B
3   1.48   C
4   1.65   D
5  -0.177  A
   .....

I computed a simple linear model and stored the results as such:

summary_df <- summary(mod)$coefficients %>%
  as_tibble() %>%
  dplyr::mutate(species = c("A","B","C","D")) %>%
  dplyr::rename(yaxis = Estimate) %>%
  mutate(ymin = yaxis - 1.96 * `Std. Error`,
         ymax = yaxis + 1.96 * `Std. Error`)


  yaxis `Std. Error`    df `t value` `Pr(>|t|)` species  ymin  ymax
  <dbl>        <dbl> <dbl>     <dbl>      <dbl> <chr>   <dbl> <dbl>
1 0.611        0.145  61.2      4.22   8.21e- 5 A       0.327 0.895
2 2.16         0.160  78.0     13.5    3.93e-22 B       1.85  2.47
3 1.21         0.153  68.7      7.92   2.84e-11 C       0.912 1.51
4 2.21         0.223 162.       9.89   2.43e-18 D       1.77  2.64

When I plot it without error bars, it plots fine:

p <- ggplot(aes(x = species, y = yaxis.og, color = species, group = species), data = df) +
  geom_jitter(width = 0.1) +
  geom_point(aes(x = species, y = yaxis), color = 'black', size = 2, data = summary_df)


When I add error bars, it throws a long error. The error doesn't make much sense to me either, because the last layer uses a different dataset, so why would it need yaxis.og?

p + geom_errorbar(aes(x = species, ymin = ymin, ymax = ymax), data = summary_df)

Error in `geom_errorbar()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 3rd layer.
Caused by error in `FUN()`:
! object 'yaxis.og' not found
Backtrace:
  1. base (local) `<fn>`(x)
  2. ggplot2:::print.ggplot(x)
  4. ggplot2:::ggplot_build.ggplot(x)
  5. ggplot2:::by_layer(...)
 12. ggplot2 (local) f(l = layers[[i]], d = data[[i]])
 13. l$compute_aesthetics(d, plot)
 14. ggplot2 (local) compute_aesthetics(..., self = self)
 15. base::lapply(aesthetics, eval_tidy, data = data, env = env)
 16. rlang (local) FUN(X[[i]], ...)

Answer 1

Score: 1


The issue is that all aesthetics defined in ggplot() are global and are inherited by every geom layer, even by layers that do not use them. As a result, ggplot2 expects a column named yaxis.og in the data of each layer, unless the layer overrides the global aesthetic, as you do with y = yaxis in geom_point.
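To see the inheritance at work, here is a minimal sketch with made-up toy data. The data frames raw and est are hypothetical and not taken from the question; only the column names mirror it. The plot is built inside tryCatch() so the failure can be inspected instead of aborting the script:

```r
library(ggplot2)

# Hypothetical toy data: raw observations and per-species interval bounds.
raw <- data.frame(species = c("A", "A", "B", "B"), yaxis.og = c(1, 2, 3, 4))
est <- data.frame(species = c("A", "B"), ymin = c(1, 3), ymax = c(2, 4))

# y = yaxis.og is declared globally in ggplot().
base_plot <- ggplot(raw, aes(x = species, y = yaxis.og)) +
  geom_point()

# This layer inherits y = yaxis.og, but `est` has no column of that name.
bad_plot <- base_plot +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), data = est)

# Aesthetics are evaluated when the plot is built (e.g. on printing),
# so the error is captured here rather than when the layer is added.
result <- tryCatch(ggplot_build(bad_plot), error = function(e) e)
inherits(result, "error")
```

Because the aesthetics are only evaluated at build time, adding the layer with + succeeds and the failure shows up later, when the plot is printed.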

One way to fix this is to add inherit.aes = FALSE to geom_errorbar, which prevents the global aesthetics from being inherited. Another is to override the global aesthetic by using aes(..., y = NULL).

Using some fake example data based on iris:

library(ggplot2)

p +
  geom_errorbar(
    aes(x = species, ymin = ymin, ymax = ymax, y = NULL),
    data = summary_df
  )


p +
  geom_errorbar(
    aes(x = species, ymin = ymin, ymax = ymax),
    data = summary_df, inherit.aes = FALSE
  )


DATA

set.seed(123)

df <- data.frame(
  yaxis.og = c(iris$Sepal.Length, sample(iris$Sepal.Length, 50)),
  species = rep(c("A", "B", "C", "D"), each = 50)
)

mod <- lm(yaxis.og ~ species - 1, data = df)
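For convenience, the pieces above can be assembled into one script. This is only a sketch: summary_df is rebuilt here with base R instead of the tibble pipeline from the question, which is my assumption for keeping the example free of extra packages, and the 1.96 multiplier is the same normal approximation the question uses.

```r
library(ggplot2)

set.seed(123)

# Fake example data based on iris, as in the DATA section.
df <- data.frame(
  yaxis.og = c(iris$Sepal.Length, sample(iris$Sepal.Length, 50)),
  species = rep(c("A", "B", "C", "D"), each = 50)
)
mod <- lm(yaxis.og ~ species - 1, data = df)

# Coefficient table as a data frame; column names such as "Std. Error" are kept.
coefs <- as.data.frame(summary(mod)$coefficients)
summary_df <- data.frame(
  species = c("A", "B", "C", "D"),
  yaxis = coefs[["Estimate"]],
  ymin = coefs[["Estimate"]] - 1.96 * coefs[["Std. Error"]],
  ymax = coefs[["Estimate"]] + 1.96 * coefs[["Std. Error"]]
)

# Base plot from the question: jittered raw data plus the estimates.
p <- ggplot(df, aes(x = species, y = yaxis.og, color = species, group = species)) +
  geom_jitter(width = 0.1) +
  geom_point(aes(x = species, y = yaxis), color = "black", size = 2, data = summary_df)

# Error bars that do not inherit the global y = yaxis.og mapping.
p_fixed <- p +
  geom_errorbar(
    aes(x = species, ymin = ymin, ymax = ymax),
    data = summary_df, inherit.aes = FALSE
  )

# Data computed for the error bar layer: one row per species.
bars <- layer_data(p_fixed, 3)
```

Printing p_fixed draws the plot. Since inherit.aes = FALSE also drops the global color mapping, the bars are drawn in the default color rather than per species.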

huangapple
  • Posted on May 8, 2023, 00:22:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76195035.html