Big change in computational time if I store the ggplot object before saving the plot

Question

Please have a look at the reprex below.
Even on a very seasoned laptop, it takes about one minute to draw a large colored scatterplot with ggplot2 and save it as a png file.

What puzzles me is that if I slightly change the last part of the code and write

#plot
gpl <- ggplot(df3_clip, aes(x, y)) +
    ## geom_point(aes(color = dist), shape=46, alpha=.01) +
    geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt

ggsave(gpl, "pulse.png", dpi=300)

i.e. I store the ggplot2 plot as gpl and then explicitly save gpl, then the code takes forever to complete (actually, I do not even know whether it ever finishes).
Does anyone know why this happens?
Since I usually write my code as gpl <- ggplot(...), I wonder if I have been giving up performance all along.

Many thanks!

rm(list=ls())

library(Rcpp)
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(purrr)
library(scattermore)
library(tictoc)

tic()

opt = theme(legend.position  = "none",
            panel.background = element_rect(fill="black"),
            axis.ticks       = element_blank(),
            panel.grid       = element_blank(),
            axis.title       = element_blank(),
            axis.text        = element_blank())



## #bedhead
cppFunction('DataFrame createTrajectory(int n, double x0, double y0, 
            double a, double b) {
            // create the columns
            NumericVector x(n);
            NumericVector y(n);
            x[0]=x0;
            y[0]=y0;
            for(int i = 1; i < n; ++i) {
            x[i] = sin(x[i-1]*y[i-1]/b)*y[i-1]+cos(a*x[i-1]-y[i-1]);
            y[i] = x[i-1]+sin(y[i-1])/b;
            }
            // return a new data frame
            return DataFrame::create(_["x"]= x, _["y"]= y);
            }
            ')

a=1
b=0.75

df3=createTrajectory(4000000, 1, 1, a, b)

#something new
#color by dist from origin
eu_dist <- function(x1, y1, x2, y2) {
  sqrt((x1-x2)^2 + (y1-y2)^2)
}


df3$dist <- map2_dbl(df3$x, df3$y, ~eu_dist(.x, .y, 0, 0))

pulse_pal <- colorRampPalette(c("#FE1BE1", "#A300FF", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal2 <- colorRampPalette(c("#FE1BE1", "#FE1BE1", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal3 <- colorRampPalette(c("#FE1BE1", "#AA1BFE", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal4 <- colorRampPalette(c("#AA1BFE", "#57F7F5", "#57F7F5"))

#clip outer points
xmax <- max(df3$x)/2.5
xmin <- min(df3$x)/2.5
ymax <- max(df3$y)/2.5
ymin <- min(df3$y)/2.5

df3_clip <- df3 |> 
  filter(x > xmin & x < xmax) |> 
  filter(y > ymin & y < ymax)

print("df3clip ready")
#> [1] "df3clip ready"
    
#plot
ggplot(df3_clip, aes(x, y)) + 
    ## geom_point(aes(color = dist), shape=46, alpha=.01) +
    geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
    
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt

ggsave("pulse.png", dpi=300)
#> Saving 7 x 5 in image

toc()
#> 63.8 sec elapsed

sessionInfo()
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#>
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.13.so
#>
#> locale:
#>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
#>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
#>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
#>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] tictoc_1.1      scattermore_0.8 purrr_1.0.1     dplyr_1.1.0
#> [5] ggplot2_3.4.0   Rcpp_1.0.9
#>
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.2.2    pillar_1.8.1      highr_0.9         R.methodsS3_1.8.2
#>  [5] R.utils_2.12.1    tools_4.2.2       digest_0.6.30     evaluate_0.17
#>  [9] lifecycle_1.0.3   tibble_3.1.8      gtable_0.3.1      R.cache_0.16.0
#> [13] pkgconfig_2.0.3   rlang_1.0.6       reprex_2.0.2      cli_3.6.0
#> [17] yaml_2.3.6        xfun_0.34         fastmap_1.1.0     withr_2.5.0
#> [21] styler_1.8.0      stringr_1.5.0     knitr_1.40        systemfonts_1.0.4
#> [25] generics_0.1.3    fs_1.5.2          vctrs_0.5.2       tidyselect_1.2.0
#> [29] grid_4.2.2        glue_1.6.2        R6_2.5.1          textshaping_0.3.6
#> [33] fansi_1.0.4       rmarkdown_2.17    farver_2.1.1      magrittr_2.0.3
#> [37] scales_1.2.1      htmltools_0.5.3   colorspace_2.0-3  ragg_1.2.4
#> [41] labeling_0.4.2    utf8_1.2.3        stringi_1.7.12    munsell_0.5.0
#> [45] R.oo_1.25.0

<sup>Created on 2023-03-03 with reprex v2.0.2</sup>

Answer 1

Score: 3

You are calling ggsave() with arguments matched by position rather than by name. You have the plot in the first position, but the first argument of ggsave() is filename and the second is plot, so the plot object is taken as the file name and "pulse.png" as the plot.
The easiest way to make your code run is to name the arguments explicitly:

ggsave(plot = gpl, filename = "pulse.png", dpi = 300)
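
To check how R binds the arguments of the call from the question, one option is to inspect it with match.call(), which matches a call against a function's formal arguments without evaluating it. The following is only a minimal sketch, assuming ggplot2 is installed; the variable names first_two, bound and named are illustrative and not part of the original post.

```r
library(ggplot2)

# The first two formal arguments of ggsave() are `filename` and `plot`.
first_two <- names(formals(ggsave))[1:2]
print(first_two)

# match.call() reports how the arguments of the original call would be bound.
# The call is quoted, not evaluated, so `gpl` does not need to exist here.
bound <- match.call(ggsave, quote(ggsave(gpl, "pulse.png", dpi = 300)))
print(bound)  # the plot object lands in `filename`, the string in `plot`

# With named arguments, the order in which they are written no longer matters.
named <- match.call(
  ggsave,
  quote(ggsave(plot = gpl, filename = "pulse.png", dpi = 300))
)
print(named)
```

Passing the plot positionally also works as long as the file name comes first, e.g. ggsave("pulse.png", gpl, dpi = 300); naming the arguments simply removes the dependence on order.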
