Big change in computational time if I store the ggplot object before saving the plot
Question
Please have a look at the reprex below.
Even on a fairly old laptop, it takes about one minute to draw a large colored scatterplot with ggplot2 and save it as a PNG file.
What puzzles me is that if I slightly change the last part of the code and write
#plot
gpl <- ggplot(df3_clip, aes(x, y)) +
  ## geom_point(aes(color = dist), shape=46, alpha=.01) +
  geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt
ggsave(gpl, "pulse.png", dpi=300)
i.e. I store the ggplot2 plot as gpl and then explicitly save gpl, the code takes forever to complete (I do not even know whether it ever finishes).
Does anyone know why this happens?
Since I usually write my code as gpl <- ggplot(...), I wonder whether I have been giving up performance all along.
Many thanks!
rm(list=ls())
library(Rcpp)
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(purrr)
library(scattermore)
library(tictoc)
tic()
opt = theme(legend.position = "none",
            panel.background = element_rect(fill="black"),
            axis.ticks = element_blank(),
            panel.grid = element_blank(),
            axis.title = element_blank(),
            axis.text = element_blank())
## #bedhead
cppFunction('DataFrame createTrajectory(int n, double x0, double y0,
            double a, double b) {
  // create the columns
  NumericVector x(n);
  NumericVector y(n);
  x[0]=x0;
  y[0]=y0;
  for(int i = 1; i < n; ++i) {
    x[i] = sin(x[i-1]*y[i-1]/b)*y[i-1]+cos(a*x[i-1]-y[i-1]);
    y[i] = x[i-1]+sin(y[i-1])/b;
  }
  // return a new data frame
  return DataFrame::create(_["x"]= x, _["y"]= y);
}
')
a=1
b=0.75
df3=createTrajectory(4000000, 1, 1, a, b)
#something new
#color by dist from origin
eu_dist <- function(x1, y1, x2, y2) {
sqrt((x1-x2)^2 + (y1-y2)^2)
}
df3$dist <- map2_dbl(df3$x, df3$y, ~eu_dist(.x, .y, 0, 0))
pulse_pal <- colorRampPalette(c("#FE1BE1", "#A300FF", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal2 <- colorRampPalette(c("#FE1BE1", "#FE1BE1", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal3 <- colorRampPalette(c("#FE1BE1", "#AA1BFE", "#57F7F5", "#57F7F5", "#57F7F5"))
pulse_pal4 <- colorRampPalette(c("#AA1BFE", "#57F7F5", "#57F7F5"))
#clip outer points
xmax <- max(df3$x)/2.5
xmin <- min(df3$x)/2.5
ymax <- max(df3$y)/2.5
ymin <- min(df3$y)/2.5
df3_clip <- df3 |>
  filter(x > xmin & x < xmax) |>
  filter(y > ymin & y < ymax)
print("df3clip ready")
#> [1] "df3clip ready"
#plot
ggplot(df3_clip, aes(x, y)) +
  ## geom_point(aes(color = dist), shape=46, alpha=.01) +
  geom_scattermore(aes(color = dist), shape=46, alpha=.01) +
  scale_color_gradientn(colors=pulse_pal(500)) +
  opt
ggsave("pulse.png", dpi=300)
#> Saving 7 x 5 in image
toc()
#> 63.8 sec elapsed
sessionInfo()
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.13.so
#>
#> locale:
#> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
#> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
#> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] tictoc_1.1 scattermore_0.8 purrr_1.0.1 dplyr_1.1.0
#> [5] ggplot2_3.4.0 Rcpp_1.0.9
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.2.2 pillar_1.8.1 highr_0.9 R.methodsS3_1.8.2
#> [5] R.utils_2.12.1 tools_4.2.2 digest_0.6.30 evaluate_0.17
#> [9] lifecycle_1.0.3 tibble_3.1.8 gtable_0.3.1 R.cache_0.16.0
#> [13] pkgconfig_2.0.3 rlang_1.0.6 reprex_2.0.2 cli_3.6.0
#> [17] yaml_2.3.6 xfun_0.34 fastmap_1.1.0 withr_2.5.0
#> [21] styler_1.8.0 stringr_1.5.0 knitr_1.40 systemfonts_1.0.4
#> [25] generics_0.1.3 fs_1.5.2 vctrs_0.5.2 tidyselect_1.2.0
#> [29] grid_4.2.2 glue_1.6.2 R6_2.5.1 textshaping_0.3.6
#> [33] fansi_1.0.4 rmarkdown_2.17 farver_2.1.1 magrittr_2.0.3
#> [37] scales_1.2.1 htmltools_0.5.3 colorspace_2.0-3 ragg_1.2.4
#> [41] labeling_0.4.2 utf8_1.2.3 stringi_1.7.12 munsell_0.5.0
#> [45] R.oo_1.25.0
<sup>Created on 2023-03-03 with reprex v2.0.2</sup>
Answer 1
Score: 3
You are calling ggsave with arguments by position rather than by name. You have the plot in the first position, but the first argument of ggsave is the file name (filename); the plot is the second argument (plot). So storing the plot in a variable is not in itself the problem.
The easiest way to make your code run is to name the arguments explicitly:
ggsave(plot = gpl, filename = "pulse.png", dpi = 300)
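To make the argument order concrete, here is a minimal sketch. It assumes a small stand-in plot built from the built-in mtcars dataset (not the 4-million-point trajectory from the question) and writes to a temporary file; the names p and out are made up for this illustration.

```r
library(ggplot2)

# Stand-in plot for illustration only (assumption: mtcars instead of df3_clip)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# The first two formal arguments of ggsave() are `filename` and `plot`,
# in that order, so a positional call must pass the file name first.
first_two <- names(formals(ggsave))[1:2]
print(first_two)

# Hypothetical output path in the session's temporary directory
out <- file.path(tempdir(), "demo.png")

# Named arguments: the order no longer matters
ggsave(plot = p, filename = out, dpi = 300, width = 7, height = 5)

# Positional arguments in the documented order also work
ggsave(out, p, dpi = 300, width = 7, height = 5)

saved <- file.exists(out)
print(saved)
```

With the arguments swapped, as in ggsave(gpl, "pulse.png"), the plot object ends up where ggsave expects a file name, which is the mistake the answer points out. Naming the arguments avoids depending on their order at all.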