terra takes > 1 sec to load on linux and probably >5 sec on Windows and Mac #1440

Open
Jean-Romain opened this issue Feb 28, 2024 · 3 comments

Jean-Romain commented Feb 28, 2024

Loading terra, either with library() or through the namespace with terra::, takes more than a second on my machine (Linux): somewhere between 1.3 and 1.7 seconds. This is huge! And I guess it is much more on Windows and Mac, probably close to 5 seconds.

This is very problematic for code that actually takes milliseconds to run. The first run may take 1.5 s while the second takes 150 ms. On my side the main problem is that the examples in my package documentation, which are supposed to take something like 100 ms, actually take 1.5 seconds on the first run. This is OK for R CMD check on Linux, but R CMD check on Windows and Mac is failing because the examples take more than 5 seconds. All because I'm reading a small raster with terra.

Another issue is that it is practically impossible to debug C++ code with valgrind if a terra function is involved in the reproducible example, because under valgrind this takes several minutes.

In a fresh session

t0 = Sys.time() ; library(terra) ; Sys.time()- t0
#> terra 1.7.71
#> Time difference of 1.510221 secs
t0 = Sys.time() ; r = terra::rast() ; Sys.time()- t0
#> Time difference of 1.534728 secs

As a comparison, dplyr takes 0.004 sec to load, Rcpp 0.002 sec, ggplot2 0.03 sec, and sf 0.3 sec (which is huge).

dimfalk commented Apr 5, 2024

@Jean-Romain Hm, I can't confirm that loading {terra} takes this long on Windows by default, at least based on the median time (although terra does seem to be an outlier, with a max of approx. 3 sec):

mbm <- microbenchmark::microbenchmark(library(dplyr),
                                      library(Rcpp),
                                      library(ggplot2),
                                      library(sf),
                                      library(terra),
                                      
                                      times = 1000)
  
mbm
#> Unit: microseconds
#>              expr   min    lq      mean median    uq       max neval
#>    library(dplyr) 103.6 105.9  297.9007  108.1 110.4  177550.8  1000
#>     library(Rcpp) 103.5 106.2  120.3530  108.2 110.9    5399.9  1000
#>  library(ggplot2) 103.3 106.4  196.4520  108.3 110.5   81636.3  1000
#>       library(sf) 103.4 106.0  490.5340  108.1 110.0  361355.4  1000
#>    library(terra) 103.8 106.3 3115.3979  108.1 110.2 3002136.3  1000

ggplot2::autoplot(mbm)

sessionInfo()
#> R version 4.3.3 (2024-02-29 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=German_Germany.utf8  LC_CTYPE=German_Germany.utf8   
#> [3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C                   
#> [5] LC_TIME=German_Germany.utf8    
#> 
#> time zone: Europe/Berlin
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggplot2_3.5.0 terra_1.7-71  Rcpp_1.0.12   dplyr_1.1.4   sf_1.0-16    
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtable_0.3.4          compiler_4.3.3        tidyselect_1.2.1     
#>  [4] reprex_2.1.0          scales_1.3.0          yaml_2.3.8           
#>  [7] fastmap_1.1.1         R6_2.5.1              generics_0.1.3       
#> [10] microbenchmark_1.4.10 classInt_0.4-10       knitr_1.45           
#> [13] tibble_3.2.1          units_0.8-5           munsell_0.5.0        
#> [16] R.cache_0.16.0        DBI_1.2.2             pillar_1.9.0         
#> [19] R.utils_2.12.3        rlang_1.1.3           utf8_1.2.4           
#> [22] xfun_0.43             fs_1.6.3              cli_3.6.2            
#> [25] withr_3.0.0           magrittr_2.0.3        class_7.3-22         
#> [28] digest_0.6.35         grid_4.3.3            rstudioapi_0.16.0    
#> [31] lifecycle_1.0.4       R.methodsS3_1.8.2     R.oo_1.26.0          
#> [34] vctrs_0.6.5           KernSmooth_2.23-22    proxy_0.4-27         
#> [37] evaluate_0.23         glue_1.7.0            farver_2.1.1         
#> [40] styler_1.10.2         codetools_0.2-20      colorspace_2.1-0     
#> [43] fansi_1.0.6           e1071_1.7-14          rmarkdown_2.26       
#> [46] purrr_1.0.2           pkgconfig_2.0.3       tools_4.3.3          
#> [49] htmltools_0.5.8

Jean-Romain commented Apr 5, 2024

@dimfalk my test was on Linux. I re-ran it for dplyr, ggplot2 and co., and I probably made a mistake in my first message: the timing is closer to 0.1 sec than 0.001 sec. I probably made the same error as you, loading the libs one after the other. But if you load ggplot2 first, the timing for dplyr becomes 0.0001 sec. Each lib must be benchmarked only once, in a fresh session.

Anyway, you have the same issue. You can't microbenchmark this 1000 times: only the first run is slow. After that the libs are already loaded and the following repetitions are almost instantaneous. Only the first run matters, which is likely the "max" one. Like me, you see a tenfold difference.
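
One way to get a fresh session per measurement, as a minimal sketch: start a new Rscript process for every run and time only the library() call inside it. The helper name `time_fresh_load` and its `reps` argument are made up for illustration, and the sketch assumes Rscript sits in R's own bin directory.

```r
# Time how long a package takes to attach in a brand-new R process.
# `time_fresh_load` is a hypothetical helper, not part of any package.
time_fresh_load <- function(pkg, reps = 3) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # The child process times only the library() call, so the start-up
  # cost of R itself is not included in the reported value.
  expr <- sprintf(
    paste0(
      "t0 <- Sys.time(); ",
      "suppressPackageStartupMessages(library(%s)); ",
      "cat(as.numeric(difftime(Sys.time(), t0, units = 'secs')))"
    ),
    pkg
  )
  vapply(seq_len(reps), function(i) {
    # Each iteration launches a separate process, so nothing is cached
    # inside an R session between repetitions.
    out <- system2(rscript, c("--vanilla", "-e", shQuote(expr)), stdout = TRUE)
    as.numeric(out[length(out)])
  }, numeric(1))
}

# Example (assumes terra and sf are installed):
# sapply(c("terra", "sf"), time_fresh_load)
```

The operating system may still cache the package files on disk between runs, so the very first repetition can come out slower than the rest.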

dimfalk commented Apr 5, 2024

@Jean-Romain Oopsie, newbie mistake - my bad! 😏

At least it explains why the distributions are similar to this extent...

I benchmarked the following libs manually a few times, using a fresh session:

# terra (in sec): 
# c(3.5, 3.39, 3.72, 3.56, 3.45, 3.59, 3.47, 3.46, 3.51, 3.42)

# sf (in sec):
# c(0.38, 0.54, 0.48, 0.46, 0.46, 0.48, 0.46, 0.51, 0.50, 0.47)
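
For a quick comparison, the manual timings above can be summarised as follows. This is only a sketch: the two vectors are copied from the values listed above, and the variable names are illustrative.

```r
# Manual fresh-session load times in seconds, copied from the lists above.
terra_secs <- c(3.5, 3.39, 3.72, 3.56, 3.45, 3.59, 3.47, 3.46, 3.51, 3.42)
sf_secs    <- c(0.38, 0.54, 0.48, 0.46, 0.46, 0.48, 0.46, 0.51, 0.50, 0.47)

median(terra_secs)  # 3.485
median(sf_secs)     # 0.475

# Ratio of the medians: how many times slower terra loads than sf here.
ratio <- median(terra_secs) / median(sf_secs)
round(ratio, 1)     # 7.3
```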
