geom_ribbon() > geom_area() speed difference #5788
I guess the reason for this is the default position argument, which defaults to "identity" for ribbon and to "stack" for area, and stacking may be what makes it slow. If I switch to:
dat |>
  ggplot() +
  geom_area(aes(x = x, y = y), position = "identity")
I seem to get the same speed. Perhaps the function could check whether there are any group or fill/colour aesthetics and, if not, change the position argument to "identity"?
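As a user-side stopgap for the suggestion above, a thin wrapper can pin both defaults. This is only a sketch: geom_area_identity() is a hypothetical helper name, not part of ggplot2, and it assumes the data hold a single series with nothing to stack.

```r
library(ggplot2)

# Hypothetical helper (not a ggplot2 function): geom_area() with the stat and
# position pinned to "identity", so no stacking step is run.
# Only appropriate when there is a single series and nothing to stack.
geom_area_identity <- function(mapping = NULL, data = NULL, ...) {
  geom_area(
    mapping = mapping,
    data = data,
    stat = "identity",
    position = "identity",
    ...
  )
}

dat <- data.frame(x = 1:100, y = rnorm(100) + 5)
p <- ggplot(dat) + geom_area_identity(aes(x, y))
```

With grouped or filled areas the stacking is usually what you want, so the regular geom_area() defaults are the safer choice there.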
I can reproduce the issue, though the benchmarks are measuring how long it takes to
Here are more relevant benchmarks; for default options,
library(ggplot2)
dat <- data.frame(
x = 1:1e4,
y = rnorm(1e4) + 5
)
area <- ggplot(dat) + geom_area(aes(x, y))
ribbon <- ggplot(dat) + geom_ribbon(aes(x, ymin = 0, ymax = y))
ragg::agg_png(tempfile(fileext = ".png"))
res <- bench::mark(
area = print(area),
ribbon = print(ribbon),
check = FALSE,
min_iterations = 5
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
print(res)
#> # A tibble: 2 × 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
#> <bch:expr> <bch:tm> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#> 1 area 2.02s 2.1s 0.469 129MB 16.7 5 178 10.7s
#> 2 ribbon 72.24ms 72.6ms 9.04 24.2MB 3.62 5 2 552.8ms
#> # ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>
Both are about 70 milliseconds when stat and position are identity.
area <- ggplot(dat) +
  geom_area(aes(x, y), stat = "identity", position = "identity")
ribbon <- ggplot(dat) +
  geom_ribbon(aes(x, ymin = 0, ymax = y), stat = "identity", position = "identity")
res <- bench::mark(
area = print(area),
ribbon = print(ribbon),
check = FALSE,
min_iterations = 5
)
print(res)
#> # A tibble: 2 × 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
#> <bch:expr> <bch:tm> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#> 1 area 70.6ms 72.1ms 13.8 20.1MB 2.30 6 1 435ms
#> 2 ribbon 70.2ms 71.4ms 14.0 19.5MB 5.61 5 2 356ms
#> # ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>
Created on 2024-03-21 with reprex v2.1.0
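To see which defaults actually differ between the two layers, the layer objects can be inspected directly. This is a small sketch: layer_defaults() is a hypothetical helper, it relies on layers exposing stat and position fields, and the default stat for geom_area() depends on the installed ggplot2 version, so no output is shown.

```r
library(ggplot2)

# Hypothetical helper: return the class names of the stat and position
# objects attached to a layer.
layer_defaults <- function(layer) {
  c(
    stat = class(layer$stat)[1],
    position = class(layer$position)[1]
  )
}

area_defaults <- layer_defaults(geom_area())
ribbon_defaults <- layer_defaults(geom_ribbon())
print(rbind(area = area_defaults, ribbon = ribbon_defaults))
```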
Nice work dude! Yeah I knew something was wrong in my benchmark, should have realized I could've just forced it to write to file ;-).
I was recently making some figures in which I wanted a simple geom_area() call on something with quite a few points. However, I noticed that my computer was waiting for a long time when calling the function. I had previously plotted something with geom_ribbon() instead, and it was almost instant there. After playing around, it seems that by default on my machine geom_area() is way slower than geom_ribbon() for some reason.
See below for an attempt at a reprex. Benchmark results are a bit iffy though.
Created on 2024-03-20 with reprex v2.1.0