Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

very slow, Doesn't work with large datasets #89

Open
adminy opened this issue Jan 7, 2022 · 6 comments
Open

very slow, Doesn't work with large datasets #89

adminy opened this issue Jan 7, 2022 · 6 comments

Comments

@adminy
Copy link

adminy commented Jan 7, 2022

library(eulerr)


fit <- euler(c( "A" = 55, "B" = 810, "C" = 102, "D" = 364, "E" = 101, "F" = 24,
            "G" = 34, "H" = 61, "I" = 194, "J" = 107, "K" = 53, "L" = 75, "M" = 11,
            "N" = 65, "O" = 16, "P" = 82, "Q" = 13, "F&O" = 5, "G&O" = 5, "F&G" = 5,
            "D&I" = 47, "D&E" = 33, "K&L" = 9, "A&K" = 7, "K&N" = 7, "A&N" = 7, "A&L" = 7,
            "K&P" = 7, "A&P" = 7, "L&N" = 7, "N&P" = 7, "C&K" = 7, "A&C" = 7, "L&P" = 7,
            "E&G" = 6, "J&K" = 7, "A&J" = 7, "C&O" = 5, "G&H" = 4, "C&N" = 7, "J&N" = 7,
            "E&F" = 5, "C&F" = 5, "C&L" = 7, "J&L" = 7, "C&P" = 7, "J&P" = 7, "C&G" = 5,
            "D&K" = 15, "C&J" = 7, "A&D" = 12, "D&L" = 12, "E&O" = 3, "E&H" = 4, "C&I" = 6,
            "C&D" = 8, "D&H" = 7, "D&N" = 7, "D&P" = 7, "D&J" = 7, "C&E" = 3, "H&O" = 1,
            "B&D" = 15, "B&C" = 11, "F&H" = 1, "E&M" = 1, "B&E" = 8, "B&K" = 7, "I&K" = 2,
            "A&B" = 7, "B&N" = 7, "B&L" = 7, "B&P" = 7, "D&F" = 3, "B&J" = 7, "I&L" = 2,
            "C&H" = 1, "D&O" = 1, "D&G" = 1),
        shape = "ellipse")

# ^^^^^^^ takes forever
fit$stress
fit$diagError

plot(fit)

Meanwhile the browser Venn.js is instant, just less accurate.

Can there be some technique applied to speed it up or at least specify a stopping point for accuracy?

Thanks

@jolars
Copy link
Owner

jolars commented Jan 7, 2022

Hm, yes I agree that there should be an option to set tolerance and maximum number of iterations.

It looks like an easy fix too. I'm happy to consider pull requests if you or anyone else wants to contribute.

@adminy
Copy link
Author

adminy commented Jan 7, 2022

I was going to parametrise it but I noticed my N is 131071.
its stuck at getting stats:

stats::nlm(f = optim_final_loss,
                                 p = pars,
                                 areas = areas_disjoint,
                                 circle = circle,
                                 iterlim = 1e6)$estimate

Even with itterlim to 1 its still taking hours.

Either the algorithm is slow or there is a bug.
Also I don't see how my N is so large with just 28 nodes.

@jolars
Copy link
Owner

jolars commented Jan 7, 2022

28 is not a small number, and I'm not actually sure that it makes sense to even try to use an Euler diagram here since the output is bound to be widely misleading. Finding good approximate Euler diagrams is hard even at 4 sets.

The problem is that eulerr tries to compute the areas of the 2^28 - 1 = 268435455 possible intersections in the diagram, which is obviously not going to end well.

It sounds like venn.js is doing something differently since it's as you say much faster, but I'm not really sure how it's possible to avoid considering all the intersections.

@jolars
Copy link
Owner

jolars commented Jan 7, 2022

Or, hm, actually it should absolutely be possible to alleviate this problem by ruling out some of the possible intersections, i.e. if A is not intersecting with C, then obviously A&C&whatever will also be empty, so yeah, this should be possible to fix (at least partially).

@adminy
Copy link
Author

adminy commented Jan 7, 2022

Agreed. Its not the amount of nodes that matter, its the overlaps.

I was thinking to do something like start off with just one overlap and keep increasing the overlaps until I find the outlier overlap that's causing the algorithm to stress over finding a solution.

@csijcs
Copy link

csijcs commented Apr 6, 2022

I'm having a similar issue, any solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants