Add C-minHash variant #203

Open
icsa opened this issue Mar 19, 2023 · 11 comments

Comments

icsa commented Mar 19, 2023

C-MinHash (Circulant MinHash) needs only two permutations in practice, versus the K independent permutations used by classical MinHash.
This MinHash variant could save significant storage space for large-scale data, since only two permutations have to be kept.

source: https://arxiv.org/abs/2109.03337
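
For illustration, here is a minimal sketch of the idea as I read it from the paper: an initial permutation `sigma` is applied to the data once, and a second permutation `pi` is reused K times, shifted circularly by k positions for the k-th hash. The function names (`make_permutations`, `c_minhash`, `estimate_jaccard`) and the list-based permutations are my own assumptions for this example; none of this is existing datasketch API, and it is not tuned for performance.

```python
import random

def make_permutations(dim, seed=0):
    """Draw the two permutations of range(dim) that C-MinHash needs (hypothetical helper)."""
    rng = random.Random(seed)
    sigma = list(range(dim))
    rng.shuffle(sigma)
    pi = list(range(dim))
    rng.shuffle(pi)
    return sigma, pi

def c_minhash(items, sigma, pi, num_hashes):
    """Return num_hashes hash values for a non-empty set of item indices in range(dim).

    Assumes num_hashes <= dim, as in the paper's setting.
    """
    dim = len(pi)
    # Initial permutation: item i moves to position sigma[i].
    positions = [sigma[i] for i in items]
    signature = []
    for k in range(1, num_hashes + 1):
        # pi shifted right by k: the shifted value at position j is pi[(j - k) % dim].
        signature.append(min(pi[(j - k) % dim] for j in positions))
    return signature

def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching hash values, the usual MinHash estimator."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

sigma, pi = make_permutations(64, seed=1)
sig_a = c_minhash({1, 2, 3, 10}, sigma, pi, num_hashes=16)
sig_b = c_minhash({1, 2, 3, 11}, sigma, pi, num_hashes=16)
print(len(sig_a), estimate_jaccard(sig_a, sig_b))
```

Note that the signature still holds K hash values per set; what shrinks is the number of stored permutations, from K to two.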

Owner

ekzhu commented Mar 19, 2023

This is interesting! From my quick read of the abstract, it seems the improvement is in reducing the number of permutation functions, not in reducing the number of hash values stored. Am I missing something?

Author

icsa commented Mar 19, 2023 via email

Author

icsa commented Mar 19, 2023 via email

Owner

ekzhu commented Mar 20, 2023 via email

Author

icsa commented Mar 20, 2023 via email

Owner

ekzhu commented Mar 20, 2023 via email

Owner

ekzhu commented Mar 20, 2023 via email

Author

icsa commented Mar 20, 2023 via email

Owner

ekzhu commented Mar 20, 2023 via email

Author

icsa commented Mar 20, 2023 via email

Owner

ekzhu commented Mar 21, 2023

Looking forward to seeing what you build!
