Indexing - performance warning - full index can result in a large number of pairs #187
Comments
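@gajghaten I think the package emits this message through Python's logging module rather than the warnings module, which would explain why warnings.filterwarnings has no effect. Try raising the level of the package's logger:
import logging
logging.getLogger("recordlinkage").setLevel(logging.ERROR)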
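To illustrate the difference, here is a minimal sketch using only the standard library. It does not call recordlinkage itself: `emit_index_warning` is a hypothetical stand-in that logs a message the way the package is assumed to, and the message text is illustrative. It shows that a `warnings` filter leaves logged messages untouched, while raising the logger level hides them.

```python
import io
import logging
import warnings

# Logger name taken from the comment above; everything else is a stand-in.
logger = logging.getLogger("recordlinkage")
logger.setLevel(logging.WARNING)  # start from the usual default threshold
logger.propagate = False          # keep the demo output out of the root logger

# Capture log output in memory so the effect is easy to inspect.
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))


def emit_index_warning():
    # Hypothetical: mimics a library that reports via logging, not warnings.warn.
    logger.warning("indexing - performance warning - full index")


# 1) A warnings filter does not affect messages sent through logging.
warnings.filterwarnings("ignore")
emit_index_warning()
shown_with_filter = stream.getvalue()

# 2) Raising the logger level to ERROR drops WARNING-level messages.
stream.seek(0)
stream.truncate(0)
logger.setLevel(logging.ERROR)
emit_index_warning()
shown_after_setlevel = stream.getvalue()

print(repr(shown_with_filter))
print(repr(shown_after_setlevel))
```

Note that setting the level to ERROR hides every WARNING-level message from that logger, not only the indexing one.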
@rohitgarud Thanks! That worked! The following helps too!
@gajghaten Glad to help. But I think you should avoid the full index wherever you can and use blocking or a sorted neighbourhood index instead. For deduplication, a full index gives n choose 2, i.e. n*(n-1)/2, candidate pairs. That number grows quadratically with the number of records and slows down the record linkage or deduplication process significantly.
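As a rough sketch of the arithmetic above, the snippet below counts candidate pairs for a full index and for blocking on an exact key match within a single dataset. The function names and the sample keys are made up for illustration and are not part of recordlinkage.

```python
from collections import Counter


def full_index_pairs(n: int) -> int:
    """Number of unordered pairs among n records: n choose 2."""
    return n * (n - 1) // 2


def blocked_pairs(block_keys) -> int:
    """Pairs left when only records sharing a block key are compared."""
    sizes = Counter(block_keys).values()
    return sum(size * (size - 1) // 2 for size in sizes)


# Hypothetical blocking keys, e.g. a surname column.
keys = ["smith", "smith", "jones", "jones", "jones", "lee"]

full = full_index_pairs(len(keys))   # 6 * 5 / 2
blocked = blocked_pairs(keys)        # 1 (smith) + 3 (jones) + 0 (lee)

print(full, blocked)
print(full_index_pairs(10_000))      # full index on 10,000 records
```

With 10,000 records the full index already produces 49,995,000 pairs, which is why blocking pays off quickly as the data grows.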
Using warnings.filterwarnings("ignore") does not disable this warning.
I use this function on a dataframe much bigger than the one in the example above and that results in a bunch of these warnings being displayed on the screen.
Could I please get help in disabling them? There are not many resources online on how to disable these kinds of warnings.