ECMClassifier returns almost all candidate pairs #193
Comments
I think the binarize threshold is too low, so every feature vector is converted to all 1s and every pair comes back as a match. Try increasing the binarize threshold.
Thank you for the suggestion. Unfortunately, it does not seem to make any significant difference. I tried both lowering and increasing the threshold.
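To illustrate the point about binarizing, here is a minimal pure-Python sketch. It does not call the library; the `binarize` helper, the similarity scores, and the two thresholds are all made up for illustration, and the strict `>` comparison is an assumption. It only shows how a low threshold can collapse every feature vector to all 1s, leaving the classifier nothing to separate matches from non-matches.

```python
def binarize(vector, threshold):
    """Map each similarity score to 1 if it exceeds the threshold, else 0.

    Hypothetical helper for illustration; the strict '>' is an assumption.
    """
    return [1 if score > threshold else 0 for score in vector]


# Hypothetical similarity scores for three candidate pairs
# (one column per compared attribute).
candidate_pairs = [
    [0.95, 0.90, 1.00],  # likely a true match
    [0.55, 0.40, 0.35],  # likely a non-match
    [0.70, 0.35, 0.60],  # ambiguous
]

low = [binarize(v, 0.3) for v in candidate_pairs]
high = [binarize(v, 0.8) for v in candidate_pairs]

print(low)   # every vector becomes all 1s, so the pairs are indistinguishable
print(high)  # only the first pair keeps its 1s
```

If the binarized vectors in your data look like the `low` case, raising the threshold should change the result. If they already vary from pair to pair, the cause is probably somewhere else.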
I attempted to replicate my problem in the code snippet above. There are 5966 candidate pairs and my ECM classifier returns 5836 of them as matches.
Problem: I want to use ECMClassifier for entity matching. However, when I apply it to my dataset, almost all of the candidate pairs are identified as matches. Is there some parameter I can set to tweak the threshold for match vs. non-match, or am I missing something else here?
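For context on what a match vs. non-match threshold would mean here, below is a rough pure-Python sketch of the textbook Fellegi-Sunter decision rule that ECM-style classifiers estimate parameters for. It is not the library's code: `match_probability`, the `m`, `u`, and `p` values, and both cutoffs are hypothetical, and conditional independence between the compared attributes is assumed. The idea is that each pair gets a posterior match probability, and a stricter cutoff on that probability labels fewer pairs as matches.

```python
def match_probability(gamma, m, u, p):
    """Posterior P(match | gamma) for one binary comparison vector.

    gamma: list of 0/1 agreement indicators
    m[i]:  P(gamma[i] = 1 | match)      (hypothetical values below)
    u[i]:  P(gamma[i] = 1 | non-match)  (hypothetical values below)
    p:     prior share of matches among candidate pairs
    """
    like_match = p
    like_non_match = 1.0 - p
    for g, m_i, u_i in zip(gamma, m, u):
        like_match *= m_i if g == 1 else 1.0 - m_i
        like_non_match *= u_i if g == 1 else 1.0 - u_i
    return like_match / (like_match + like_non_match)


# Hypothetical parameters; in practice these would be estimated from the data.
m = [0.9, 0.8, 0.95]
u = [0.2, 0.1, 0.05]
p = 0.1

vectors = [[1, 1, 1], [0, 1, 1], [1, 0, 0]]
probs = [match_probability(g, m, u, p) for g in vectors]

# The same probabilities classified with two different cutoffs.
matches_default = [pr >= 0.5 for pr in probs]
matches_strict = [pr >= 0.9 for pr in probs]

print([round(pr, 3) for pr in probs])
print(matches_default)
print(matches_strict)
```

In this toy example the middle pair is a match at the 0.5 cutoff but not at 0.9. If the classifier gives access to per-pair probabilities, applying a cutoff like this by hand is one way to get a tunable threshold.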