Detect presence instead of frequency #37

Open
kripper opened this issue Feb 6, 2024 · 0 comments

For our use case (identifying certificate types), we want to retrieve documents that contain certain keywords, regardless of how many times each keyword occurs in a given document. A keyword that repeats many times in a document shouldn't score higher than one that appears only once.

Instead, the score should be determined by the number of distinct keywords that appear in the text.
Each keyword that appears should add its predefined keyword score to the total, once per document.

It is also desirable that keywords can consist of a single word or of multiple words separated by spaces (e.g., the keyword "certificate of origin" would have a higher predefined score than the keyword "certificate").
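
To make the request concrete, here is a minimal sketch of the scoring described above, written in plain Python and independent of this project's API. The names `KEYWORD_SCORES`, `tokenize`, `contains_phrase`, and `presence_score`, as well as the weights themselves, are hypothetical and only serve as an illustration.

```python
import re

# Hypothetical keyword weights, for illustration only.
KEYWORD_SCORES = {
    "certificate": 1.0,
    "certificate of origin": 3.0,
    "phytosanitary": 2.0,
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in tokens."""
    n = len(phrase_tokens)
    return any(
        tokens[i:i + n] == phrase_tokens
        for i in range(len(tokens) - n + 1)
    )


def presence_score(text, keyword_scores=KEYWORD_SCORES):
    """Sum the weight of every keyword present in text.

    Each keyword counts at most once, no matter how often it repeats.
    """
    tokens = tokenize(text)
    return sum(
        score
        for keyword, score in keyword_scores.items()
        if contains_phrase(tokens, tokenize(keyword))
    )


# Repetition does not raise the score: both calls give the same result.
once = presence_score("certificate")
many = presence_score("certificate certificate certificate")
```

Note that in this sketch a document containing "certificate of origin" also matches the shorter keyword "certificate", so both weights are added. Whether the longer keyword should suppress the shorter one is a separate design choice.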

Does this implementation support this use case?
