Filter by score #4609

Open
8 tasks
curquiza opened this issue Apr 30, 2024 · 1 comment
Labels
  • enhancement: New feature or improvement
  • impacts docs: This issue involves changes to Meilisearch's documentation
  • impacts integrations: This issue involves changes to Meilisearch's integrations
  • missing usage in PRD: Description of the feature usage is missing in the PRD
  • prototype available: You can test this feature using the available prototype
Comments

@curquiza
Member

curquiza commented Apr 30, 2024

Related product team resources: PRD (internal only)

Motivation

Required by users

Usage

Filter by score

https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=74

TODO

Reminders when modifying the Setting API

  • Ensure the new setting route is at least tested by the test_setting_routes macro
  • Ensure Analytics are fully implemented
  • Ensure the dump serialization is consistent with the /settings route serialization; e.g., enum casing can differ (camelCase in the route and PascalCase in the dump)

Special cases when adding a setting for an experimental feature

  • ⚠️ API stability: The setting does not appear on the main settings route when the feature has never been enabled (e.g., mark it Unset when returned from the index in this situation; see an example)
  • The setting cannot be set when the feature is disabled, either by the main settings route or the subroute (see validate_settings function)
  • If possible, the setting is reset when the feature is disabled (hard if it requires reindexing)

Impacted teams

@meilisearch/integration-team @meilisearch/docs-team

@curquiza added the impacts docs, impacts integrations, and missing usage in PRD labels Apr 30, 2024
@curquiza added this to the v1.9.0 milestone Apr 30, 2024
@curquiza added the enhancement label Apr 30, 2024
@dureuill
Contributor

dureuill commented May 7, 2024

Hello 👋

A prototype is available for filtering by ranking score!

Getting the prototype

You need to start from a fresh database (remove the previously used data.ms) and use the following Docker image:

docker run -it --rm -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-score-threshold-0

Using the prototype

The prototype adds a new optional parameter to search requests: rankingScoreThreshold.

If present, rankingScoreThreshold must be a number in the [0.0, 1.0] range.
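
As a minimal sketch, a client could check the range before sending a request. The helper name `build_search_body` is hypothetical and not part of any Meilisearch SDK; only the `rankingScoreThreshold` parameter and its range come from the description above, and `q` is the regular search query parameter.

```python
import json
from typing import Optional


def build_search_body(query: str, ranking_score_threshold: Optional[float] = None) -> str:
    """Build a JSON search body, adding rankingScoreThreshold only when given.

    Hypothetical client-side helper, for illustration only.
    """
    body = {"q": query}
    if ranking_score_threshold is not None:
        # The prototype requires a number in the [0.0, 1.0] range.
        if not 0.0 <= ranking_score_threshold <= 1.0:
            raise ValueError("rankingScoreThreshold must be in [0.0, 1.0]")
        body["rankingScoreThreshold"] = ranking_score_threshold
    return json.dumps(body)


print(build_search_body("badman", 0.2))
```

The resulting body would be sent to the index's search route. Omitting the parameter leaves the request unchanged, since the parameter is optional.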

During the search, any set of results whose _rankingScore is below the rankingScoreThreshold is discarded instead of returned. The corresponding documents are also removed from the facet distribution and don’t count towards the totalHits and estimatedTotalHits.
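
The intended effect can be modeled with a small sketch. This is a toy model of the behavior described above, not the engine's implementation: the function `apply_threshold`, the sample documents, and the `genre` facet are all made up for illustration.

```python
from collections import Counter


def apply_threshold(hits, threshold, facet):
    """Keep only hits whose _rankingScore is at or above the threshold."""
    kept = [h for h in hits if h["_rankingScore"] >= threshold]
    # Facet counts and the hit total are computed from the kept documents only.
    distribution = Counter(h[facet] for h in kept)
    return {
        "hits": kept,
        "facetDistribution": {facet: dict(distribution)},
        "totalHits": len(kept),
    }


# Hypothetical sample data.
hits = [
    {"id": 1, "genre": "action", "_rankingScore": 0.9},
    {"id": 2, "genre": "drama", "_rankingScore": 0.5},
    {"id": 3, "genre": "action", "_rankingScore": 0.1},
]
result = apply_threshold(hits, 0.3, "genre")
print(result["totalHits"], result["facetDistribution"])
```

With a threshold of 0.3, document 3 is discarded, so it no longer counts towards the `action` facet or the total.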

Warning

For performance reasons, the current implementation does not always return correct values for totalHits and estimatedTotalHits.

Explanation

When the `limit` number of results is reached, the algorithm stops without computing the `_rankingScore` of the remaining, unreturned documents. As a result, Meilisearch cannot determine whether those documents meet the threshold. The current behavior is to keep them in the set of possible candidates. This means that increasing `limit` or requesting the next page can decrease `totalHits` and `estimatedTotalHits` (as well as the number of pages).

Fixing this limitation requires computing the results exhaustively, which would be very performance-intensive and difficult to achieve at all for vector sort.
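
The overcount can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not the actual algorithm: documents are assumed to be scored one by one in ranking order, and the function name `estimated_total_hits` and the sample scores are hypothetical.

```python
def estimated_total_hits(scores, threshold, limit):
    """Toy model of the reported total; scores are in ranking order (best first)."""
    returned = 0
    examined = 0
    for score in scores:
        if returned == limit:
            break  # limit reached: the remaining documents are never scored
        examined += 1
        if score >= threshold:
            returned += 1
    unscored = len(scores) - examined
    # Unscored documents stay in the candidate set, so they inflate the total.
    return returned + unscored


scores = [0.9, 0.8, 0.4, 0.2, 0.1]  # hypothetical ranking scores
for limit in (1, 2, 3, 5):
    print(limit, estimated_total_hits(scores, 0.5, limit))
```

In this model the reported total is 5 for a `limit` of 1 or 2, because the unscored documents are still counted, and drops to 2 once the `limit` is 3 or more and every document has been scored.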

We're looking for feedback on this prototype, especially on the limitations stated above.

Caution

We do not recommend using this prototype in production. This is only for test purposes.

@curquiza added the prototype available label May 7, 2024