Add timeout support to AbstractVectorSimilarityQuery #13285

Open
wants to merge 4 commits into
base: main
from

kaivalnp
Contributor

@kaivalnp kaivalnp commented Apr 9, 2024

Description

Along the same lines as #13202, this adds timeout support to AbstractVectorSimilarityQuery, which performs similarity-based vector searches.

The graph search happens inside #scorer and may run past the configured QueryTimeout. When it does, we can terminate it early and return whatever partial results have been found.

For exact search, we return a lazy-loading iterator over all vectors, so that path is already covered by the TimeLimitingBulkScorer (unlike the exact search of AbstractKnnVectorQuery, which manually goes over all vectors during #rewrite to retain the topK).
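
The early-termination idea described above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene classes from this PR: `EarlyTerminationSketch`, `PartialResults`, and the `BooleanSupplier` timeout check are hypothetical stand-ins, and a linear scan over scores takes the place of the real graph search. It only shows the general pattern of checking a timeout between candidates and returning the hits gathered so far.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical result holder: the hits found, plus whether the search was cut short. */
final class PartialResults {
  final List<Integer> docs;
  final boolean timedOut;

  PartialResults(List<Integer> docs, boolean timedOut) {
    this.docs = docs;
    this.timedOut = timedOut;
  }
}

/** Illustrative only; a simple loop stands in for the graph traversal. */
final class EarlyTerminationSketch {
  /**
   * Visits candidates in order and keeps those scoring at or above the threshold.
   * Before each candidate it asks shouldExit (a stand-in for a query timeout check)
   * and, if the budget is spent, returns the hits collected so far.
   */
  static PartialResults search(float[] scores, float threshold, BooleanSupplier shouldExit) {
    List<Integer> hits = new ArrayList<>();
    for (int doc = 0; doc < scores.length; doc++) {
      if (shouldExit.getAsBoolean()) {
        return new PartialResults(hits, true); // partial results, flagged as timed out
      }
      if (scores[doc] >= threshold) {
        hits.add(doc);
      }
    }
    return new PartialResults(hits, false); // complete results
  }
}
```

In this sketch the caller can inspect `timedOut` to tell partial results from complete ones; whether and how the actual change surfaces that distinction is not stated in this thread.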


This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Apr 24, 2024
@benwtrent
Member

This seems sane to me.

@vigyasharma what do you think?

@benwtrent
Member

@kaivalnp could you update CHANGES as well?

@kaivalnp
Contributor Author

Thanks @benwtrent! Added an entry now.

@github-actions github-actions bot removed the Stale label Apr 25, 2024
# Conflicts:
#	lucene/core/src/java/org/apache/lucene/search/ByteVectorSimilarityQuery.java
#	lucene/core/src/java/org/apache/lucene/search/FloatVectorSimilarityQuery.java
@kaivalnp
Contributor Author

kaivalnp commented May 9, 2024

Saw some merge conflicts after a recent commit and resolved them.

@kaivalnp
Contributor Author

Hi @benwtrent @vigyasharma could you help review this? Thanks!

      // Return a lazy-loading iterator
      return VectorSimilarityScorer.fromAcceptDocs(
          this,
          boost,
          createVectorScorer(context),
          new BitSetIterator(acceptDocs, cardinality),
          resultSimilarity);
    } else if (results.scoreDocs.length == 0) {
      return null;
Contributor

We don't return null any more when there are 0 results?

Contributor

oh never mind I see this got moved to VectorSimilarityScorer

Contributor Author

Yes, it was common in a couple of places so I moved it there to reduce repetition

@@ -105,13 +116,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
LeafReader leafReader = context.reader();
Bits liveDocs = leafReader.getLiveDocs();

QueryTimeout queryTimeout = searcher.getTimeout();
Contributor

hmm what if there is no timeout? will queryTimeout be null? In that case do we still want to create a TimeLimitingKnnCollectorManager?

Contributor Author

@kaivalnp kaivalnp May 13, 2024

will queryTimeout be null?

Yes, this is null when a timeout isn't set

In this case, the TimeLimitingKnnCollectorManager returns an unwrapped KnnCollector, which adds no time-checking overhead (not even null checks) during graph search (this is also visible in benchmarks)
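
The behaviour described in this reply (wrap the collector only when a timeout is configured) can be sketched roughly as follows. This is not the Lucene source: every type here (`TimeoutSketch`, `CollectorSketch`, `PlainCollectorSketch`, `TimeLimitedCollectorSketch`, `CollectorManagerSketch`) is a hypothetical, simplified stand-in, and the real KnnCollector and TimeLimitingKnnCollectorManager have different and larger APIs. The point is only that the null check happens once, when the collector is created, so the untimed path never pays for it during the search itself.

```java
/** Hypothetical stand-in for a timeout check; not the Lucene QueryTimeout interface itself. */
interface TimeoutSketch {
  boolean shouldExit();
}

/** Hypothetical, simplified collector; the real KnnCollector has a larger surface. */
interface CollectorSketch {
  boolean earlyTerminated();

  void collect(int doc);

  int visited();
}

/** Collector with no time checks at all. */
final class PlainCollectorSketch implements CollectorSketch {
  private int visited;

  public boolean earlyTerminated() {
    return false;
  }

  public void collect(int doc) {
    visited++;
  }

  public int visited() {
    return visited;
  }
}

/** Wrapper that also reports early termination once the timeout fires. */
final class TimeLimitedCollectorSketch implements CollectorSketch {
  private final CollectorSketch delegate;
  private final TimeoutSketch timeout;

  TimeLimitedCollectorSketch(CollectorSketch delegate, TimeoutSketch timeout) {
    this.delegate = delegate;
    this.timeout = timeout;
  }

  public boolean earlyTerminated() {
    return timeout.shouldExit() || delegate.earlyTerminated();
  }

  public void collect(int doc) {
    delegate.collect(doc);
  }

  public int visited() {
    return delegate.visited();
  }
}

/** Decides once, at creation time, whether a collector needs the time-limiting wrapper. */
final class CollectorManagerSketch {
  private final TimeoutSketch timeout; // null means no timeout was configured

  CollectorManagerSketch(TimeoutSketch timeout) {
    this.timeout = timeout;
  }

  CollectorSketch newCollector() {
    CollectorSketch plain = new PlainCollectorSketch();
    // With no timeout, hand back the plain collector so the search loop does no time checks.
    return timeout == null ? plain : new TimeLimitedCollectorSketch(plain, timeout);
  }
}
```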


3 participants