Difference between get_batch_scores and get_scores method #10

soumya-ranjan-sahoo · 2020-08-24T18:39:55Z

Hi Team,

I need your help here!
To give you a brief overview, I have about 500k documents in my corpus and only about 7k query-document pairs, and I want to calculate the BM25 score for each of these individual pairs. To start with:

I have indexed all 500k documents.
I understand that I can use the get_scores method to get the BM25 scores for all 500k documents, which returns a vector of length 500k, and then look up the entry for the document in each query-document pair. For example, for the query with index i, the score for the query-document pair with index i is bm25score[i].
But this method takes ages to calculate the scores, so I was looking for a workaround.
Can the get_batch_scores method be of any help here? My guess is that it would only index the subset of documents passed to it, not all 500k documents.

My objective is to index 500k documents and then, given a query-document pair, calculate the BM25 score for that pair.

Thanks in advance!

soumya-ranjan-sahoo · 2020-09-16T09:48:51Z

Could someone kindly help me with this? I want to know how get_batch_scores differs from get_scores.
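For anyone landing on this question, here is a minimal sketch of how the two methods would usually relate. It assumes that get_batch_scores takes a query plus a list of document indices into the already-indexed corpus; that signature is an assumption and should be checked against the library source. TinyBM25 is a hypothetical stand-in written for illustration, not the library's code, and the k1 and b defaults and the smoothed IDF are choices made for this sketch. The point it shows: corpus-wide statistics (IDF, average document length) are computed once at index time, so scoring a subset only skips the per-document loop for documents that were not requested.

```python
import math
from collections import Counter


class TinyBM25:
    """Hypothetical minimal Okapi BM25 index (illustration only)."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        n = len(corpus)
        # Per-document statistics.
        self.doc_freqs = [Counter(doc) for doc in corpus]
        self.doc_len = [len(doc) for doc in corpus]
        # Corpus-wide statistics, computed once over ALL documents.
        self.avgdl = sum(self.doc_len) / n
        df = Counter(term for doc in corpus for term in set(doc))
        self.idf = {t: math.log(1 + (n - d + 0.5) / (d + 0.5)) for t, d in df.items()}

    def _score(self, query, i):
        # BM25 score of one query against the document at index i.
        freqs = self.doc_freqs[i]
        norm = 1 - self.b + self.b * self.doc_len[i] / self.avgdl
        return sum(
            self.idf.get(q, 0.0) * freqs[q] * (self.k1 + 1) / (freqs[q] + self.k1 * norm)
            for q in query
        )

    def get_scores(self, query):
        # Scores the query against every indexed document.
        return [self._score(query, i) for i in range(len(self.doc_freqs))]

    def get_batch_scores(self, query, doc_ids):
        # Scores the query against the listed document indices only,
        # still using the corpus-wide idf and avgdl.
        return [self._score(query, i) for i in doc_ids]


corpus = [["hello", "world"], ["bm25", "ranks", "documents"], ["hello", "bm25"]]
index = TinyBM25(corpus)
query = ["hello", "bm25"]

all_scores = index.get_scores(query)                 # one score per document
pair_score = index.get_batch_scores(query, [2])[0]   # score for document 2 only
assert pair_score == all_scores[2]
```

Under these assumptions, the 7k query-document pairs can each be scored with a one-element doc_ids list after indexing the full 500k corpus once, instead of producing a 500k-length vector per query and reading a single entry from it.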
