[FEATURE] Jina Reranker #974

Open
langchain4j opened this issue Apr 18, 2024 · 10 comments
Labels
enhancement New feature or request good first issue Good for newcomers P2 High priority

Comments

@langchain4j
Owner

Implement integration with Jina Reranker as a ScoringModel.

@langchain4j added the enhancement, good first issue, P3 Medium priority, and P2 High priority labels and removed the P3 Medium priority label on Apr 18, 2024
@One16-KS

One16-KS commented May 3, 2024

Hello, I'm comparing Jina and Cohere Reranking for a small PoC... It might be a good idea to create a langchain4j-jina module consistent with Cohere? If that is ok, I might help on this issue...

@langchain4j
Owner Author

Hi @One16-KS, that would be awesome! Please check #997, maybe you could use the same code for module setup, to avoid a lot of conflicts.

One16-KS pushed a commit to One16-KS/langchain4j that referenced this issue May 3, 2024
Module added according to langchain4j#973 in order to minimize conflicts.
Implementation in line with Cohere reranking
@One16-KS

One16-KS commented May 3, 2024

Hey @langchain4j
Will do! WIP as we speak... cheers
Feel free to have a look at the draft PR

@One16-KS

One16-KS commented May 3, 2024

Hey @langchain4j ,
Some questions/remarks: the Jina API for reranking has 2 extra request params that might be of interest to the user:

  • topN - the number of most relevant documents to return (defaults to all of them)
  • returnDocuments - whether to return the document text with each result (defaults to true, i.e. with the text)

The ScoringModel interface doesn't allow passing these params yet, but it could be an option, now that I think of it. I'm happy with the defaults for now, but feel free to give your opinion on the matter.
The result list returned by Jina is already ordered, but I decided to order it anyway to be safe and to depend less on the API's behavior.
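
One way to read the ordering step above, assuming the intent is to hand scores back in the same order as the input documents no matter how the API ranks them, is to sort the results by their original index. This is only a sketch and not the code from the PR; `RerankResult`, its fields, and `scoresInInputOrder` are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical shape of one reranker result: the position of the document
// in the request and the relevance score assigned to it.
class RerankResult {
    final int index;
    final double relevanceScore;

    RerankResult(int index, double relevanceScore) {
        this.index = index;
        this.relevanceScore = relevanceScore;
    }
}

class RerankOrdering {

    // Returns scores aligned with the original document order,
    // without relying on the order in which the API returned them.
    static List<Double> scoresInInputOrder(List<RerankResult> results) {
        List<RerankResult> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingInt((RerankResult r) -> r.index));
        List<Double> scores = new ArrayList<>();
        for (RerankResult r : sorted) {
            scores.add(r.relevanceScore);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Results as a reranker might return them: highest score first.
        List<RerankResult> ranked = Arrays.asList(
                new RerankResult(2, 0.91),
                new RerankResult(0, 0.40),
                new RerankResult(1, 0.05));
        System.out.println(scoresInInputOrder(ranked));
    }
}
```

Sorting a copy keeps the caller's list untouched, so the same results can still be used in ranked order elsewhere.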

Happy to make adjustments according to your feedback.
Cheers

@langchain4j
Owner Author

Hi @One16-KS, I am not sure those parameters are very interesting, since we can easily do that logic on our side.
I guess we can set returnDocuments=false, as we already have the content of the documents (unless they split documents further; is this the case?)
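
For illustration, the client-side logic mentioned above could look roughly like this: score every document, then pick the N best locally instead of asking the API for a top-N. This is a sketch under assumptions; `TopN.select` is a hypothetical helper and not part of LangChain4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TopN {

    // Returns the n documents with the highest scores, best first.
    // documents.get(i) is assumed to correspond to scores.get(i).
    static List<String> select(List<String> documents, List<Double> scores, int n) {
        if (documents.size() != scores.size()) {
            throw new IllegalArgumentException("documents and scores must have the same size");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            order.add(i);
        }
        // Sort indices by descending score; the sort is stable, so ties keep input order.
        order.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, order.size()); i++) {
            top.add(documents.get(order.get(i)));
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList("doc A", "doc B", "doc C");
        List<Double> scores = Arrays.asList(0.1, 0.9, 0.5);
        System.out.println(select(documents, scores, 2));
    }
}
```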

@One16-KS

One16-KS commented May 4, 2024

Hello @langchain4j, I did some investigation and they do not split the documents further. To be honest, I don't see the point of the param myself, as its only purpose would be to decrease the payload size?
I will ignore the extra params for now.
I'll update my PR, feel free to look at it when you have the time.
cheers

@langchain4j
Owner Author

@One16-KS exactly, decreasing the response size. If it is easy to always set that parameter to false, I would do it.
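
A minimal sketch of pinning that parameter to false when the request body is assembled. The JSON field names (`model`, `query`, `documents`, `return_documents`) and the model name in `main` are assumptions about the Jina API and should be checked against its reference; `RerankRequestSketch` is a hypothetical class, not the one from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RerankRequestSketch {

    // Builds the request body as a plain map; a JSON library would serialize it.
    // Field names are assumed, not taken from the issue.
    static Map<String, Object> build(String model, String query, List<String> documents) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", model);
        body.put("query", query);
        body.put("documents", new ArrayList<>(documents));
        // Always false: the caller already holds the document text,
        // so echoing it back would only enlarge the response.
        body.put("return_documents", false);
        return body;
    }

    public static void main(String[] args) {
        // The model name is a placeholder for illustration.
        Map<String, Object> body = build(
                "jina-reranker-v1-base-en",
                "What is a reranker?",
                Arrays.asList("first document", "second document"));
        System.out.println(body);
    }
}
```

Because the flag is not exposed as a method parameter, callers cannot turn it back on by accident.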

One16-KS pushed a commit to One16-KS/langchain4j that referenced this issue May 6, 2024
@One16-KS

One16-KS commented May 6, 2024

@langchain4j
gotcha, will do so
The PR is ready for review. I just need some guidance on the documentation, as I cannot find any documentation on reranking except in the examples.

@langchain4j
Owner Author

@One16-KS it seems that we are missing a "Scoring (Reranking) Models" section in https://docs.langchain4j.dev/category/integrations. Feel free to add it if you have capacity. Thank you!

@One16-KS

One16-KS commented May 7, 2024

Hey @langchain4j,
Will try to add some doc in my PR!
cheers


2 participants