Jina AI Embedding model integration #997

Open
lucifer-Hell wants to merge 8 commits into main
lucifer-Hell

Context

This PR integrates the Jina AI embedding model, as requested in issue 973.

Change

  1. No Jina SDK is available for Java, so this PR adds a client for the Jina API.
  2. Added embedding-generation methods for both a single input and multiple inputs.
  3. The default model is jina-embeddings-v2-base-en.
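
For illustration, the single-input and multi-input paths described above can share one request builder, with the single-input method delegating to the batch one. This is a minimal sketch and not the code from this PR: the class name `JinaEmbeddingRequestSketch`, the helper `quote`, and the JSON field names `model` and `input` are assumptions (the field names follow the common OpenAI-style embeddings request shape). Only the default model name comes from the description above. It sticks to Java 8 APIs and the standard library.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: builds the JSON body of an embeddings request.
class JinaEmbeddingRequestSketch {

    // Default model named in the PR description.
    static final String DEFAULT_MODEL = "jina-embeddings-v2-base-en";

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    // A real client would delegate this to a JSON library.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Batch path: one request carrying several input texts.
    // The "model" and "input" field names are assumptions.
    static String requestBody(String model, List<String> inputs) {
        String joined = inputs.stream()
                .map(JinaEmbeddingRequestSketch::quote)
                .collect(Collectors.joining(","));
        return "{\"model\":" + quote(model) + ",\"input\":[" + joined + "]}";
    }

    // Single-input path: delegates to the batch path with a one-element list.
    static String requestBody(String input) {
        return requestBody(DEFAULT_MODEL, Collections.singletonList(input));
    }

    public static void main(String[] args) {
        System.out.println(requestBody("hello world"));
    }
}
```

Routing the single-input call through the batch method keeps request construction in one place, so both paths stay consistent if the request shape changes.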

Checklist

Before submitting this PR, please check the following points:

  • I have added unit and integration tests for my change
  • All unit and integration tests in the module I have added/changed are green
  • All unit and integration tests in the core and main modules are green
  • I have added/updated the documentation
  • I have added an example in the examples repo (only for "big" features)
  • I have added my new module in the BOM (only when a new module is added)

Checklist for adding new embedding store integration

  • I have added a {NameOfIntegration}EmbeddingStoreIT that extends from either EmbeddingStoreIT or EmbeddingStoreWithFilteringIT

langchain4j added the P3 Medium priority label Apr 23, 2024
Owner

langchain4j left a comment

@lucifer-Hell thanks a lot! I have left some comments, please check.

<artifactId>langchain4j-jina-ai</artifactId>

<properties>
<maven.compiler.source>22</maven.compiler.source>
Owner

Please keep it compatible with Java 8. More info: https://github.com/langchain4j/langchain4j/blob/main/CONTRIBUTING.md
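
As a rough illustration of what Java 8 compatibility means at the source level (beyond the compiler setting in the POM), the sketch below uses Java 8 replacements for a few newer conveniences: `List.of` (added in Java 9), `var` (Java 10), and `String.isBlank` (Java 11). The class and method names are hypothetical and are not taken from this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of Java 8-compatible replacements for newer APIs.
class Java8CompatSketch {

    // Java 9+ offers List.of(...); on Java 8, copy the array and wrap the
    // copy in an unmodifiable view instead.
    static List<String> immutableInputs(String... texts) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(texts)));
    }

    // Java 11+ offers text.isBlank(); on Java 8, trim() plus isEmpty() is a
    // close approximation (trim() only strips characters up to U+0020).
    static boolean isBlank(String text) {
        return text == null || text.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Explicit local variable types instead of `var` (Java 10+).
        List<String> inputs = immutableInputs("first text", "second text");
        System.out.println(inputs.size() + " inputs, blank check: " + isBlank("   "));
    }
}
```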

<version>0.30.0</version>
</parent>

<artifactId>langchain4j-jina-ai</artifactId>
Owner

nitpick:

Suggested change
- <artifactId>langchain4j-jina-ai</artifactId>
+ <artifactId>langchain4j-jina</artifactId>

Please rename it everywhere, including package name jinaAi -> jina

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-core</artifactId>
<version>0.30.0</version>
Owner

No need to specify the version explicitly; it will be resolved from the parent.

<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-core</artifactId>
<version>0.30.0</version>
<scope>compile</scope>
Owner

minor: no need to specify the default scope (same for all deps below)

<dependency>
<groupId>com.squareup.retrofit2</groupId>
<artifactId>retrofit</artifactId>
<version>2.9.0</version>
Owner

same here

<dependency>
<groupId>com.squareup.retrofit2</groupId>
<artifactId>converter-gson</artifactId>
<version>2.9.0</version>
Owner

same here

</dependency>
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>testcontainers</artifactId>
Owner

Is this dep required?

import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.store.embedding.CosineSimilarity;
import org.junit.Test;
Owner

Please use JUnit 5.

import static dev.langchain4j.internal.ValidationUtils.ensureNotBlank;

public class JinaAiClient {
private static final Gson GSON = new GsonBuilder()
Owner

optional: it would be great to use Jackson instead, as we are starting to get rid of Gson. More details: #1043

Owner

optional: it would be great to follow the same package structure as in #1043

@lucifer-Hell
Author

Thanks for reviewing, @langchain4j. I will make the changes and re-request a review.
