Support Azure OpenAI and AI Search #8

Merged
merged 53 commits into from
May 2, 2024
Changes from 50 commits
d5d21bf
Support Azure OpenAI
Feb 18, 2024
1d04edb
Merge branch 'langchain4j:main' into main
showpune Feb 20, 2024
bba8850
Update langchain4j-azure-open-ai-spring-boot-starter/pom.xml
showpune Mar 15, 2024
2b81671
Update langchain4j-azure-open-ai-spring-boot-starter/src/test/java/de…
showpune Mar 15, 2024
9d01db0
Update langchain4j-azure-open-ai-spring-boot-starter/src/test/java/de…
showpune Mar 15, 2024
2f0053a
Update langchain4j-azure-open-ai-spring-boot-starter/src/test/java/de…
showpune Mar 15, 2024
e0de25c
remove the azure core lib
Mar 15, 2024
a423557
Add Azure AI Search support
Mar 15, 2024
436f505
Format the code
Mar 15, 2024
546b0f1
use latest ai search
Mar 20, 2024
8e3be8c
decouple as different resource
Mar 22, 2024
48dcf22
decouple as different resource
Mar 26, 2024
5e5fa49
Merge branch 'langchain4j:main' into main
showpune Mar 26, 2024
26da799
Merge remote-tracking branch 'origin/azure-resource'
Mar 26, 2024
ed5be1b
Revert "bumped to 0.29.0-SNAPSHOT"
Mar 26, 2024
165d27e
rollback the distributionManagement
Mar 26, 2024
bbe199d
Reapply "bumped to 0.29.0-SNAPSHOT"
Mar 26, 2024
f8c6eae
Add Default value support
Mar 26, 2024
99581d0
Rollback the other components
Mar 26, 2024
8552bb7
Support Azure OpenAI
Feb 18, 2024
7f288ea
Update langchain4j-azure-open-ai-spring-boot-starter/pom.xml
showpune Mar 15, 2024
501aadf
Update langchain4j-azure-open-ai-spring-boot-starter/src/test/java/de…
showpune Mar 15, 2024
141634e
Update langchain4j-azure-open-ai-spring-boot-starter/src/test/java/de…
showpune Mar 15, 2024
46d64e8
Update langchain4j-azure-open-ai-spring-boot-starter/src/test/java/de…
showpune Mar 15, 2024
877cf22
remove the azure core lib
Mar 15, 2024
615f561
Add Azure AI Search support
Mar 15, 2024
3177711
Format the code
Mar 15, 2024
f8dc867
use latest ai search
Mar 20, 2024
615a2f9
decouple as different resource
Mar 22, 2024
4078e31
decouple as different resource
Mar 26, 2024
16864e1
Revert "bumped to 0.29.0-SNAPSHOT"
Mar 26, 2024
6296d25
rollback the distributionManagement
Mar 26, 2024
ec6edbc
Reapply "bumped to 0.29.0-SNAPSHOT"
Mar 26, 2024
bf227f2
Add Default value support
Mar 26, 2024
9c475aa
Rollback the other components
Mar 26, 2024
6ce6b67
Merge remote-tracking branch 'origin/main'
Mar 26, 2024
7b9364e
Rollback the other components
Mar 26, 2024
de2978c
Rollback the other components
Mar 26, 2024
118a429
Add Non Azure Support
Mar 27, 2024
43fc526
NonAzureKey not depends on deployment
Mar 28, 2024
f0c5dea
Update langchain4j-azure-aisearch-spring-boot-starter/src/main/java/d…
showpune Mar 29, 2024
0ad06f0
Update langchain4j-azure-openai-spring-boot-starter/pom.xml
showpune Mar 29, 2024
a62be40
Update pom.xml
showpune Mar 29, 2024
08cb971
Update pom.xml
showpune Mar 29, 2024
0e2b24d
Update langchain4j-azure-aisearch-spring-boot-starter/pom.xml
showpune Mar 29, 2024
7dc5bfd
Update langchain4j-azure-aisearch-spring-boot-starter/pom.xml
showpune Mar 29, 2024
019b56d
Update langchain4j-azure-aisearch-spring-boot-starter/pom.xml
showpune Mar 29, 2024
a72d00a
Fix the review problems
Mar 29, 2024
f7873ca
Add function test for retriever and store
Apr 15, 2024
13ab9bb
Update the test case
Apr 16, 2024
cb11013
cosmetics
langchain4j May 2, 2024
79f38be
fixed "Duration.ofSeconds(0) by default"
langchain4j May 2, 2024
489aa41
fixed versions
langchain4j May 2, 2024
75 changes: 75 additions & 0 deletions langchain4j-azure-ai-search-spring-boot-starter/pom.xml
@@ -0,0 +1,75 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://maven.apache.org/POM/4.0.0"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-spring</artifactId>
<version>0.29.0</version>
<relativePath>../pom.xml</relativePath>
</parent>

<artifactId>langchain4j-azure-ai-search-spring-boot-starter</artifactId>
<name>LangChain4j Spring Boot starter for Azure AI Search</name>
<packaging>jar</packaging>

<dependencies>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-azure-ai-search</artifactId>
<version>${project.version}</version>
</dependency>

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-autoconfigure-processor</artifactId>
<optional>true</optional>
</dependency>

<!-- should be listed before spring-boot-configuration-processor -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<scope>provided</scope>
</dependency>

<!-- needed to generate automatic metadata about available config properties -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-configuration-processor</artifactId>
<optional>true</optional>
</dependency>

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-embeddings-all-minilm-l6-v2</artifactId>
<version>0.29.0</version>
<scope>test</scope>
</dependency>

</dependencies>

<licenses>
<license>
<name>Apache-2.0</name>
<url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
<distribution>repo</distribution>
<comments>A business-friendly OSS license</comments>
</license>
</licenses>

</project>
@@ -0,0 +1,47 @@
package dev.langchain4j.azure.aisearch.spring;

import com.azure.search.documents.indexes.models.SearchIndex;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.azure.search.AzureAiSearchContentRetriever;
import dev.langchain4j.store.embedding.azure.search.AzureAiSearchEmbeddingStore;
import org.springframework.boot.autoconfigure.AutoConfiguration;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.lang.Nullable;

import static dev.langchain4j.azure.aisearch.spring.Properties.PREFIX;

@AutoConfiguration
@EnableConfigurationProperties(Properties.class)
public class AutoConfig {
@Bean
@ConditionalOnProperty(PREFIX + ".content-retriever.api-key")
public AzureAiSearchContentRetriever azureAiSearchContentRetriever(Properties properties, @Nullable EmbeddingModel embeddingModel, @Nullable SearchIndex index) {
Properties.NestedProperties nestedProperties = properties.getContentRetriever();
return AzureAiSearchContentRetriever.builder()
.endpoint(nestedProperties.getEndpoint())
.apiKey(nestedProperties.getApiKey())
.createOrUpdateIndex(nestedProperties.getCreateOrUpdateIndex())
.embeddingModel(embeddingModel)
.dimensions(nestedProperties.getDimensions() == null ? 0 : nestedProperties.getDimensions())
.index(index)
.maxResults(nestedProperties.getMaxResults())
.minScore(nestedProperties.getMinScore() == null ? 0.0 : nestedProperties.getMinScore())
.queryType(nestedProperties.getQueryType())
.build();
}

@Bean
@ConditionalOnProperty(PREFIX + ".embedding-store.api-key")
public AzureAiSearchEmbeddingStore azureAiSearchEmbeddingStore(Properties properties, @Nullable EmbeddingModel embeddingModel, @Nullable SearchIndex index) {
Properties.NestedProperties nestedProperties = properties.getEmbeddingStore();
return AzureAiSearchEmbeddingStore.builder()
.endpoint(nestedProperties.getEndpoint())
.apiKey(nestedProperties.getApiKey())
.createOrUpdateIndex(nestedProperties.getCreateOrUpdateIndex())
.dimensions(nestedProperties.getDimensions() == null ? 0 : nestedProperties.getDimensions())
.index(index)
.build();
}
}
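The auto-configuration above passes boxed property values (`Integer`, `Double`, `Boolean`) to the builders and guards some of them with inline `== null ? default : value` checks. The following is a minimal sketch of how that defaulting could be centralized. `PropertyDefaults` and `orDefault` are hypothetical names that are not part of this PR, and the fallback values shown (in particular `false` for `createOrUpdateIndex`) are assumptions for illustration only.

```java
import java.util.Objects;

// Hypothetical helper, not part of the starter: resolves nullable, boxed
// configuration values to explicit defaults before they reach a builder.
final class PropertyDefaults {

    private PropertyDefaults() {
    }

    // Returns value when it is set, otherwise the supplied (non-null) fallback.
    static <T> T orDefault(T value, T fallback) {
        return value != null ? value : Objects.requireNonNull(fallback, "fallback");
    }

    public static void main(String[] args) {
        Integer dimensions = null;          // property left unset
        Double minScore = 0.6;              // property set explicitly
        Boolean createOrUpdateIndex = null; // property left unset

        // Auto-unboxing is safe here because orDefault never returns null.
        int resolvedDimensions = orDefault(dimensions, 0);
        double resolvedMinScore = orDefault(minScore, 0.0);
        boolean resolvedCreate = orDefault(createOrUpdateIndex, false); // assumed default

        System.out.println(resolvedDimensions + " " + resolvedMinScore + " " + resolvedCreate);
    }
}
```

Resolving defaults in one place would keep the two bean methods consistent with each other, at the cost of one more class in the starter.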
@@ -0,0 +1,34 @@
package dev.langchain4j.azure.aisearch.spring;

import dev.langchain4j.rag.content.retriever.azure.search.AzureAiSearchQueryType;
import lombok.Getter;
import lombok.Setter;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.context.properties.NestedConfigurationProperty;

@Getter
@Setter
@ConfigurationProperties(prefix = Properties.PREFIX)
public class Properties {

static final String PREFIX = "langchain4j.azure.ai-search";

@NestedConfigurationProperty
NestedProperties contentRetriever;

@NestedConfigurationProperty
NestedProperties embeddingStore;

@Getter
@Setter
public static class NestedProperties {
String endpoint;
String apiKey;
Integer dimensions;
Boolean createOrUpdateIndex;
String indexName;
Integer maxResults = 3;
Double minScore;
AzureAiSearchQueryType queryType;
}
}
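The tests in this PR set keys such as `langchain4j.azure.ai-search.content-retriever.create-or-update-index`, which end up in the camelCase fields of `Properties.NestedProperties`. The sketch below illustrates that naming correspondence only. It is a simplified stand-in, not Spring Boot's actual binding logic, and `KebabCase` and `toCamelCase` are hypothetical names.

```java
// Simplified illustration only: Spring Boot's relaxed binding is more general
// than this (it also accepts other naming styles for the same property).
final class KebabCase {

    private KebabCase() {
    }

    // Converts a kebab-case key segment such as "create-or-update-index"
    // to the camelCase field name "createOrUpdateIndex".
    static String toCamelCase(String kebab) {
        StringBuilder out = new StringBuilder(kebab.length());
        boolean upperNext = false;
        for (char c : kebab.toCharArray()) {
            if (c == '-') {
                upperNext = true;            // drop the dash, capitalize what follows
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] keys = {"api-key", "create-or-update-index", "max-results", "min-score", "query-type"};
        for (String key : keys) {
            System.out.println(key + " -> " + toCamelCase(key));
        }
    }
}
```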
@@ -0,0 +1,2 @@
org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
dev.langchain4j.azure.aisearch.spring.AutoConfig
@@ -0,0 +1 @@
dev.langchain4j.azure.aisearch.spring.AutoConfig
@@ -0,0 +1,215 @@
package dev.langchain4j.azure.aisearch.spring;

import com.azure.core.credential.AzureKeyCredential;
import com.azure.search.documents.indexes.SearchIndexClient;
import com.azure.search.documents.indexes.SearchIndexClientBuilder;
import com.azure.search.documents.indexes.models.SearchIndex;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.Content;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.azure.search.AzureAiSearchContentRetriever;
import dev.langchain4j.rag.content.retriever.azure.search.AzureAiSearchQueryType;
import dev.langchain4j.rag.query.Query;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.azure.search.AzureAiSearchEmbeddingStore;
import org.junit.jupiter.api.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.autoconfigure.AutoConfigurations;
import org.springframework.boot.test.context.runner.ApplicationContextRunner;

import java.util.List;

import static dev.langchain4j.store.embedding.azure.search.AbstractAzureAiSearchEmbeddingStore.INDEX_NAME;
import static java.util.Arrays.asList;
import static org.assertj.core.api.Assertions.assertThat;

class AutoConfigIT {

private static final String AZURE_SEARCH_KEY = System.getenv("AZURE_SEARCH_KEY");
private static final String AZURE_SEARCH_ENDPOINT = System.getenv("AZURE_SEARCH_ENDPOINT");
ApplicationContextRunner contextRunner = new ApplicationContextRunner()
.withConfiguration(AutoConfigurations.of(AutoConfig.class));

private static final Logger log = LoggerFactory.getLogger(AutoConfigIT.class);

private final EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();
private final int dimensions = embeddingModel.embed("test").content().vector().length;

private final SearchIndexClient searchIndexClient = new SearchIndexClientBuilder()
.endpoint(AZURE_SEARCH_ENDPOINT)
.credential(new AzureKeyCredential(AZURE_SEARCH_KEY))
.buildClient();

private SearchIndex index = new SearchIndex(INDEX_NAME);

@Test
void should_provide_ai_search_retriever() {

searchIndexClient.deleteIndex(INDEX_NAME);

contextRunner
.withPropertyValues(
Properties.PREFIX + ".content-retriever.api-key=" + AZURE_SEARCH_KEY,
Properties.PREFIX + ".content-retriever.endpoint=" + AZURE_SEARCH_ENDPOINT,
Properties.PREFIX + ".content-retriever.dimensions=" + dimensions,
Properties.PREFIX + ".content-retriever.create-or-update-index=" + "true",
Properties.PREFIX + ".content-retriever.query-type=" + "VECTOR"
).withBean(EmbeddingModel.class, () -> embeddingModel)
.run(context -> {
ContentRetriever contentRetriever = context.getBean(ContentRetriever.class);
assertThat(contentRetriever).isInstanceOf(AzureAiSearchContentRetriever.class);
AzureAiSearchContentRetriever azureAiSearchContentRetriever = (AzureAiSearchContentRetriever) contentRetriever;

String content1 = "This book is about politics";
String content2 = "Cats sleep a lot.";
String content3 = "Sandwiches taste good.";
String content4 = "The house is open";
List<String> contents = asList(content1, content2, content3, content4);

for (String content : contents) {
TextSegment textSegment = TextSegment.from(content);
Embedding embedding = embeddingModel.embed(content).content();
azureAiSearchContentRetriever.add(embedding, textSegment);
}

awaitUntilPersisted();
});

String content = "house";
Query query = Query.from(content);

contextRunner
.withPropertyValues(
Properties.PREFIX + ".content-retriever.api-key=" + AZURE_SEARCH_KEY,
Properties.PREFIX + ".content-retriever.endpoint=" + AZURE_SEARCH_ENDPOINT,
Properties.PREFIX + ".content-retriever.create-or-update-index=" + "false",
Properties.PREFIX + ".content-retriever.max-results=" + "3",
Properties.PREFIX + ".content-retriever.min-score=" + "0.6",
Properties.PREFIX + ".content-retriever.query-type=" + AzureAiSearchQueryType.VECTOR
).withBean(SearchIndex.class, () -> index)
.withBean(EmbeddingModel.class, () -> embeddingModel)
.run(context -> {
ContentRetriever contentRetriever = context.getBean(ContentRetriever.class);
assertThat(contentRetriever).isInstanceOf(AzureAiSearchContentRetriever.class);
AzureAiSearchContentRetriever contentRetrieverWithVector = (AzureAiSearchContentRetriever) contentRetriever;
log.info("Testing Vector Search");
List<Content> relevant = contentRetrieverWithVector.retrieve(query);
assertThat(relevant).hasSizeGreaterThan(0);
assertThat(relevant.get(0).textSegment().text()).isEqualTo("The house is open");
log.info("#1 relevant item: {}", relevant.get(0).textSegment().text());
});

contextRunner
.withPropertyValues(
Properties.PREFIX + ".content-retriever.api-key=" + AZURE_SEARCH_KEY,
Properties.PREFIX + ".content-retriever.endpoint=" + AZURE_SEARCH_ENDPOINT,
Properties.PREFIX + ".content-retriever.create-or-update-index=" + "false",
Properties.PREFIX + ".content-retriever.query-type=" + AzureAiSearchQueryType.FULL_TEXT
)
.run(context -> {
ContentRetriever contentRetriever = context.getBean(ContentRetriever.class);
assertThat(contentRetriever).isInstanceOf(AzureAiSearchContentRetriever.class);
AzureAiSearchContentRetriever contentRetrieverWithFullText = (AzureAiSearchContentRetriever) contentRetriever;
log.info("Testing Full Text Search");
// This uses the same storage as the vector search, so we don't need to add the content again
List<Content> relevant2 = contentRetrieverWithFullText.retrieve(query);
assertThat(relevant2).hasSizeGreaterThan(0);
assertThat(relevant2.get(0).textSegment().text()).isEqualTo("The house is open");
log.info("#1 relevant item: {}", relevant2.get(0).textSegment().text());
});

contextRunner
.withPropertyValues(
Properties.PREFIX + ".content-retriever.api-key=" + AZURE_SEARCH_KEY,
Properties.PREFIX + ".content-retriever.endpoint=" + AZURE_SEARCH_ENDPOINT,
Properties.PREFIX + ".content-retriever.create-or-update-index=" + "false",
Properties.PREFIX + ".content-retriever.query-type=" + AzureAiSearchQueryType.HYBRID
).withBean(SearchIndex.class, () -> index)
.withBean(EmbeddingModel.class, () -> embeddingModel)
.run(context -> {
ContentRetriever contentRetriever = context.getBean(ContentRetriever.class);
assertThat(contentRetriever).isInstanceOf(AzureAiSearchContentRetriever.class);
AzureAiSearchContentRetriever contentRetrieverWithHybrid = (AzureAiSearchContentRetriever) contentRetriever;
log.info("Testing Hybrid Search");
List<Content> relevant3 = contentRetrieverWithHybrid.retrieve(query);
assertThat(relevant3).hasSizeGreaterThan(0);
assertThat(relevant3.get(0).textSegment().text()).isEqualTo("The house is open");
log.info("#1 relevant item: {}", relevant3.get(0).textSegment().text());
});

contextRunner
.withPropertyValues(
Properties.PREFIX + ".content-retriever.api-key=" + AZURE_SEARCH_KEY,
Properties.PREFIX + ".content-retriever.endpoint=" + AZURE_SEARCH_ENDPOINT,
Properties.PREFIX + ".content-retriever.create-or-update-index=" + "false",
Properties.PREFIX + ".content-retriever.max-results=" + "3",
Properties.PREFIX + ".content-retriever.min-score=" + "0.4",
Properties.PREFIX + ".content-retriever.query-type=" + AzureAiSearchQueryType.HYBRID_WITH_RERANKING
).withBean(SearchIndex.class, () -> index)
.withBean(EmbeddingModel.class, () -> embeddingModel)
.run(context -> {
ContentRetriever contentRetriever = context.getBean(ContentRetriever.class);
assertThat(contentRetriever).isInstanceOf(AzureAiSearchContentRetriever.class);
AzureAiSearchContentRetriever contentRetrieverWithHybridAndReranking = (AzureAiSearchContentRetriever) contentRetriever;
log.info("Testing Hybrid Search with Reranking");
List<Content> relevant4 = contentRetrieverWithHybridAndReranking.retrieve(query);
assertThat(relevant4).hasSizeGreaterThan(0);
assertThat(relevant4.get(0).textSegment().text()).isEqualTo("The house is open");
log.info("#1 relevant item: {}", relevant4.get(0).textSegment().text());
});
}

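protected void awaitUntilPersisted() {
try {
// Give Azure AI Search time to index the newly added documents.
Thread.sleep(1_000);
} catch (InterruptedException e) {
// Restore the interrupt flag before failing the test.
Thread.currentThread().interrupt();
throw new RuntimeException(e);
}
}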


@Test
void should_provide_ai_search_embedding_store() {

searchIndexClient.deleteIndex(INDEX_NAME);

contextRunner
.withPropertyValues(
Properties.PREFIX + ".embedding-store.api-key=" + AZURE_SEARCH_KEY,
Properties.PREFIX + ".embedding-store.endpoint=" + AZURE_SEARCH_ENDPOINT,
Properties.PREFIX + ".embedding-store.dimensions=" + 384,
Properties.PREFIX + ".embedding-store.create-or-update-index=" + "true"
).withBean(EmbeddingModel.class, () -> embeddingModel)
.run(context -> {
EmbeddingStore embeddingStore = context.getBean(EmbeddingStore.class);
assertThat(embeddingStore).isInstanceOf(AzureAiSearchEmbeddingStore.class);
assertThat(context.getBean(AzureAiSearchEmbeddingStore.class)).isSameAs(embeddingStore);


String content1 = "banana";
String content2 = "computer";
String content3 = "apple";
String content4 = "pizza";
String content5 = "strawberry";
String content6 = "chess";
List<String> contents = asList(content1, content2, content3, content4, content5, content6);

for (String content : contents) {
TextSegment textSegment = TextSegment.from(content);
Embedding embedding = embeddingModel.embed(content).content();
embeddingStore.add(embedding, textSegment);
}
Embedding relevantEmbedding = embeddingModel.embed("fruit").content();
List<EmbeddingMatch<TextSegment>> relevant = embeddingStore.findRelevant(relevantEmbedding, 3);
assertThat(relevant).hasSize(3);
assertThat(relevant.get(0).embedding()).isNotNull();
assertThat(relevant.get(0).embedded().text()).isIn(content1, content3, content5);
});
}

}
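`awaitUntilPersisted()` in the test above waits a fixed one second before querying. One possible alternative is to poll a condition until it holds or a timeout elapses. This is only a sketch under stated assumptions: `Await` and `until` are hypothetical names, and the condition used in `main` is a stand-in. In the integration test it might instead check that a query returns the expected documents.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper, not part of this PR.
final class Await {

    private Await() {
    }

    // Polls the condition until it returns true or the timeout elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean until(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in condition: becomes true after roughly 50 ms.
        boolean met = until(
                () -> System.nanoTime() - start > Duration.ofMillis(50).toNanos(),
                Duration.ofSeconds(2),
                Duration.ofMillis(10));
        System.out.println("condition met: " + met);
    }
}
```

Polling returns as soon as the data is visible and fails with a clear signal when it never appears, whereas a fixed sleep is either too long or too short.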