WeaviateHybridSearchRetriever isn't working with Weaviate client v4 #21147

Closed
5 tasks done
elieobeid7 opened this issue May 1, 2024 · 4 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: retriever Related to retriever module 🔌: weaviate Primarily related to Weaviate vector store integration

elieobeid7 commented May 1, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import weaviate

from langchain_community.retrievers import (
    WeaviateHybridSearchRetriever,
)
from langchain_core.documents import Document

from config import OPENAI_API_KEY, WEAVIATE_HOST, WEAVIATE_PORT

headers = {
    "X-Openai-Api-Key": OPENAI_API_KEY,
}

client = weaviate.connect_to_local(headers=headers)


retriever = WeaviateHybridSearchRetriever(
    client=client,
    index_name="LangChain",
    text_key="text",
    attributes=[],
    create_schema_if_missing=True,
)
docs = [
    Document(
        metadata={
            "title": "Embracing The Future: AI Unveiled",
            "author": "Dr. Rebecca Simmons",
        },
        page_content="A comprehensive analysis of the evolution of artificial intelligence, from its inception to its future prospects. Dr. Simmons covers ethical considerations, potentials, and threats posed by AI.",
    )
]

retriever.add_documents(docs)

answer = retriever.invoke("the ethical implications of AI")
print(answer)

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "main.py", line 17, in <module>
    retriever = WeaviateHybridSearchRetriever(
  File "venv\lib\site-packages\langchain_core\load\serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "venv\lib\site-packages\pydantic\v1\main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for WeaviateHybridSearchRetriever
__root__
  client should be an instance of weaviate.Client, got <class 'weaviate.client.WeaviateClient'> (type=value_error)
sys:1: ResourceWarning: unclosed <socket.socket fd=880, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=0, laddr=('::1', 64509, 0, 0), raddr=('::1', 8080, 0, 0)>
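
Judging from the error message, the retriever's validator appears to do a plain isinstance check against the v3 weaviate.Client class, which the v4 WeaviateClient does not inherit from. The sketch below reproduces that behaviour with stand-in classes; LegacyClient, V4Client and validate_client are hypothetical names used only for illustration, not the actual LangChain or Weaviate code.

```python
class LegacyClient:
    """Stand-in for the v3 weaviate.Client (hypothetical, illustration only)."""


class V4Client:
    """Stand-in for the v4 weaviate.client.WeaviateClient (hypothetical)."""


def validate_client(client):
    """Accept only the legacy client type, mirroring the error seen above."""
    if not isinstance(client, LegacyClient):
        raise ValueError(
            f"client should be an instance of weaviate.Client, got {type(client)}"
        )
    return client


if __name__ == "__main__":
    validate_client(LegacyClient())  # passes: matches the expected type
    try:
        validate_client(V4Client())  # fails: unrelated class, as with the v4 client
    except ValueError as exc:
        print(exc)
```

Because the two client classes are unrelated, no constructor argument to the retriever can make a v4 client pass this kind of check.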

Description

I'm on Windows 11 and trying to use WeaviateHybridSearchRetriever with the Weaviate client v4, since v3 is deprecated.

System Info

aiohttp==3.9.5
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.3.0
async-timeout==4.0.3
attrs==23.2.0
Authlib==1.3.0
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
cryptography==42.0.5
dataclasses-json==0.6.5
exceptiongroup==1.2.1
frozenlist==1.4.1
greenlet==3.0.3
grpcio==1.63.0
grpcio-health-checking==1.63.0
grpcio-tools==1.63.0
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
idna==3.7
jsonpatch==1.33
jsonpointer==2.4
langchain==0.1.17
langchain-community==0.0.36
langchain-core==0.1.48
langchain-text-splitters==0.0.1
langsmith==0.1.52
marshmallow==3.21.1
multidict==6.0.5
mypy-extensions==1.0.0
numpy==1.26.4
orjson==3.10.2
packaging==23.2
protobuf==5.26.1
pycparser==2.22
pydantic==2.7.1
pydantic_core==2.18.2
PyYAML==6.0.1
requests==2.31.0
sniffio==1.3.1
SQLAlchemy==2.0.29
tenacity==8.2.3
typing-inspect==0.9.0
typing_extensions==4.11.0
urllib3==2.2.1
validators==0.28.1
weaviate-client==4.5.7
yarl==1.9.4
@Sachin-Bhat

Hey @elieobeid7, maybe you should try the Python package langchain-weaviate. It may help solve the issue.

@elieobeid7
Author

@Sachin-Bhat I just inspected the source code of that package https://github.com/langchain-ai/langchain-weaviate/tree/main/libs/weaviate

  • There's no mention of hybrid search; if you search for the keyword "hybrid" you find nothing.
  • There's no documentation whatsoever; it links to the official LangChain and Weaviate docs, where the weaviate package is used instead.

So I can't use it. I'd rather stick with the Weaviate v3 client and follow the official docs than waste time trying to understand how it works. In any case, as I said, it doesn't even have hybrid search.

@StreetLamb
Contributor

Hi @elieobeid7, I took a look at the source code and there's a reference to hybrid search here. The LangChain docs also state that similarity_search uses Weaviate hybrid search, as can be seen here. Hope this helps.
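
A minimal sketch of that route, assuming langchain-weaviate exposes WeaviateVectorStore with from_documents and similarity_search as its documentation describes. The helper names build_store and hybrid_query are hypothetical, the embeddings object is whatever embedding model you already use, and a Weaviate instance is assumed to be running locally.

```python
from typing import Any, List, Tuple


def build_store(docs: List[Any], embeddings: Any) -> Tuple[Any, Any]:
    """Connect with the v4 client and index docs into a Weaviate vector store.

    Imports are kept local so this module loads even when the
    packages are not installed.
    """
    import weaviate  # weaviate-client v4
    from langchain_weaviate.vectorstores import WeaviateVectorStore

    client = weaviate.connect_to_local()
    store = WeaviateVectorStore.from_documents(docs, embeddings, client=client)
    # Return the client as well so the caller can close it when done.
    return client, store


def hybrid_query(store: Any, query: str, k: int = 4) -> List[Any]:
    """Run a query through similarity_search, the hybrid entry point."""
    return store.similarity_search(query, k=k)
```

Typical use would be client, store = build_store(docs, embeddings), then hybrid_query(store, "the ethical implications of AI"), then client.close(), which also avoids the unclosed-socket ResourceWarning seen in the original traceback.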

@hsm207
Contributor

hsm207 commented May 7, 2024

Hi @elieobeid7,

I'm the maintainer of the langchain-weaviate integration and can confirm what @StreetLamb said.

Hybrid search is supported in v4, just not through the WeaviateHybridSearchRetriever class.

It has been consolidated into the similarity_search function. By default, it weights BM25 and vector search 50:50. Users can pass the alpha argument to change the balance: 0 means pure BM25 search, and 1 means pure vector search.
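
To make the role of alpha concrete, here is a small illustration of a weighted blend of a keyword score and a vector score. It is only a sketch: the blend function and the sample scores are hypothetical, both scores are assumed to be already normalized to the 0..1 range, and Weaviate's real fusion algorithms are more involved than a single weighted sum.

```python
def blend(bm25_score: float, vector_score: float, alpha: float = 0.5) -> float:
    """Weighted mix of two normalized scores.

    alpha = 0 keeps only the BM25 score, alpha = 1 keeps only the vector score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * bm25_score


if __name__ == "__main__":
    # Hypothetical scores for one document: weak keyword match, strong vector match.
    bm25, vector = 0.2, 0.8
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, blend(bm25, vector, alpha))
```

With langchain-weaviate, the corresponding call would look like store.similarity_search(query, alpha=0.25), where store is a vector store built on the v4 client.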

@efriis efriis closed this as completed May 7, 2024