RAG faiss AssertionError #1239

Closed
AprilCat opened this issue Apr 30, 2024 · 2 comments

@AprilCat

Bug description

Running this demo:

import asyncio

from metagpt.rag.engines import SimpleEngine
from metagpt.rag.schema import FAISSRetrieverConfig
from metagpt.const import EXAMPLE_DATA_PATH

DOC_PATH = EXAMPLE_DATA_PATH / "rag/travel.txt"

async def main():
    engine = SimpleEngine.from_docs(input_files=[DOC_PATH], retriever_configs=[FAISSRetrieverConfig()])

    answer = await engine.aquery("What does Bob like?")
    print(answer)

if __name__ == "__main__":
    asyncio.run(main())

produces this error:

Traceback (most recent call last):
  File "/home/wanfu/projects/llm/multi_agent_rag/src/simple_custom_object.py", line 26, in <module>
    asyncio.run(main())
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/wanfu/projects/llm/multi_agent_rag/src/simple_custom_object.py", line 21, in main
    engine.add_docs([DOC_PATH])
  File "/mnt/data/work/development/projects/llm/MetaGPT/metagpt/rag/engines/simple.py", line 195, in add_docs
    self._save_nodes(nodes)
  File "/mnt/data/work/development/projects/llm/MetaGPT/metagpt/rag/engines/simple.py", line 274, in _save_nodes
    self.retriever.add_nodes(nodes)
  File "/mnt/data/work/development/projects/llm/MetaGPT/metagpt/rag/retrievers/faiss_retriever.py", line 12, in add_nodes
    self._index.insert_nodes(nodes, **kwargs)
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/site-packages/llama_index/core/indices/vector_store/base.py", line 320, in insert_nodes
    self._insert(nodes, **insert_kwargs)
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/site-packages/llama_index/core/indices/vector_store/base.py", line 311, in _insert
    self._add_nodes_to_index(self._index_struct, nodes, **insert_kwargs)
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/site-packages/llama_index/core/indices/vector_store/base.py", line 233, in _add_nodes_to_index
    new_ids = self._vector_store.add(nodes_batch, **insert_kwargs)
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/site-packages/llama_index/vector_stores/faiss/base.py", line 121, in add
    self._faiss_index.add(text_embedding_np)
  File "/home/wanfu/data/miniconda3/envs/metagpt/lib/python3.9/site-packages/faiss/__init__.py", line 214, in replacement_add
    assert d == self.d
AssertionError

Bug solved method

Environment information

  • LLM type and model name: zhipuai
  • Embeddings: fastchat, BAAI/bge-large-zh
  • System version: Ubuntu 22.04
  • Python version: 3.9.19
  • MetaGPT version or branch:
  • packages version:
  • installation method: pip install from source

Screenshots or logs

@usamimeri
Contributor

The assertion fails because d != self.d: the dimension of the vectors produced by your embedding model (1024 in your case) does not match the dimension the FAISS index was created with.
If your embedding model isn't from ollama or gemini, the configured dimension defaults to 1536, which is the dimension of the OpenAI embedding.
https://github.com/geekan/MetaGPT/blob/main/metagpt/rag/schema.py#L34-L49
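
To illustrate the kind of check that fails here, below is a minimal sketch in plain Python. `FixedDimIndex` and `try_add` are hypothetical names, not the faiss API; the class only mimics the `assert d == self.d` check from the traceback. The values 1536 and 1024 are the default and the model dimension mentioned above.

```python
class FixedDimIndex:
    """Hypothetical stand-in for a vector index with a fixed dimension."""

    def __init__(self, d: int) -> None:
        self.d = d  # dimension is fixed when the index is created
        self.vectors = []

    def add(self, batch: list) -> None:
        for vec in batch:
            # Same kind of check as `assert d == self.d` in the traceback.
            assert len(vec) == self.d, f"got {len(vec)}, index expects {self.d}"
            self.vectors.append(vec)


def try_add(index_dim: int, embedding_dim: int) -> bool:
    """Return True if one vector of embedding_dim fits an index of index_dim."""
    index = FixedDimIndex(index_dim)
    try:
        index.add([[0.0] * embedding_dim])
        return True
    except AssertionError:
        return False


print(try_add(1536, 1024))  # False: default index size vs. 1024-dim vectors
print(try_add(1024, 1024))  # True: dimensions agree
```

The point of the sketch is only that the index dimension is decided before any vectors arrive, so it has to be configured to match the embedding model.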

@usamimeri
Contributor

usamimeri commented May 1, 2024

You can check the similar issue #1213.

To fix it, change engine = SimpleEngine.from_docs(input_files=[DOC_PATH], retriever_configs=[FAISSRetrieverConfig()]) to engine = SimpleEngine.from_docs(input_files=[DOC_PATH], retriever_configs=[FAISSRetrieverConfig(dimensions=1024)]) so that the index dimension matches your embedding model.
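
If the dimension of an embedding model is not known up front, one option is to embed a short probe string and measure the length of the result. This is only a sketch: `probe_dimension` and `fake_embed` are hypothetical names, and `fake_embed` is a placeholder for whatever function your embedding client exposes to embed a single string.

```python
from typing import Callable, Sequence


def probe_dimension(
    embed: Callable[[str], Sequence[float]],
    probe: str = "dimension probe",
) -> int:
    """Return the length of the vector that `embed` produces for a probe string."""
    vector = embed(probe)
    if len(vector) == 0:
        raise ValueError("embedding function returned an empty vector")
    return len(vector)


def fake_embed(text: str) -> list:
    # Placeholder (assumption): pretends to be a 1024-dimensional model.
    return [0.0] * 1024


dim = probe_dimension(fake_embed)
print(dim)
```

The resulting value could then be passed as FAISSRetrieverConfig(dimensions=dim) instead of hard-coding 1024.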
