[Question]: Memory / sizing issues with KnowledgeGraphIndex and embedding #13104
Comments
To address the memory and sizing issues with the KnowledgeGraphIndex when `include_embeddings=True`, a few strategies can help. Regarding the serialization of the `index_struct`, the `embedding_dict` is currently written out as one large JSON object, so splitting it into smaller records avoids backend size limits. For reducing the precision of embeddings from 64-bit Python floats to float32 or float16, numpy arrays shrink the in-memory footprint considerably. The LlamaIndex codebase also includes mechanisms for lazy loading of embeddings, allowing embeddings to be loaded on demand; this provides a flexible and extensible mechanism for managing various embedding types within the framework. By implementing these strategies, you can better manage the memory and sizing issues associated with storing and processing large embedding dictionaries in a KnowledgeGraphIndex, especially when backed by a PostgreSQL database.
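As a rough, standalone illustration of the precision-reduction point (plain Python and numpy, not LlamaIndex API), the gap between a `List[float]` and a float16 array is easy to measure:

```python
import sys

import numpy as np

# Distinct float objects, as in a freshly parsed embedding.
embedding = [float(i) for i in range(1536)]  # ada-002 dimension

# Rough cost of the List[float] representation: the list's pointer
# array plus one ~24-byte float object per element (~32 bytes per
# value, matching the estimate in the question below).
list_bytes = sys.getsizeof(embedding) + sum(sys.getsizeof(x) for x in embedding)

f32 = np.asarray(embedding, dtype=np.float32)  # 4 bytes per element
f16 = np.asarray(embedding, dtype=np.float16)  # 2 bytes per element

print(f"List[float]: ~{list_bytes:,} bytes")
print(f"float32:      {f32.nbytes:,} bytes")
print(f"float16:      {f16.nbytes:,} bytes")
```

On CPython this shows well over a factor of 8 reduction for float16, consistent with the estimate in the question.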
@stdweird the real solution here is to not store the embeddings in memory, and instead use an actual vector db (although this would take some refactor work). This would separate out the embeddings from the hacky location they are in now.
The storage context already has a vector store, so technically it should be storing and retrieving vectors from that. Then you could integrate any vector store (Qdrant, Chroma, etc.).
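For reference, the sketch below shows the storage-context pattern being described, as it already works for VectorStoreIndex; KnowledgeGraphIndex would need the refactor mentioned above to route its triplet embeddings the same way. The import paths assume the llama-index 0.10.x package layout plus the Qdrant integration package, and a configured embedding model (OpenAI by default):

```python
# pip install llama-index llama-index-vector-stores-qdrant qdrant-client
import qdrant_client
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Embeddings live in the vector store, not inside a JSON-serialized
# index_struct, so no single Postgres row ever holds them all.
client = qdrant_client.QdrantClient(path="./qdrant_data")  # local on-disk mode
vector_store = QdrantVectorStore(client=client, collection_name="embeddings")

storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```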
Question Validation
Question
We are trying to build a KnowledgeGraphIndex with `include_embeddings=True`. Our current setup uses a Postgres-backed KV store as the index store. We are being hit by sizing limits in two places, more or less at the same time, from the same source: the `index_struct.embedding_dict`.

The first issue is internal and most likely manageable (we are working on a patch): llama_index uses `List[float]` to store embeddings, but Python floats cost approx. 32 bytes each. For testing we have ~10k nodes to index, resulting in 40-50k triplets, and with e.g. the OpenAI ada2 embedding of dim ~1500 this grows out of control. The fix here is to use numpy arrays with float32 or even float16 (which should be fine for embeddings), reducing the internal size by at least a factor of 8. This will be manageable, but not ideal (with float16 it might be OK). An additional advantage is that regular nodes with embeddings would also use less space if the same change were applied there. (Our testbed has lots of memory, so this is more of an operational issue: in production we don't want to give chatbots tens of GB of RAM solely to store some embedding data; but again, float16 to the rescue.)

However, even if we address the internal memory issue, problem nr. 2 is the store: the `embedding_dict` is now stored as part of a single `index_struct` in the index store, so a very large object is serialized to JSON and sent to Postgres. Postgres has a row limit of 1 GB (we discovered; I still need to try the jsonb 256 MB limit), but this is not enough. We are a bit in the dark about what to do with this one. Can the `index_struct` be sharded somehow? Do we store the `embedding_dict` in its own embeddings kvstore (I think we only need a get and a put)? The code also mentions "TBD, should support vector store", but that is beyond my patching skills ;)

@logan-markewich you mentioned you are working on new graph index code, but any ideas what we could try in the meantime?
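A minimal sketch of the "own embeddings kvstore" idea from the question, assuming the get/put interface of LlamaIndex's KV stores; `SimpleKVStore` is used here for brevity (a Postgres-backed KV store exposes the same interface), the collection name and helper functions are made up, and actually wiring this into KnowledgeGraphIndex would still require patching every place `embedding_dict` is read or written:

```python
from typing import List, Optional

from llama_index.core.storage.kvstore import SimpleKVStore

COLLECTION = "kg_embeddings"  # hypothetical collection name
kvstore = SimpleKVStore()


def put_embedding(triplet_key: str, embedding: List[float]) -> None:
    # One small record per triplet instead of one giant serialized
    # embedding_dict, keeping each row far below Postgres size limits.
    kvstore.put(triplet_key, {"embedding": embedding}, collection=COLLECTION)


def get_embedding(triplet_key: str) -> Optional[List[float]]:
    record = kvstore.get(triplet_key, collection=COLLECTION)
    return None if record is None else record["embedding"]


put_embedding("('llama', 'is a', 'camelid')", [0.1, 0.2, 0.3])
print(get_embedding("('llama', 'is a', 'camelid')"))
```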