
Using the Milvus database, GPU memory blows up when initializing the database. #3971

Closed
zmwstu opened this issue May 8, 2024 · 2 comments
Labels
bug Something isn't working

Comments

zmwstu commented May 8, 2024

```
This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Batches:   0%|          | 0/396 [00:00<?, ?it/s]
2024-05-08 13:41:01,091 - embeddings_api.py[line:39] - ERROR: CUDA out of memory. Tried to allocate 17.93 GiB. GPU 0 has a total capacty of 23.65 GiB of which 1.62 GiB is free. Including non-PyTorch memory, this process has 22.02 GiB memory in use. Of the allocated memory 21.55 GiB is allocated by PyTorch, and 15.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
AttributeError: 'NoneType' object has no attribute 'conjugate'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/init_database.py", line 107, in <module>
    folder2db(kb_names=args.kb_name, mode="recreate_vs", embed_model=args.embed_model)
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/migrate.py", line 128, in folder2db
    files2vs(kb_name, kb_files)
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/migrate.py", line 113, in files2vs
    kb.add_doc(kb_file=kb_file, not_refresh_vs_cache=True)
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 131, in add_doc
    doc_infos = self.do_add_doc(docs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/kb_service/milvus_kb_service.py", line 83, in do_add_doc
    ids = self.milvus.add_documents(docs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 119, in add_documents
    return self.add_texts(texts, metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zwm/miniconda3/envs/Langchain-Chatchat2/lib/python3.11/site-packages/langchain_community/vectorstores/milvus.py", line 531, in add_texts
    embeddings = self.embedding_func.embed_documents(texts)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 439, in embed_documents
    return normalize(embeddings).tolist()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/zwm/Code_Program/Chatchat/milvus-Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 37, in normalize
    norm = np.linalg.norm(embeddings, axis=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method
```

I have multiple GPUs; I'm not sure whether they can be used together. Why does initializing the database require so much GPU memory? Is it a problem with the program?

@zmwstu zmwstu added the bug Something isn't working label May 8, 2024

dosubot bot commented May 8, 2024

The CUDA out of memory error you're experiencing is likely due to the high computational demands of indexing and searching with Milvus, especially with large datasets or complex vector embeddings. To mitigate this, consider processing data in smaller batches, adjusting Milvus's index parameters for efficiency, and configuring the index_file_size to manage segment sizes. Additionally, leveraging Milvus's support for distributed deployment can help distribute the workload across multiple GPUs.
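The batching suggestion can be sketched generically. The wrapper below is hypothetical, not Langchain-Chatchat's code: `embed_fn` stands in for whatever embedding call the project actually makes, and the point is simply that peak GPU memory is then bounded by `batch_size` rather than by the total document count.

```python
from typing import Callable, List

def embed_in_batches(texts: List[str],
                     embed_fn: Callable[[List[str]], List[List[float]]],
                     batch_size: int = 32) -> List[List[float]]:
    """Embed texts in small chunks so memory scales with batch_size,
    not with the total number of documents."""
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        embeddings.extend(embed_fn(texts[start:start + batch_size]))
    return embeddings

# Toy embedder for illustration: maps each text to a 1-d vector of its length.
vectors = embed_in_batches(["a", "bb", "ccc"],
                           lambda batch: [[float(len(t))] for t in batch],
                           batch_size=2)
# vectors == [[1.0], [2.0], [3.0]]
```

Tuning `batch_size` down until a single batch fits in free GPU memory is usually enough to get past an OOM during indexing.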

Regarding the TypeError related to a NoneType object, this error suggests that an operation is being applied to a None object, possibly due to the embedding function returning None or an unexpected None value in your data processing pipeline. Ensure your embedding model is correctly initialized, validate input data to the embedding function, and check for None values in your data processing logic.
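The `TypeError` in the traceback is exactly what NumPy raises when a `None` slips into the array handed to `np.linalg.norm`. A small guard before normalization (names here are illustrative, not the project's API) turns that obscure ufunc error into an explicit failure:

```python
import numpy as np

def safe_normalize(embeddings):
    """L2-normalize row vectors, failing fast if the embedding step returned None."""
    if embeddings is None or any(e is None for e in embeddings):
        raise ValueError("embedding function returned None; "
                         "check model initialization and input data")
    arr = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(arr, axis=1, keepdims=True)
    return (arr / norms).tolist()

print(safe_normalize([[3.0, 4.0]]))  # [[0.6, 0.8]]
```

In this issue the `None` comes from the embedding call failing with CUDA OOM upstream, so the guard surfaces the real cause instead of the secondary `NoneType` error.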

For utilizing multiple GPUs, ensure your setup supports distributed processing, which can help manage memory usage more effectively across GPUs.


@ilovesouthpark

Is the embedding model in your model config set to cuda, or to auto?
Try cpu and see whether initialization still fails. For what it's worth, the sample documents in the project initialize correctly for me on both cpu and cuda, with one 24 GB card and one 10 GB card. Multi-GPU embedding was discussed earlier in this repo; check whether that approach works for you. Multi-GPU inference itself works fine.
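Assuming the usual Langchain-Chatchat layout where the device is selected in `configs/model_config.py` (the exact key name is a guess here), forcing the embedding model onto the CPU for a test run would look something like:

```python
# configs/model_config.py (hypothetical fragment) — rerun init_database.py
# with this setting to check whether initialization succeeds without the GPU.
EMBEDDING_DEVICE = "cpu"  # instead of "cuda" or "auto"
```

If initialization succeeds on CPU, the problem is GPU memory pressure rather than the data itself.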

@zmwstu zmwstu closed this as completed May 15, 2024