[Bug]: docker crash when inserts more data #32716

tadinhkien99 · 2024-04-29T14:16:44Z

Is there an existing issue for this?

I have searched the existing issues

Environment

- Milvus version: 2.4.0
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka): docker   
- SDK version(e.g. pymilvus v2.0.0rc2): 2.4.0
- OS(Ubuntu or CentOS): Windows
- CPU/Memory: 64 GB RAM
- GPU: 24 GB RAM
- Others: Docker 50 GB/60 GB RAM

Current Behavior

When I insert up to 10M entities, Docker crashes and then Milvus disconnects. From what I can see, this happens because CPU usage reaches 100% and there is no available RAM.

I use the IVF_SQ8 index; each vector has 768 dimensions.
I installed the Milvus Docker GPU version.
I insert in batches of 10,000 entities at a time.

I thought CPU and RAM usage wouldn't increase when we insert data?

Expected Behavior

CPU and RAM shouldn't run out (OOM) with only 10M entities.

Steps To Reproduce

...

Milvus Log

...

Anything else?

...

yanliang567 · 2024-04-30T02:17:03Z

@tadinhkien99

If you are running IVF_SQ8, you don't need the GPU image; try the Milvus CPU image.
In my experience, for 768d vectors, please try inserting 1000 entities at a time.
If it still reproduces for you, please offer the Milvus logs for investigation. For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs.

/assign @tadinhkien99
/unassign
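
A minimal sketch of the smaller-batch suggestion above, assuming the rows to insert are held in a Python list. The helper names `chunked` and `insert_in_batches` are hypothetical and not part of pymilvus; `insert_fn` stands in for whatever insert call the SDK provides, so the sketch makes no claim about the exact pymilvus signature.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(rows: Sequence[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `rows` holding at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


def insert_in_batches(
    rows: Sequence[dict],
    insert_fn: Callable[[List[dict]], object],
    batch_size: int = 1000,
) -> int:
    """Call `insert_fn` once per batch and return the number of rows sent.

    `insert_fn` is an assumption: pass in the SDK's insert call
    (for example, a bound method of a collection object).
    """
    total = 0
    for batch in chunked(rows, batch_size):
        insert_fn(batch)
        total += len(batch)
    return total
```

Passing the insert call in as a parameter keeps the batching logic independent of the SDK version, so the same helper can be tried with 10,000 and 1000 rows per call to compare memory behavior.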

tadinhkien99 · 2024-04-30T07:21:34Z

@yanliang567

Since I have a GPU, I prefer using it to enhance search performance. I chose IVF_SQ8 as it's the best quantization type for limited resources.
I tried again with a batch size of 1000 entities. With IVF_SQ8, I managed to insert a total of 4.3M entities, while IVF_PQ with m=8 allowed me to insert around 10M entities before Docker crashed.
I've attached the log file below for your review.

milvus.log

yanliang567 · 2024-04-30T08:10:32Z

I did not see any critical errors when Milvus crashed; I guess the container ran out of memory (OOM). Could you please double-check that? @tadinhkien99

/assign @congqixia
Could you please also take a look?
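
One way to double-check the OOM guess, sketched under assumptions: `docker inspect` prints a JSON array whose `State` object carries `OOMKilled` and `ExitCode` fields, and the snippet below reads them. The function names are hypothetical, and the container name you pass in depends on your docker-compose setup. If the whole Docker engine went down rather than just the container, these fields may not reflect it.

```python
import json
import subprocess


def oom_state(inspect_json: str) -> dict:
    """Extract OOM-related fields from the JSON printed by `docker inspect`."""
    info = json.loads(inspect_json)[0]  # docker inspect prints a JSON array
    state = info.get("State", {})
    return {
        "oom_killed": bool(state.get("OOMKilled", False)),
        "exit_code": state.get("ExitCode"),
    }


def inspect_container(name: str) -> dict:
    """Run `docker inspect <name>` and summarize its State section.

    `name` is whatever your Milvus container is called (an assumption
    left to the caller).
    """
    result = subprocess.run(
        ["docker", "inspect", name],
        check=True,
        capture_output=True,
        text=True,
    )
    return oom_state(result.stdout)
```

An `oom_killed` value of `True` would support the OOM explanation; a `False` value does not rule out memory pressure on the host or the Docker VM itself.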

tadinhkien99 · 2024-04-30T08:29:40Z

@yanliang567

I'm aware that Docker can encounter Out of Memory errors, but in this instance, I was merely adding entities into the system without conducting any searches. What could be causing the OOM issue under these circumstances?

Additionally, I need advice on the most effective index type for handling larger datasets. I have approximately 50 million entities to insert. I initially used IVF_PQ, but encountered an OOM error after inserting only 10 million entities. What would you recommend?

xiaofan-luan · 2024-05-05T03:19:19Z

Are you using a GPU index or a CPU index?
If it's a CPU index, I believe 50 GB of memory is far more than enough for 10M entities.
When you are using IVF_SQ8, why specify m? m is only for HNSW index.
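
For reference, a back-of-envelope estimate of vector storage for the numbers in this thread. It assumes float32 vectors (4 bytes per component) for raw data and 1 byte per component for SQ8-style codes, and it deliberately ignores index structures, segment overhead, and any buffering during insertion, so real usage will be higher. The function name is hypothetical.

```python
def vector_bytes(num_vectors: int, dim: int, bytes_per_component: int) -> int:
    """Bytes needed to hold the vector components alone (no overhead)."""
    return num_vectors * dim * bytes_per_component


GB = 10 ** 9  # decimal gigabytes

# Figures taken from the thread: 768-dimensional vectors, 10M and 50M entities.
raw_10m = vector_bytes(10_000_000, 768, 4)  # float32 raw vectors
sq8_10m = vector_bytes(10_000_000, 768, 1)  # 8-bit scalar-quantized codes
raw_50m = vector_bytes(50_000_000, 768, 4)

print(f"10M raw float32: {raw_10m / GB:.2f} GB")
print(f"10M SQ8 codes:   {sq8_10m / GB:.2f} GB")
print(f"50M raw float32: {raw_50m / GB:.2f} GB")
```

Under these assumptions, 10M raw vectors come to about 30.72 GB, which fits in 50 GB on paper but leaves limited headroom, while 50M raw vectors would need about 153.6 GB before any quantization.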

tadinhkien99 · 2024-05-05T13:08:04Z

I use a CPU index type, but it runs out of RAM (OOM).
I deleted the m param.

Now I use the GPU CAGRA index on an RTX 4090 with 24 GB of memory, and it's fine for 4M entities.
Do you have any ideas for optimizing milvus.yaml (2.4)?

xiaofan-luan · 2024-05-05T13:26:36Z

Which part of milvus.yaml do you need to optimize?

tadinhkien99 · 2024-05-05T13:36:56Z

I want to use a GPU index to store more entities, more than 10M.
Also, where can I set it up to run on multiple GPUs?
Thanks.

xiaofan-luan · 2024-05-06T04:06:52Z

You can use more GPU devices on a single machine.

I think the documentation already covers the multi-device use case: https://milvus.io/docs/install_standalone-helm-gpu.md

@Presburger can help if you hit any issues.

tadinhkien99 added the kind/bug and needs-triage labels Apr 29, 2024

tadinhkien99 assigned yanliang567 Apr 29, 2024

sre-ci-robot assigned tadinhkien99 and unassigned yanliang567 Apr 30, 2024

yanliang567 added the triage/needs-information label and removed the needs-triage label Apr 30, 2024

sre-ci-robot assigned congqixia Apr 30, 2024
