[Bug]: Milvus process load operation in async fashion taking 3+ seconds #32905

Open
1 task done
prrs opened this issue May 9, 2024 · 7 comments
Assignees
Labels
kind/improvement Changes related to something improve, likes ut and code refactor

Comments

@prrs

prrs commented May 9, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.4.1
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): kafka
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: 32/128GB
- GPU: No
- Others: n/a

Current Behavior

  1. When a call is made from the client SDK, it loads the partition and returns a response.
  2. The client starts polling to check the load status.
  3. Milvus, running a checker on a ~3-second interval, kicks in and checks for all the segments that need to be loaded.
  4. It loads the segments asynchronously.
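
The delay described in the steps above can be illustrated with a toy model. This is a minimal sketch, not Milvus code: it assumes a checker that fires at fixed multiples of `interval` seconds (the ~3 s figure from step 3) and a load request that arrives at an arbitrary time. `wait_for_next_tick` and `average_wait` are hypothetical helper names.

```python
import math


def wait_for_next_tick(request_time: float, interval: float = 3.0) -> float:
    """Seconds a request waits until the next periodic checker tick.

    Toy model: ticks happen at 0, interval, 2 * interval, ...
    A request that lands exactly on a tick is picked up immediately.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    next_tick = math.ceil(request_time / interval) * interval
    return next_tick - request_time


def average_wait(interval: float = 3.0, samples: int = 3000) -> float:
    """Average wait over request times spread evenly across one interval."""
    waits = [
        wait_for_next_tick(i * interval / samples, interval)
        for i in range(samples)
    ]
    return sum(waits) / samples
```

Under this model a request waits just under one full `interval` in the worst case and roughly half of it on average, which is in the same range as the 2-3 seconds reported here.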

Expected Behavior

There should be a way to minimise the time it takes to trigger the segment load. This could be an explicit API, or Milvus could be enhanced to minimise the delay.
Screenshot 2024-05-08 at 9 37 56 PM
Screenshot 2024-05-08 at 9 57 23 PM
segmentloading.csv
loadcall.csv

I have attached two screenshots from the proxy logs, where we can see that the Milvus client SDK is polling after the load call.

Steps To Reproduce

1. Make a load call from the Milvus client SDK.
2. We can observe the sequence of calls in the proxy logs, where the load progress is initiated from the client SDK.
3. In the Milvus server log, it can be seen that the asynchronous segment load is triggered only about 2-3 seconds after the load request.
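
To put a number on the delay in step 3 from the client side, a small timing wrapper around the load call and the status poll may help. The sketch below is deliberately independent of any SDK: `start_load` and `is_loaded` are hypothetical callables that you would wire to your client (for example a non-blocking load call and a loading-progress check), and the poll interval and timeout are illustrative defaults, not values taken from Milvus.

```python
import time
from typing import Callable


def time_until_loaded(
    start_load: Callable[[], None],
    is_loaded: Callable[[], bool],
    poll_interval: float = 0.1,
    timeout: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Issue a load request, poll until it reports done, return elapsed seconds."""
    started = clock()
    start_load()
    while not is_loaded():
        # Give up once the (assumed) timeout has been exceeded.
        if clock() - started > timeout:
            raise TimeoutError(f"not loaded after {timeout} s")
        sleep(poll_interval)
    return clock() - started
```

The `clock` and `sleep` parameters exist only so the helper can be exercised without a running cluster. The measured value is quantised by `poll_interval`, so a short interval gives a tighter reading.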

Milvus Log

segmentloading.csv
loadcall.csv

loadcall.csv - contains the logs from when the load request is made
segmentloading.csv - contains the logs where segment loading is triggered asynchronously to load the collection

Anything else?

na

prrs added the kind/bug and needs-triage labels May 9, 2024
@prrs
Author

prrs commented May 9, 2024

@xiaofan-luan @yiwangdr ^^

@yanliang567
Contributor

sounds like an enhancement request
/assign @xiaofan-luan

yanliang567 added the kind/improvement label and removed the kind/bug and needs-triage labels May 10, 2024
yanliang567 assigned yiwangdr and unassigned yanliang567 May 11, 2024
@prrs
Author

prrs commented May 15, 2024

@yiwangdr This is important to meet our SLAs. Do we have a timeline for this to be available? Thanks. cc: @xiaofan-luan

@xiaofan-luan
Contributor

> @yiwangdr This is important to meet our SLAs. Do we have a timeline for this to be available? Thanks. cc: @xiaofan-luan

I have a small optimization for it, but the load could still take a couple of seconds.

@prrs
Author

prrs commented May 28, 2024

@xiaofan-luan Do we understand in what scenario it could take 2-3+ seconds? Yi pointed me here; the observer is scheduled at 1s. So where is the time going, and how are you proposing to fix it?

@xiaofan-luan
Contributor

The segment info stats collection takes 3s,
so even if the load succeeds, querycoord takes 3s to know about it.

@xiaofan-luan
Contributor

We cannot reduce this to a much smaller number, because that might cost too much CPU on clusters with many collections.
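
The trade-off described in the last two comments can be made concrete with a back-of-the-envelope model. This is an assumption-laden sketch, not a description of querycoord internals: it assumes one stats collection pass per collection per interval, so the pass rate grows linearly with the number of collections, and that a completed load is noticed at the next pass. The function names and the figure of 1000 collections are made up for illustration.

```python
def stats_passes_per_second(num_collections: int, interval_s: float) -> float:
    """Stats collection passes per second under the linear toy model."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return num_collections / interval_s


def worst_case_detection_delay(interval_s: float) -> float:
    """Longest time a finished load can go unnoticed: one full interval."""
    return interval_s


# Compare a few candidate intervals for a hypothetical 1000-collection cluster.
for interval in (3.0, 1.0, 0.5):
    rate = stats_passes_per_second(1000, interval)
    delay = worst_case_detection_delay(interval)
    print(f"interval={interval}s  delay<={delay}s  passes/s={rate:.0f}")
```

In this model, halving the interval halves the worst-case detection delay but doubles the background work, which is the tension between load latency and CPU cost raised above.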


4 participants