Capping resources assigned to each model in multi-model serving #2132
Comments
Thank you, @singhniraj08.

Configuring a limit on CPU usage/cores per model in a multi-model setup is not currently on our roadmap, but it sounds like a good feature to implement. I will keep this as a feature request and discuss implementation internally with the team. Once we have an update, we will post it in this thread.
Is there a way to cap the resources (e.g. CPU cores, CUDA MPS threads) assigned to each model in a multi-model TensorFlow server?

The only straightforward way I can think of to allocate resources to microservices (such as model servers), short of lower-level tools like CPU limits, is containerization or VMs, so I suspect such an option does not exist. Is that true?
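For context, a multi-model ModelServer is typically driven by a `--model_config_file`. A minimal sketch, with placeholder model names and paths:

```
model_config_list {
  config {
    name: "model_a"
    base_path: "/models/model_a"
    model_platform: "tensorflow"
  }
  config {
    name: "model_b"
    base_path: "/models/model_b"
    model_platform: "tensorflow"
  }
}
```

This config exposes no per-model resource fields, which is the gap this issue asks about. The containerization workaround mentioned above could look roughly like the following, assuming the stock `tensorflow/serving` image and hypothetical model names, paths, and limits; Docker's `--cpus`/`--memory` flags cap each server process, at the cost of running one ModelServer per model:

```sh
# One ModelServer container per model, each with its own CPU/memory cap.
# Model names, host paths, ports, and limit values are placeholders.
docker run -d --name serving_model_a \
  --cpus="2.0" --memory="4g" \
  -p 8501:8501 \
  -v /models/model_a:/models/model_a \
  -e MODEL_NAME=model_a \
  tensorflow/serving

docker run -d --name serving_model_b \
  --cpus="1.0" --memory="2g" \
  -p 8502:8501 \
  -v /models/model_b:/models/model_b \
  -e MODEL_NAME=model_b \
  tensorflow/serving
```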