[k8s] GPU Feature discovery label formatter #3493
base: master
This is awesome @asaiacai! It looks very reasonable to me. @romilbhardwaj for another look to make sure it does not break our other formatters : )
Co-authored-by: Zhanghao Wu <[email protected]>
Thanks @asaiacai!
Thanks @asaiacai. Tested on A100 and H100 from Lambda. Left a comment about documenting that this label formatter cannot be used with autoscaling; otherwise LGTM!
Just added the docstring, @romilbhardwaj. Thanks for the review! lmk if this needs anything else.
Thanks @asaiacai!
Resolves #2460
This allows k8s to consume the node label nvidia.com/gpu.product created by GPU feature discovery, which is commonly deployed through the NVIDIA GPU operator.
Tested (run the relevant ones):
bash format.sh
eks_test_cluster.yaml
k3s with gpu-operator, using deploy_k3s.sh modified to exclude the skypilot k8s labeler, ensure the following can run
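As an aside on the description above (not one of the test steps): a rough sketch of how a value of the nvidia.com/gpu.product node label could be mapped to a short accelerator name. This is not the code in this PR. The function names, the KNOWN_ACCELERATORS set, and the sample label values (e.g. NVIDIA-A100-SXM4-40GB) are assumptions made for illustration only.

```python
import re
from typing import Dict, Optional

# Node label key written by GPU feature discovery (from the PR description).
GFD_LABEL_KEY = 'nvidia.com/gpu.product'

# Hypothetical set of accelerator names to recognize; not taken from the PR.
KNOWN_ACCELERATORS = {'A100', 'H100', 'A10G', 'A10', 'L4', 'T4', 'V100'}


def accelerator_from_gfd_label(value: str) -> Optional[str]:
    """Return the first token of the label value that is a known accelerator.

    Assumes product names are separated by dashes, underscores, or spaces,
    e.g. 'NVIDIA-A100-SXM4-40GB' (an assumed sample value).
    """
    for token in re.split(r'[-_\s]+', value.upper()):
        if token in KNOWN_ACCELERATORS:
            return token
    return None


def accelerator_for_node(labels: Dict[str, str]) -> Optional[str]:
    """Look up the GFD label in a node's labels and map it to a name."""
    value = labels.get(GFD_LABEL_KEY)
    if not value:
        return None
    return accelerator_from_gfd_label(value)
```

Matching on whole tokens rather than substrings keeps 'A10' from matching inside 'A100'; nodes without the label, or with an unrecognized product, yield None.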