Add support for Kueue. #1754

Open: 1 commit to be merged into base branch master

shrinandj

This commit adds support for using Kueue to submit jobs/pods into Kubernetes. There are two config options:

  • KUEUE_ENABLED: set to True/False
  • KUEUE_LOCALQUEUE_NAME: set to the name of the LocalQueue configured with Kueue. See the Kueue LocalQueue documentation (https://kueue.sigs.k8s.io/docs/concepts/local_queue) for details.

The config options can be set in the main metaflow config or via the @kubernetes decorator.
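A minimal sketch of how the two settings could be resolved when both sources are present, assuming step-level values arrive as optional overrides. The function `resolve_kueue_settings` and the step attribute names `kueue_enabled` and `kueue_localqueue_name` are hypothetical and are not taken from this PR's diff.

```python
from typing import Optional, Tuple


def resolve_kueue_settings(
    global_config: dict, step_attrs: dict
) -> Tuple[bool, Optional[str]]:
    """Return (enabled, localqueue_name), letting step-level values win.

    `global_config` stands in for the main Metaflow config and
    `step_attrs` for values given on @kubernetes. Both shapes are
    assumptions made for illustration.
    """
    # A step-level value of None means "not specified", so fall back
    # to the global setting. An explicit False must still override True.
    enabled = step_attrs.get("kueue_enabled")
    if enabled is None:
        enabled = global_config.get("KUEUE_ENABLED", False)

    queue = step_attrs.get("kueue_localqueue_name")
    if queue is None:
        queue = global_config.get("KUEUE_LOCALQUEUE_NAME")

    if enabled and not queue:
        raise ValueError("Kueue is enabled but no LocalQueue name was given")

    # When Kueue is off, the queue name is irrelevant.
    return bool(enabled), (queue if enabled else None)


global_cfg = {"KUEUE_ENABLED": True, "KUEUE_LOCALQUEUE_NAME": "team-a-queue"}
print(resolve_kueue_settings(global_cfg, {}))
print(resolve_kueue_settings(global_cfg, {"kueue_enabled": False}))
```

Treating `None` as "unset" rather than falsy is what allows a step to opt out of Kueue even when it is enabled globally.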

Testing Done:

  • Verified that specifying Kueue config options in the Metaflow config (~/.metaflowconfig/config.json) works as expected.

  • Verified that specifying Kueue config options in @kubernetes works as expected.

  • Verified that @kubernetes options take precedence over the global config.

    • For example, if KUEUE_ENABLED is True in the global config but set to False for a particular step, that step does not run with Kueue.
  • Verified that the Kueue labels and annotations are set correctly and that Kueue actually runs the jobs.

  • Verified that if Kueue is configured to manage "pod", the argo-workflow pods created by Metaflow are scheduled by Kueue.

  • Verified that the default behavior is to not use Kueue and that everything works as before (jobs and argo-workflows).
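To illustrate the label check above: Kueue's documentation describes the `kueue.x-k8s.io/queue-name` label as the way to assign a workload to a LocalQueue. The sketch below attaches that label to a Job manifest held as a plain dict. The helper `with_kueue_queue` and the example names are hypothetical, and which annotations this PR sets is not stated here, so only the label is shown.

```python
from copy import deepcopy

# Label key documented by Kueue for selecting a LocalQueue.
KUEUE_QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def with_kueue_queue(job_manifest: dict, localqueue_name: str) -> dict:
    """Return a copy of the manifest with the Kueue queue label set.

    Hypothetical helper for illustration; it does not reflect how
    Metaflow builds its Job objects.
    """
    manifest = deepcopy(job_manifest)  # leave the caller's dict untouched
    labels = manifest.setdefault("metadata", {}).setdefault("labels", {})
    labels[KUEUE_QUEUE_LABEL] = localqueue_name
    return manifest


job = {"apiVersion": "batch/v1", "kind": "Job", "metadata": {"name": "example-step"}}
queued = with_kueue_queue(job, "team-a-queue")
print(queued["metadata"]["labels"])
```

Inspecting a submitted Job with `kubectl get job <name> -o yaml` is one way to confirm the label is present.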

@romain-intel

Mergeable anytime from my end -- no impact on core.
