Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support running workloads in visibility endpoint #2145

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

KunWuLuan
Copy link
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

support running workloads in visibility endpoint

Which issue(s) this PR fixes:

Fixes #1776

Special notes for your reviewer:

Does this PR introduce a user-facing change?

We will add a new visibility endpoint in Kueue. Users can list the running workloads by using 
'kubectl get --raw "/apis/visibility.kueue.x-k8s.io/v1alpha1/clusterqueues/cluster-queue/runningworkloads"' 
and 
'kubectl get --raw "/apis/visibility.kueue.x-k8s.io/v1alpha1/namespaces/default/localqueues/user-queue/runningworkloads"'

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 7, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: KunWuLuan
Once this PR has been reviewed and has the lgtm label, please assign alculquicondor for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 7, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @KunWuLuan. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 7, 2024
Copy link

netlify bot commented May 7, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 8d7f709
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/664c95d26939a50008f1a91a

@KunWuLuan KunWuLuan changed the title support running workloads in visibility endpoint WIP: support running workloads in visibility endpoint May 7, 2024
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2024
@KunWuLuan KunWuLuan force-pushed the kep/list-workloads-admitted-not-finished branch from da55391 to 6cb9546 Compare May 7, 2024 02:17
@KunWuLuan KunWuLuan changed the title WIP: support running workloads in visibility endpoint support running workloads in visibility endpoint May 7, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2024
@mimowo
Copy link
Contributor

mimowo commented May 7, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 7, 2024
@KunWuLuan KunWuLuan force-pushed the kep/list-workloads-admitted-not-finished branch 3 times, most recently from 96d5e5b to a3b368b Compare May 7, 2024 11:20
@mimowo
Copy link
Contributor

mimowo commented May 7, 2024

/cc @PBundyra

@k8s-ci-robot k8s-ci-robot requested a review from PBundyra May 7, 2024 15:58
@KunWuLuan
Copy link
Contributor Author

do we need to query running workloads for local queues?

Copy link
Contributor

@PBundyra PBundyra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

My first pass.

Please add e2e/integration tests and update KEP regarding that. Please add cases which cover:

  • pending and running workloads are assigned to a CQ
  • a workload changes its status from pending to admitted
  • a workload changes its status from admitted to finished

apis/visibility/v1alpha1/types.go Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq.go Outdated Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq.go Outdated Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq.go Outdated Show resolved Hide resolved
pkg/visibility/api/rest/test_utils.go Outdated Show resolved Hide resolved
keps/2145-list-admitted-workloads/README.md Outdated Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq.go Outdated Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq.go Outdated Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq.go Outdated Show resolved Hide resolved
pkg/visibility/api/rest/running_workloads_cq_test.go Outdated Show resolved Hide resolved
@PBundyra
Copy link
Contributor

PBundyra commented May 8, 2024

do we need to query running workloads for local queues?

I am leaning towards implementing that. It will be useful for the KueueCtl. It might be a separate PR though
@alculquicondor @mwielgus WDYT?

pkg/cache/cache.go Outdated Show resolved Hide resolved
@alculquicondor
Copy link
Contributor

do we need to query running workloads for local queues?

I am leaning towards implementing that. It will be useful for the KueueCtl. It might be a separate PR though

Yes, a separate PR would be better.

@KunWuLuan KunWuLuan force-pushed the kep/list-workloads-admitted-not-finished branch 2 times, most recently from 646f495 to f8134df Compare May 14, 2024 11:01
Signed-off-by: KunWuLuan <[email protected]>
@KunWuLuan KunWuLuan force-pushed the kep/list-workloads-admitted-not-finished branch from f8134df to b17c351 Compare May 14, 2024 11:29
@PBundyra
Copy link
Contributor

Thank you for addressing my comments. Please add missing test scenarios.

Signed-off-by: KunWuLuan <[email protected]>
@KunWuLuan KunWuLuan force-pushed the kep/list-workloads-admitted-not-finished branch from fb688a2 to 4460d3e Compare May 15, 2024 12:52
@KunWuLuan
Copy link
Contributor Author

@PBundyra Hi, I think all questions have been addressed. Please have a look if you have time. Thanks very much.

@@ -83,7 +128,7 @@ implementing this enhancement to ensure the enhancements have also solid foundat

#### Unit Tests

New unit tests should be added testing the functionality for jobs and pods.
New unit tests should be added testing the functionality for new api.

#### Integration tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update this section


### Goals

* Support list workloads that are admitted and not finished.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Support list workloads that are admitted and not finished.
* Support listing running workloads in a ClusterQueue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add information about the LocalQueue endpoint, even if it won't be implemented immediately please.

Please add reference to this KEP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

kubectl get --raw "/apis/visibility.kueue.x-k8s.io/v1alpha1/clusterqueues/cluster-queue/runningworkloads"
```

We will show priority and localqueue information in response. Like this:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update

Comment on lines +38 to +97
We will add a new visibility endpoint in Kueue.
``` go
// RunningWorkload is a user-facing representation of a running workload that summarizes the relevant information for
// assumed resources in the cluster queue.
type RunningWorkload struct {
metav1.ObjectMeta `json:"metadata,omitempty"`

// Priority indicates the workload's priority
Priority int32 `json:"priority"`
// AdmissionTime indecates the time workloads admitted
AdmissionTime metav1.Time `json:"admissionTime"`
}

// +k8s:openapi-gen=true
// +kubebuilder:object:root=true

// RunningWorkloadsSummary contains a list of running workloads in the context
// of the query (within LocalQueue or ClusterQueue).
type RunningWorkloadsSummary struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`

Items []RunningWorkload `json:"items"`
}

// +kubebuilder:object:root=true
type RunningWorkloadsSummaryList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`

Items []RunningWorkloadsSummary `json:"items"`
}

// +kubebuilder:object:root=true
// +k8s:openapi-gen=true
// +k8s:conversion-gen:explicit-from=net/url.Values
// +k8s:defaulter-gen=true

// RunningWorkloadOptions are query params used in the visibility queries
type RunningWorkloadOptions struct {
metav1.TypeMeta `json:",inline"`

// Offset indicates position of the first pending workload that should be fetched, starting from 0. 0 by default
Offset int64 `json:"offset"`

// Limit indicates max number of pending workloads that should be fetched. 1000 by default
Limit int64 `json:"limit,omitempty"`
}
```

Users can list the running workloads by using
``` bash
kubectl get --raw "/apis/visibility.kueue.x-k8s.io/v1alpha1/clusterqueues/cluster-queue/runningworkloads"
```

We will show priority and localqueue information in response. Like this:
```
{"kind":"RunningWorkloadsSummary","apiVersion":"visibility.kueue.x-k8s.io/v1alpha1","metadata":{"creationTimestamp":null},"items":[{"metadata":{"name":"job-sample-job-jz228-ef938","namespace":"default","creationTimestamp":"2024-05-06T02:15:26Z","ownerReferences":[{"apiVersion":"batch/v1","kind":"Job","name":"sample-job-jz228","uid":"2de8a359-4c95-4159-b677-0279066149b6"}]},"priority":0,"admissionTime":"xxxx"}]}
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this section to Design Details, and add a REST path to the endpoint (not only kubectl usage)

status: draft
creation-date: 2024-04-29
reviewers:
- "@alculquicondor"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- "@alculquicondor"
- "@alculquicondor"
- "@pbundyra"

@@ -0,0 +1,22 @@
title: List Admitted And Not Finished Workloads
kep-number: 1834
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update

Comment on lines +12 to +22
stage: beta

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v0.9"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
alpha: "v0.9"
beta: "v0.9"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update

@PBundyra
Copy link
Contributor

PBundyra commented May 22, 2024

@PBundyra Hi, I think all questions have been addressed. Please have a look if you have time. Thanks very much.

Thank you for addressing my comments. Please add e2e/integration tests as I've mentioned above. You can extend existing e2e tests.

Another thing we're missing here are RBAC roles. See this file for the reference.

Thank you

## Motivation

Jsonpath support in kubectl is limited, and we can not filter resources by condition in this way. By adding a new
visibility endpoint, users can list running workloads by `kubectl get`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functionality is already being worked on in kueuectl list workloads (see: https://github.com/kubernetes-sigs/kueue/blob/8d7f709225bb828c38758a22ccf892f304b1bd0c/keps/2076-kueuectl/README.md#list-workloads
and
#2195). Is there another reason why we might need this extra api?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KunWuLuan before we go further with this PR could you please address that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A mechanism to list the Workloads that are admitted and not finished
6 participants