
[Ray Autoscaling] Issues related to the handling of Pending Worker Nodes when scaling down #45195

Open
yx367563 opened this issue May 8, 2024 · 2 comments
Labels
core (Issues that should be addressed in Ray Core) · enhancement (Request for new feature and/or capability) · @external-author-action-required (Alternate tag for PRs where the author doesn't have labeling permission) · P1 (Issue that should be fixed within a few weeks)

Comments


yx367563 commented May 8, 2024

Description

When using KubeRay to deploy a Ray cluster on Kubernetes, if the Kubernetes cluster's resources are tight during scale-up, some worker nodes stay stuck in the Pending state.
After the job finishes, the running worker nodes are scaled down according to the configured idleTimeoutSeconds.
Then, as cluster resources are freed, the worker nodes that were previously Pending transition to Running and must again sit idle for idleTimeoutSeconds before they are scaled down.
If there are many Pending worker nodes, it therefore takes a long time after the job completes to scale down all the unneeded nodes and release the resources they occupy, which lowers resource utilization.

Use case

Users may configure a large maxWorkerNum and submit a large number of Ray tasks at once with autoscaling enabled. Under the current autoscaling rules, the autoscaler tries to allocate enough worker nodes to satisfy the resources required by all tasks, so when Kubernetes resources are tight a large number of worker nodes end up Pending, i.e. the number of available worker nodes is small while the number of desired worker nodes is large. A workload sketch of this pattern is shown below.
When the job completes, the available worker nodes are scaled down according to the configured idleTimeoutSeconds, the released resources are immediately claimed by the Pending worker nodes, and those nodes must then wait another idleTimeoutSeconds, so it takes a long time to release the resources.
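
For illustration, a minimal sketch of the kind of workload that triggers this pattern, assuming a RayCluster deployed via KubeRay with autoscaling enabled (the task body and task count are hypothetical):

import ray

ray.init()  # connect to the existing KubeRay-managed RayCluster

@ray.remote(num_cpus=1)
def work(i):
    # stand-in for a real task body
    return i * i

# Submitting many tasks at once makes the autoscaler request enough worker
# nodes to run all of them (up to the configured maximum). If the Kubernetes
# cluster cannot schedule that many pods, most of the new worker nodes stay
# in the Pending state.
results = ray.get([work.remote(i) for i in range(10_000)])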

Possible solution: before scaling down an idle worker node of a given type, first remove all worker nodes of that type that are still Pending; conversely, if the Pending nodes of that type were actually still needed, the running worker nodes of that type should not be considered idle in the first place.
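
A rough sketch of this rule in Python, over a hypothetical view of the autoscaler's state (pending_nodes, idle_nodes, terminate, and the timing fields are illustrative names, not the actual Ray autoscaler API):

def scale_down_node_type(node_type, pending_nodes, idle_nodes, terminate,
                         idle_timeout_seconds):
    """Proposed behaviour: an idle running worker of a given type implies
    the Pending workers of that same type are not needed, so remove the
    Pending ones before applying the idle-timeout scale-down."""
    has_expired_idle = any(n.idle_seconds >= idle_timeout_seconds
                           for n in idle_nodes.get(node_type, []))
    if has_expired_idle:
        # Remove Pending pods of this type first; they never became useful.
        for node in pending_nodes.get(node_type, []):
            terminate(node)
        # Then apply the existing idleTimeoutSeconds-based scale-down.
        for node in idle_nodes.get(node_type, []):
            if node.idle_seconds >= idle_timeout_seconds:
                terminate(node)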

yx367563 added the enhancement and triage labels on May 8, 2024
anyscalesam added the core label on May 13, 2024
jjyao (Contributor) commented May 13, 2024

@yx367563 are you using the RayJob CRD? If so, after the job finishes, all the nodes will be deleted, including the pending nodes.

jjyao added the P1 and @external-author-action-required labels and removed the triage label on May 13, 2024
yx367563 (Author) commented May 14, 2024

> @yx367563 are you using the RayJob CRD? If so, after the job finishes, all the nodes will be deleted, including the pending nodes.

@jjyao I need a long-running RayCluster and submit both high-load and low-load jobs to it. I found that when I submit a high-load job and the Kubernetes cluster does not have enough resources, a large number of nodes end up in the Pending state; the ratio of available nodes to desired nodes is about 1:10. When the job later finishes, the corresponding available nodes are removed once they have been idle long enough, and the pending nodes then gradually become available nodes, so the scale-down is very slow. I think the pending nodes should be removed directly at that point.
