[proposal] deschedule production pods between nodes #2043

Open
zwForrest opened this issue May 10, 2024 · 5 comments · May be fixed by #2066
Assignees
Labels
area/koord-descheduler kind/proposal Create a report to help us improve

Comments

@zwForrest
Contributor

What is your proposal:
When the loads of the two nodes are almost the same, the production application and batch application loads of this node are also similar. We expect production applications and batch applications to be balanced between nodes. Currently, the loadaware scheduling plugin can evaluate pod load separately for Production and Batch pods, but the descheduler cannot balance nodes based on production application load. Adding this ability would reduce node hotspots caused by production application load.

Why is this needed:
To reduce hotspots caused by production application load.

Is there a suggested solution, if so, please add it:

@zwForrest zwForrest added the kind/proposal Create a report to help us improve label May 10, 2024
@hormes
Member

hormes commented May 10, 2024

Sounds reasonable. Are you interested in participating in the development?

@songtao98
Contributor

Great idea! Some details may need discussion:

When the loads of the two nodes are almost the same, the production application and batch application loads of this node are also similar.

  1. Maybe we don't need to balance the load between prod and batch workloads within a single node. Instead, the ability to balance Prod workloads across all nodes seems more valuable (to avoid hotspot nodes with too many Prod workloads).

  2. How can this feature work together with LowNodeLoad?

We hope to hear more of your ideas, and you are welcome to participate in the development!

@zwForrest
Contributor Author

zwForrest commented May 10, 2024

  1. What I described above means balancing prod workloads across all nodes.
  2. I think a possible solution is to add this feature to LowNodeLoad, because balancing the Production workload also affects the node-level load. If it were split into two plugins, the LowNodeLoad plugin would need to account for the new plugin's impact on node load before balancing.

I can take this on once we have settled the final plan. @songtao98

@songtao98
Contributor

Based on the discussion we have had:

  1. To implement this capability, add logic in LowNodeLoad: select nodes that have high prod-pod-load using new thresholds, e.g. prodLowThresholds and prodHighThresholds.
  2. Handle abnormal nodes as follows:
  • If node load > highThresholds, keep the existing LowNodeLoad processing logic.
  • Else, if node load <= highThresholds and prod pod load > prodHighThresholds, evict prod pods only.
  3. Add a threshold validator to ensure that prodHighThresholds does not exceed highThresholds in the user configuration (otherwise the prod threshold is meaningless).
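The validator in the last item could look roughly like the sketch below. It is a language-neutral illustration in Python, not Koordinator code: the function name `validate_thresholds`, the dict-of-percentages representation, and the choice to skip resources that have no node-level threshold are all assumptions made for the example.

```python
def validate_thresholds(high_thresholds, prod_high_thresholds):
    """Return a list of error strings; an empty list means the config passes.

    Both arguments are hypothetical dicts mapping a resource name
    (e.g. "cpu") to a usage percentage.
    """
    errors = []
    for resource, prod_limit in prod_high_thresholds.items():
        if resource not in high_thresholds:
            # Assumption: with no node-level limit there is nothing to compare.
            continue
        node_limit = high_thresholds[resource]
        if prod_limit > node_limit:
            errors.append(
                f"prodHighThresholds[{resource}]={prod_limit} "
                f"exceeds highThresholds[{resource}]={node_limit}"
            )
    return errors


if __name__ == "__main__":
    print(validate_thresholds({"cpu": 70, "memory": 80}, {"cpu": 75}))
```

Since prod pod load is part of the node load, a prod threshold above the node threshold would mean the node-level branch always fires first, which is why such a configuration is rejected here.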

With this implementation, we add only one extra piece of logic: evict prod pods when the node total load is under TotalResourceThreshold but its prod-pod-load is beyond ProdResourceThreshold. If a node exceeds its total load threshold and its prod pod load exceeds the prod pod threshold at the same time, pods are evicted by the existing logic, i.e., pods with lower Priority are evicted first.
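The branching described above can be sketched as a small decision function. This is an illustrative Python sketch, not the actual LowNodeLoad code: `Action`, `exceeds`, and `decide` are hypothetical names, and treating a node as over threshold when any single configured resource exceeds its limit is an assumption.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    EVICT_BY_PRIORITY = "evict-by-priority"  # existing LowNodeLoad behaviour
    EVICT_PROD_ONLY = "evict-prod-only"      # new branch from this proposal


def exceeds(usage, thresholds):
    """True if usage is above the limit for any configured resource (assumption)."""
    return any(usage.get(res, 0) > limit for res, limit in thresholds.items())


def decide(node_usage, prod_usage, high_thresholds, prod_high_thresholds):
    """Pick the eviction mode for one node; all inputs map resource -> percent."""
    if exceeds(node_usage, high_thresholds):
        # Node-level hotspot: keep the existing logic, even if the prod
        # threshold is exceeded as well.
        return Action.EVICT_BY_PRIORITY
    if exceeds(prod_usage, prod_high_thresholds):
        # Node total is fine, but prod pods alone are too hot.
        return Action.EVICT_PROD_ONLY
    return Action.NONE


if __name__ == "__main__":
    high, prod_high = {"cpu": 70}, {"cpu": 40}
    print(decide({"cpu": 60}, {"cpu": 50}, high, prod_high))
```

Checking the node-level threshold first is what makes the "both exceeded" case fall through to the existing priority-based eviction.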

/assign @zwForrest
cc @hormes @zwzhang0107 @ZiMengSheng

@songtao98
Contributor

songtao98 commented May 11, 2024

BTW, we discussed what to do when a node exceeds its total load threshold and its prod pod load exceeds the prod pod threshold at the same time. We think it is better to evict pods in the existing order (lower Priority first, etc.) and skip Prod pod balancing, rather than evicting Prod pods first and then checking whether the node total load is still high.

  1. Prod pod eviction needs to be handled carefully. When node total load is high, evicting low Priority pods first is more reasonable.
  2. The prod pod balance will be handled in the next round of LowNodeLoad anyway, if necessary.
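To make the "lower Priority first" ordering concrete, here is a minimal Python sketch of how eviction candidates might be ordered when both thresholds are exceeded. The `Pod` type, the `eviction_order` helper, and the priority values are illustrative assumptions; the real sorter considers more than priority (the "etc." above), which is not modelled here.

```python
from typing import NamedTuple


class Pod(NamedTuple):
    name: str
    priority: int  # higher value = more important (illustrative values below)
    is_prod: bool


def eviction_order(pods):
    """Order candidates so that lower-priority pods are evicted first.

    sorted() is stable, so pods with equal priority keep their input order.
    """
    return sorted(pods, key=lambda p: p.priority)


if __name__ == "__main__":
    pods = [
        Pod("web-1", 9000, True),
        Pod("batch-1", 5000, False),
        Pod("web-2", 9500, True),
        Pod("batch-2", 5500, False),
    ]
    for pod in eviction_order(pods):
        print(pod.name, pod.priority)
```

With this ordering, batch pods are considered before any prod pod, so prod pods are only touched if evicting lower-priority pods does not bring the node under its total load threshold.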

@zwForrest zwForrest linked a pull request May 27, 2024 that will close this issue