Ensure that pods are scheduled to nodes that meet preferred conditions, while satisfying a series of filter plugins for the scheduler. #124844
Comments
I assume your goal is to (try to) make sure that a pod with preferred affinity and taint toleration is scheduled to a node that matches the node affinity and also has the tolerated taint?
When nodes have enough resources for the pod, I want pods to be scheduled onto specific nodes as much as possible. However, many score plugins are enabled in the cluster, and their weights are preset by SREs, so users cannot easily adjust them dynamically. In addition, for performance reasons, the scheduler only traverses and evaluates a subset of nodes. Together, these often lead to suboptimal scheduling results.
I get the point that this is trying to get an ideal score result. But since the scheduler never guarantees that the pod will be scheduled to the node with the highest score, I'm still confused why this is needed (if you really want to match the node affinity, why not using Anyway, I think you can write a simple doc and put it on the agenda of sig-scheduling (https://github.com/kubernetes/community/tree/master/sig-scheduling). Folks can then have a discussion during the meeting.
Okay, thank you. I understand your confusion. My main goal is to ensure that, as long as resource requirements are met, pods are always scheduled to preferred nodes first, rather than having the preference only partially honored.
/cc
What would you like to be added?
/sig scheduling
/kind feature
Add a new plugin extension point that checks nodes against a pod's preferred conditions. Then modify the scheduler's filtering logic so that nodes satisfying the preferred check are placed at the beginning of the node array, which makes the scheduler evaluate them first on each scheduling attempt.
If the community feels this requirement is necessary, I will complete the corresponding KEP and code implementation work.
The current solution within our company is like this, but I believe adding a 'check preferred' extension point would be better:
1. Enable users to assign a specific annotation to pods with the key "xxx.k8s.io/preferred-plugin". The value of this annotation can be either "NodeAffinity" or "TaintToleration".
2. Determine which preferred feature to utilize during scheduling based on the annotation value.
NodeAffinity:
TaintToleration:
3. Divide nodes into two groups, "passChecked" and "noPassChecked", based on whether they satisfy the preferred check.
4. To ensure equal scheduling probabilities for each node, randomly sort the "passChecked" and "noPassChecked" groups.
5. Reconstruct the nodes array by combining the "passChecked" and "noPassChecked" groups, ensuring that "passChecked" nodes come before "noPassChecked" nodes.
6. Call the "findNodesThatPassFilters" method to search for feasible nodes in the new nodes array.
7. If the length of "passChecked" is 0, adjust the value of "nextStartNodeIndex"; otherwise, leave it unchanged.
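Steps 3 to 5 above could be sketched roughly as follows. This is only an illustration under assumptions, not our in-house code or anything from kube-scheduler: `Node`, `PreferredCheck`, and `reorderByPreference` are hypothetical names, and the shuffle function is passed in so that `rand.Shuffle` from the Go standard library can be swapped for something deterministic.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Node is a hypothetical stand-in for the scheduler's per-node information.
type Node struct {
	Name   string
	Labels map[string]string
}

// PreferredCheck is a hypothetical predicate that reports whether a node
// satisfies the pod's preferred condition (for example, a preferred affinity).
type PreferredCheck func(n Node) bool

// reorderByPreference splits nodes into passChecked and noPassChecked,
// shuffles each group independently, and returns passChecked followed by
// noPassChecked together with the number of nodes that passed the check.
func reorderByPreference(nodes []Node, check PreferredCheck,
	shuffle func(n int, swap func(i, j int))) ([]Node, int) {
	passChecked := make([]Node, 0, len(nodes))
	noPassChecked := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if check(n) {
			passChecked = append(passChecked, n)
		} else {
			noPassChecked = append(noPassChecked, n)
		}
	}
	// Shuffle within each group so no node is systematically favored.
	shuffle(len(passChecked), func(i, j int) {
		passChecked[i], passChecked[j] = passChecked[j], passChecked[i]
	})
	shuffle(len(noPassChecked), func(i, j int) {
		noPassChecked[i], noPassChecked[j] = noPassChecked[j], noPassChecked[i]
	})
	return append(passChecked, noPassChecked...), len(passChecked)
}

func main() {
	nodes := []Node{
		{Name: "node-a", Labels: map[string]string{"disk": "hdd"}},
		{Name: "node-b", Labels: map[string]string{"disk": "ssd"}},
		{Name: "node-c", Labels: map[string]string{"disk": "hdd"}},
		{Name: "node-d", Labels: map[string]string{"disk": "ssd"}},
	}
	prefersSSD := func(n Node) bool { return n.Labels["disk"] == "ssd" }

	ordered, passed := reorderByPreference(nodes, prefersSSD, rand.Shuffle)
	fmt.Println("nodes passing the preferred check:", passed)
	for _, n := range ordered {
		fmt.Println(n.Name)
	}
}
```

The returned count corresponds to the length of "passChecked" that step 7 inspects.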
Why is this needed?
Currently, for performance reasons, the kube-scheduler follows this scheduling logic:
1. It starts checking nodes at nextStartNodeIndex and stops once enough nodes have passed the Filter plugins. The target number is derived from percentageOfNodesToScore and is at least 100 feasible nodes; clusters with fewer than 100 nodes are checked in full.
2. Then, it applies Score plugins to assign scores to these feasible nodes.
3. Finally, it selects the node with the highest score for scheduling.
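To make the partial traversal concrete, here is a simplified, sequential sketch of the filtering step. It is not the kube-scheduler implementation (which checks nodes in parallel); `findFeasible` and its parameters are hypothetical names, and the target count is passed in directly instead of being computed.

```go
package main

import "fmt"

// findFeasible walks the node list in round-robin order starting at start and
// stops as soon as want nodes have passed the filter or every node has been
// checked once. It returns the feasible nodes and the index at which the next
// scheduling attempt would start.
func findFeasible(nodes []string, start, want int,
	passes func(name string) bool) (feasible []string, nextStart int) {
	if len(nodes) == 0 {
		return nil, 0
	}
	checked := 0
	for ; checked < len(nodes) && len(feasible) < want; checked++ {
		name := nodes[(start+checked)%len(nodes)]
		if passes(name) {
			feasible = append(feasible, name)
		}
	}
	return feasible, (start + checked) % len(nodes)
}

func main() {
	nodes := []string{"n0", "n1", "n2", "n3", "n4", "n5"}
	allPass := func(string) bool { return true }

	// With a target of 2, the first attempt never looks at n2..n5,
	// even if one of them is the node the user prefers.
	feasible, next := findFeasible(nodes, 0, 2, allPass)
	fmt.Println(feasible, next) // prints "[n0 n1] 2"

	feasible, next = findFeasible(nodes, next, 2, allPass)
	fmt.Println(feasible, next) // prints "[n2 n3] 4"
}
```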
However, because each scheduling attempt operates within a partial range and there are multiple Score plugins, this often results in pods not being scheduled onto the nodes users expect.
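The effect of multiple weighted Score plugins can be illustrated with a small sketch. The plugin name "OtherPlugin", the scores, and the weights below are made up for illustration; the only assumption taken from the scheduler is that a node's final score is the sum of each plugin's normalized score multiplied by that plugin's weight.

```go
package main

import "fmt"

// pluginScore holds a hypothetical normalized score (0-100) that one Score
// plugin gave a node, together with the weight configured for that plugin.
type pluginScore struct {
	Plugin string
	Score  int64
	Weight int64
}

// totalScore returns the weighted sum of all plugin scores for one node.
func totalScore(scores []pluginScore) int64 {
	var total int64
	for _, s := range scores {
		total += s.Score * s.Weight
	}
	return total
}

func main() {
	// The node that matches the preferred affinity gets the full
	// NodeAffinity score but a low score from another, heavier plugin.
	preferredNode := []pluginScore{
		{Plugin: "NodeAffinity", Score: 100, Weight: 1},
		{Plugin: "OtherPlugin", Score: 20, Weight: 3},
	}
	otherNode := []pluginScore{
		{Plugin: "NodeAffinity", Score: 0, Weight: 1},
		{Plugin: "OtherPlugin", Score: 90, Weight: 3},
	}
	fmt.Println(totalScore(preferredNode), totalScore(otherNode)) // prints "160 270"
}
```

In this made-up case the non-preferred node wins, even though it does not match the preferred affinity at all.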
If we can add a new extension point to check nodes, then we can prioritize scheduling pods onto the desired nodes.