NoSchedule taint for Cluster object does not work #4952

Open
chaosi-zju opened this issue May 16, 2024 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@chaosi-zju
Member

chaosi-zju commented May 16, 2024

What happened:

I added a NoSchedule taint to the member1 Cluster as follows:

taints:
- effect: NoSchedule
  key: workload-rebalancer-test
  timeAdded: "2024-05-16T12:21:31Z"

Then I created a new Deployment and propagated it to the member1 and member2 clusters with a dynamic-weight policy whose clusterTolerations is defined as:

clusterTolerations:
- effect: NoSchedule
  key: workload-rebalancer-test
  operator: Exists
  tolerationSeconds: 0

Since the member1 cluster has the NoSchedule taint, all replicas should be propagated to the member2 cluster, but the actual result is that replicas were propagated to both member1 and member2.

What you expected to happen:

All replicas should be propagated to the member2 cluster.

How to reproduce it (as minimally and precisely as possible):

1) Add the NoSchedule taint to the member1 cluster:

kubectl --context karmada-apiserver patch cluster member1 --type='json' -p '[{"op": "replace", "path": "/spec/taints", "value": [{"key": "workload-rebalancer-test", "effect": "NoSchedule"}]}]'
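
To confirm the taint landed on the Cluster object, you can read it back (standard kubectl jsonpath against the Cluster CRD's spec.taints field, same as the patch path above):

kubectl --context karmada-apiserver get cluster member1 -o jsonpath='{.spec.taints}'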

2) Write the following YAML to a local file resource.yaml (the command to apply it follows the manifest):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deploy-1
  labels:
    app: test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-deploy-1
  template:
    metadata:
      labels:
        app: demo-deploy-1
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - image: nginx
          name: demo-deploy-1
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: default-pp
spec:
  placement:
    clusterTolerations:
      - effect: NoSchedule
        key: workload-rebalancer-test
        operator: Exists
        tolerationSeconds: 0
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
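
Then apply the manifest to the Karmada control plane (same karmada-apiserver context as above):

kubectl --context karmada-apiserver apply -f resource.yaml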

3) Check the scheduling result in the binding:

I0516 13:04:56.030578       1 event.go:376] "Event occurred" object="default/demo-deploy-1" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScheduleBindingSucceed" message="Binding has been scheduled successfully. Result: {member2:2, member1:1}"
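
The same result can also be read directly from the ResourceBinding's spec.clusters field. Assuming the binding follows the usual <resource-name>-<lowercase-kind> naming (the exact name below is an assumption; check with kubectl get resourcebinding -n default):

kubectl --context karmada-apiserver get resourcebinding demo-deploy-1-deployment -n default -o jsonpath='{.spec.clusters}'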

Anything else we need to know?:

Environment:

  • Karmada version: latest
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version): latest
  • Others:
@chaosi-zju chaosi-zju added the kind/bug Categorizes issue or PR as related to a bug. label May 16, 2024
@chaosi-zju
Member Author

We can take a look together

CC @XiShanYongYe-Chang

@XiShanYongYe-Chang
Member

I agree that this is not the expected behavior.

@dominicqi

I don't quite understand. Doesn't this toleration say the taint is tolerated, so scheduling to member1 is the normal outcome? If we don't declare the toleration, everything will be scheduled to member2.

@chaosi-zju
Member Author

chaosi-zju commented May 23, 2024

Hi @dominicqi

> I don't quite understand. Doesn't this toleration say the taint is tolerated, so scheduling to member1 is the normal outcome? If we don't declare the toleration, everything will be scheduled to member2.

No, the policy is:

clusterTolerations:
- effect: NoSchedule
  key: workload-rebalancer-test
  operator: Exists
  tolerationSeconds: 0

Here, tolerationSeconds: 0 means we do not tolerate the workload-rebalancer-test:NoSchedule taint.

Since cluster member1 has the workload-rebalancer-test:NoSchedule taint and we do not tolerate it, all replicas should be scheduled to member2.

However, replicas are still scheduled to the member1 cluster, which means the taint does not take effect; this is not expected.

@dominicqi

dominicqi commented May 24, 2024

Hi @chaosi-zju

> Hi @dominicqi
>
> > I don't quite understand. Doesn't this toleration say the taint is tolerated, so scheduling to member1 is the normal outcome? If we don't declare the toleration, everything will be scheduled to member2.
>
> No, the policy is:
>
> clusterTolerations:
> - effect: NoSchedule
>   key: workload-rebalancer-test
>   operator: Exists
>   tolerationSeconds: 0
>
> Here, tolerationSeconds: 0 means we do not tolerate the workload-rebalancer-test:NoSchedule taint.
>
> Since cluster member1 has the workload-rebalancer-test:NoSchedule taint and we do not tolerate it, all replicas should be scheduled to member2.
>
> However, replicas are still scheduled to the member1 cluster, which means the taint does not take effect; this is not expected.

I understand what you are saying. What confuses me is whether tolerationSeconds has a special meaning in Karmada, because the Kubernetes source describes it like this:

https://github.com/kubernetes/kubernetes/blob/8361522b40cc8b569efdd6ee2456fa514071cad1/pkg/apis/core/types.go#L3218

> TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint.
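
For reference, the matching logic in k8s.io/api/core/v1 (Toleration.ToleratesTaint) decides a match purely from effect, key, operator and value; TolerationSeconds is never consulted when deciding whether a toleration matches a taint, it only bounds how long a NoExecute taint is tolerated before eviction. A minimal Go sketch paraphrasing that upstream logic (simplified stand-in types, not the actual Kubernetes code):

package main

import "fmt"

// Simplified stand-ins for the corev1 types.
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string
	TolerationSeconds            *int64 // not consulted during matching
}

// toleratesTaint paraphrases Toleration.ToleratesTaint from k8s.io/api/core/v1:
// only effect, key, operator and value take part in the decision.
func toleratesTaint(t Toleration, taint Taint) bool {
	if t.Effect != "" && t.Effect != taint.Effect {
		return false
	}
	if t.Key != "" && t.Key != taint.Key {
		return false
	}
	switch t.Operator {
	case "", "Equal": // empty operator defaults to Equal
		return t.Value == taint.Value
	case "Exists":
		return true
	default:
		return false
	}
}

func main() {
	zero := int64(0)
	taint := Taint{Key: "workload-rebalancer-test", Effect: "NoSchedule"}
	tol := Toleration{Key: "workload-rebalancer-test", Operator: "Exists", Effect: "NoSchedule", TolerationSeconds: &zero}
	// Prints true: tolerationSeconds: 0 does not prevent the toleration from matching.
	fmt.Println(toleratesTaint(tol, taint))
}

So under plain Kubernetes semantics, the toleration above does tolerate the NoSchedule taint regardless of tolerationSeconds; whether Karmada intends different semantics for this field is exactly the question here.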
