GracefulEvictionTasks of ClusterRole never execute #4951

Closed
chaosi-zju opened this issue May 16, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@chaosi-zju
Member

chaosi-zju commented May 16, 2024

What happened:

I propagated a ClusterRole to the member1 and member2 clusters. The bound policy specified clusterTolerations as:

clusterTolerations:
- effect: NoSchedule
  key: workload-rebalancer-test
  operator: Exists
  tolerationSeconds: 0

Then I modified the member1 cluster object and added a NoExecute taint to it:

taints:
- effect: NoExecute
  key: workload-rebalancer-test
  timeAdded: "2024-05-16T12:21:31Z"

Then its binding generated a gracefulEvictionTasks entry:

spec:
  clusters:
  - name: member2
  gracefulEvictionTasks:
  - creationTimestamp: "2024-05-16T12:21:31Z"
    fromCluster: member1
    producer: TaintManager
    reason: TaintUntolerated

However, the gracefulEvictionTasks will never be cleared.

What you expected to happen:

The gracefulEvictionTasks of its binding should be cleared once failover-eviction-timeout has elapsed.

Note: the binding of a Deployment works normally, but the binding of a ClusterRole/ConfigMap does not.

How to reproduce it (as minimally and precisely as possible):

1) Modify the karmada-controller-manager launch parameter failover-eviction-timeout.

$ kubectl --context karmada-host patch deploy karmada-controller-manager -n karmada-system --type='json' -p '[{"op": "replace", "path": "/spec/template/spec/containers/0/command/5", "value": "--failover-eviction-timeout=3s"}]'

2) Write the following ClusterRole YAML to a local file, resource.yaml.

resource.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: demo-role
rules:
  - apiGroups:
      - '*'
    resources:
      - '*'
    verbs:
      - '*'
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: default-pp
spec:
  placement:
    clusterTolerations:
      - effect: NoSchedule
        key: workload-rebalancer-test
        operator: Exists
        tolerationSeconds: 0
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas
  resourceSelectors:
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      name: demo-role

3) Run the following commands in order.

$ kubectl --context karmada-apiserver apply -f resource.yaml
clusterrole.rbac.authorization.k8s.io/demo-role created
clusterpropagationpolicy.policy.karmada.io/default-pp created

$ kubectl --context karmada-apiserver patch cluster member1 --type='json' -p '[{"op": "replace", "path": "/spec/taints", "value": [{"key": "workload-rebalancer-test", "effect": "NoExecute"}]}]'
cluster.cluster.karmada.io/member1 patched

4) Check its ClusterResourceBinding and observe that gracefulEvictionTasks is never removed.

$ kubectl --context karmada-apiserver get crb demo-role-clusterrole -o yaml
...
spec:
  clusters:
  - name: member2
  conflictResolution: Abort
  gracefulEvictionTasks:
  - creationTimestamp: "2024-05-16T12:21:31Z"
    fromCluster: member1
    producer: TaintManager
    reason: TaintUntolerated
....
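Rather than re-running the get command by hand, the check in step 4 could be scripted. The following is only a sketch: the helper name wait_for_eviction_cleared and its 60-second default are hypothetical, and it assumes the same karmada-apiserver context and crb short name used above. It only covers ClusterResourceBindings; a namespaced resource would need its ResourceBinding and namespace instead.

```shell
# Hypothetical helper (not part of Karmada): poll a ClusterResourceBinding
# until spec.gracefulEvictionTasks is empty, or give up after max_wait seconds.
wait_for_eviction_cleared() {
  binding="$1"         # e.g. demo-role-clusterrole
  max_wait="${2:-60}"  # assumed default; pick something above the eviction timeout
  waited=0
  while [ "$waited" -lt "$max_wait" ]; do
    # jsonpath prints nothing when the field is absent from the binding
    tasks=$(kubectl --context karmada-apiserver get crb "$binding" \
      -o jsonpath='{.spec.gracefulEvictionTasks}')
    if [ -z "$tasks" ] || [ "$tasks" = "[]" ]; then
      echo "cleared after ${waited}s"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "still pending after ${max_wait}s: $tasks"
  return 1
}

# Usage (needs a running Karmada control plane):
#   wait_for_eviction_cleared demo-role-clusterrole 30
```

A non-zero exit status means the task was still present when the wait ran out, which is the behavior reported in this issue.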

Anything else we need to know?:

Environment:

  • Karmada version: latest
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version): latest
  • Others:
chaosi-zju added the kind/bug label May 16, 2024
@chaosi-zju
Member Author

CC @XiShanYongYe-Chang

@chaosi-zju
Member Author

Not only ClusterRole but also ConfigMap is affected; you can test ConfigMap with the following YAML:

resource.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-test
  labels:
    app: test
data:
  test-key: "test-value"
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: default-pp
spec:
  placement:
    clusterTolerations:
      - effect: NoSchedule
        key: workload-rebalancer-test
        operator: Exists
        tolerationSeconds: 0
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas
  resourceSelectors:
    - apiVersion: v1
      kind: ConfigMap
      name: demo-test
      namespace: default

@XiShanYongYe-Chang
Member

/assign

@XiShanYongYe-Chang
Member

First of all, the conclusion: this is not a bug.

There are two reasons for the problem described in the issue.

First, we don't have a default InterpretHealth resource interpretation behavior for ClusterRole/ConfigMap resources, so the tasks in gracefulEvictionTasks wait for the timeout.
Second, the timeout period is specified by the graceful-eviction-timeout parameter of karmada-controller-manager, not by failover-eviction-timeout. When I updated it to 10s, the task was cleared after 10s.
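For anyone reproducing this, one way the parameter might be changed is sketched below. The function names are hypothetical; the context, namespace and deployment name are taken from step 1 of the report. The patch appends the flag to the end of the container command, which assumes the flag is not already set there; if it is, check its position with the first helper and use a "replace" patch on that index, as in step 1.

```shell
# Print the current command line of karmada-controller-manager so the
# existing flags (and their positions) can be checked before patching.
show_controller_flags() {
  kubectl --context karmada-host -n karmada-system get deploy karmada-controller-manager \
    -o jsonpath='{.spec.template.spec.containers[0].command}'
}

# Append --graceful-eviction-timeout=<value> to the container command.
# Assumption: the flag is not already present in the command list.
set_graceful_eviction_timeout() {
  timeout="$1"  # a duration string such as 10s
  kubectl --context karmada-host -n karmada-system patch deploy karmada-controller-manager \
    --type='json' \
    -p "[{\"op\": \"add\", \"path\": \"/spec/template/spec/containers/0/command/-\", \"value\": \"--graceful-eviction-timeout=${timeout}\"}]"
}

# Usage (needs access to the karmada-host cluster):
#   show_controller_flags
#   set_graceful_eviction_timeout 10s
```

After the controller-manager restarts with the new value, the gracefulEvictionTasks entry for a ClusterRole or ConfigMap binding should be removed once that period has passed, as described above.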

@XiShanYongYe-Chang
Member

Can we close it now?

@chaosi-zju
Member Author

yes, thanks!


2 participants