kube-scheduler doesn't work properly after reboot #11134

Open
akoken opened this issue Apr 29, 2024 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@akoken

akoken commented Apr 29, 2024

What happened?

I successfully installed a Kubernetes cluster on my RHEL servers. However, kube-scheduler does not work properly after the master node is rebooted: it does not clean up completed jobs and terminated pods. I installed two different clusters, and both have the same issue. The kube-scheduler logs show that it cannot access some resources, even though the system:kube-scheduler ClusterRole looks correct to me.

kube-scheduler logs
I0426 12:28:37.152855       1 serving.go:348] Generated self-signed cert in-memory
W0426 12:28:39.213496       1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system.  Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
W0426 12:28:39.213573       1 authentication.go:368] Error looking up in-cluster authentication configuration: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
W0426 12:28:39.213634       1 authentication.go:369] Continuing without authentication configuration. This may treat all requests as anonymous.
W0426 12:28:39.213663       1 authentication.go:370] To require authentication configuration lookup to succeed, set --authentication-tolerate-lookup-failure=false
I0426 12:28:39.243027       1 server.go:154] "Starting Kubernetes Scheduler" version="v1.28.6"
I0426 12:28:39.243268       1 server.go:156] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0426 12:28:39.245677       1 secure_serving.go:213] Serving securely on [::]:10259
I0426 12:28:39.245853       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0426 12:28:39.245915       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0426 12:28:39.245972       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0426 12:28:39.250049       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
W0426 12:28:39.250066       1 reflector.go:535] pkg/server/dynamiccertificates/configmap_cafile_content.go:206: failed to list *v1.ConfigMap: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot list resource "configmaps" in API group "" in the namespace "kube-system"
E0426 12:28:39.250101       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
E0426 12:28:39.250133       1 reflector.go:147] pkg/server/dynamiccertificates/configmap_cafile_content.go:206: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot list resource "configmaps" in API group "" in the namespace "kube-system"
W0426 12:28:39.253131       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumes" in API group "" at the cluster scope
W0426 12:28:39.253450       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope
E0426 12:28:39.253593       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSINode: failed to list *v1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope
W0426 12:28:39.253458       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:kube-scheduler" cannot list resource "replicasets" in API group "apps" at the cluster scope
E0426 12:28:39.253754       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.ReplicaSet: failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:kube-scheduler" cannot list resource "replicasets" in API group "apps" at the cluster scope
W0426 12:28:39.253273       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
E0426 12:28:39.253807       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
W0426 12:28:39.253292       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:kube-scheduler" cannot list resource "statefulsets" in API group "apps" at the cluster scope
E0426 12:28:39.253848       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:kube-scheduler" cannot list resource "statefulsets" in API group "apps" at the cluster scope
W0426 12:28:39.253307       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:kube-scheduler" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
E0426 12:28:39.253890       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.PodDisruptionBudget: failed to list *v1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:kube-scheduler" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
W0426 12:28:39.253370       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Pod: pods is forbidden: User "system:kube-scheduler" cannot list resource "pods" in API group "" at the cluster scope
E0426 12:28:39.253928       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:kube-scheduler" cannot list resource "pods" in API group "" at the cluster scope
W0426 12:28:39.253386       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
E0426 12:28:39.254033       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIStorageCapacity: failed to list *v1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
W0426 12:28:39.253398       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Namespace: namespaces is forbidden: User "system:kube-scheduler" cannot list resource "namespaces" in API group "" at the cluster scope
E0426 12:28:39.254071       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:kube-scheduler" cannot list resource "namespaces" in API group "" at the cluster scope
W0426 12:28:39.253458       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
E0426 12:28:39.254167       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.ReplicationController: failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
E0426 12:28:39.253473       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.PersistentVolume: failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumes" in API group "" at the cluster scope
W0426 12:28:39.253160       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
E0426 12:28:39.254268       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
W0426 12:28:39.253499       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
E0426 12:28:39.254315       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.PersistentVolumeClaim: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
W0426 12:28:39.254839       1 reflector.go:535] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
E0426 12:28:39.254868       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
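
The denials in the log above repeat per resource, so a condensed view can help when comparing them against the ClusterRole. The following is a minimal sketch, not part of the original report: `summarize_forbidden` is a hypothetical helper name, and the three sample lines are copied from the excerpt above with the timestamps and source paths shortened.

```shell
#!/bin/sh
# Print one "verb resource group" line per distinct denial found on stdin.
# The empty API group "" is shown as "core" for readability.
summarize_forbidden() {
  grep -o 'cannot [a-z]* resource "[^"]*" in API group "[^"]*"' \
    | awk -F'"' '{ split($1, w, " "); print w[2], $2, ($4 == "" ? "core" : $4) }' \
    | LC_ALL=C sort -u
}

# Sample input: three lines taken from the log excerpt (prefixes shortened).
summarize_forbidden <<'EOF'
W0426 authentication.go:368] Error looking up in-cluster authentication configuration: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
W0426 reflector.go:535] failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
E0426 reflector.go:147] Failed to watch *v1.CSINode: failed to list *v1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope
EOF
```

On a live cluster the same function could read the scheduler pod's log from a pipe instead of the sample text.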
system:kube-scheduler
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:kube-scheduler
  uid: 8a2cce65-9058-48a3-b12f-29bab10f403d
  resourceVersion: '103'
  creationTimestamp: '2024-04-04T11:49:51Z'
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: 'true'
  managedFields:
    - manager: kube-apiserver
      operation: Update
      apiVersion: rbac.authorization.k8s.io/v1
      time: '2024-04-04T11:49:51Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:rbac.authorization.kubernetes.io/autoupdate: {}
          f:labels:
            .: {}
            f:kubernetes.io/bootstrapping: {}
        f:rules: {}
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/system:kube-scheduler
rules:
  - verbs:
      - create
      - patch
      - update
    apiGroups:
      - ''
      - events.k8s.io
    resources:
      - events
  - verbs:
      - create
    apiGroups:
      - coordination.k8s.io
    resources:
      - leases
  - verbs:
      - get
      - update
    apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    resourceNames:
      - kube-scheduler
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - nodes
  - verbs:
      - delete
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - pods
  - verbs:
      - create
    apiGroups:
      - ''
    resources:
      - bindings
      - pods/binding
  - verbs:
      - patch
      - update
    apiGroups:
      - ''
    resources:
      - pods/status
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - replicationcontrollers
      - services
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - apps
      - extensions
    resources:
      - replicasets
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - apps
    resources:
      - statefulsets
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - policy
    resources:
      - poddisruptionbudgets
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - persistentvolumeclaims
      - persistentvolumes
  - verbs:
      - create
    apiGroups:
      - authentication.k8s.io
    resources:
      - tokenreviews
  - verbs:
      - create
    apiGroups:
      - authorization.k8s.io
    resources:
      - subjectaccessreviews
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - storage.k8s.io
    resources:
      - csinodes
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - namespaces
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - storage.k8s.io
    resources:
      - csidrivers
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - storage.k8s.io
    resources:
      - csistoragecapacities
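
The ClusterRole above only lists what is allowed once it is bound to the user. One way to ask the API server what it currently grants to `system:kube-scheduler` is `kubectl auth can-i` with impersonation. This is a sketch under assumptions: the function name and the `KUBECTL` override are hypothetical, the resource list is taken from the denials in the log, and the caller needs permission to impersonate users (for example a cluster-admin kubeconfig).

```shell
#!/bin/sh
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands
# instead of running them against a cluster.
KUBECTL="${KUBECTL:-kubectl}"
SCHEDULER_USER="system:kube-scheduler"

# Ask the API server whether the scheduler user may list each resource
# that appears in the "forbidden" log lines.
check_scheduler_access() {
  for resource in nodes pods namespaces services persistentvolumes \
                  replicasets.apps statefulsets.apps csinodes.storage.k8s.io; do
    printf '%s: ' "$resource"
    # can-i exits non-zero on "no"; keep going so every resource is reported.
    "$KUBECTL" auth can-i list "$resource" --as="$SCHEDULER_USER" || true
  done
}
```

Calling `check_scheduler_access` shortly after a reboot and again a few minutes later would show whether the denials are persistent or limited to startup.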

What did you expect to happen?

kube-scheduler should work properly.

How can we reproduce it (as minimally and precisely as possible)?

git checkout v2.24.1
docker pull quay.io/kubespray/kubespray:v2.24.1
docker run --rm -it -v "$(pwd)/inventory:/inventory" quay.io/kubespray/kubespray:v2.24.1 bash

ansible-playbook -i /inventory/prod/inventory.ini --diff --become cluster.yml -e kube_version=v1.28.6

OS

NAME="Red Hat Enterprise Linux"
VERSION="9.3 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux 9.3 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL=https://www.redhat.com/
DOCUMENTATION_URL=https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9
BUG_REPORT_URL=https://bugzilla.redhat.com/

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.3
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"

Version of Ansible

version in quay.io/kubespray/kubespray:v2.24.1

ansible [core 2.15.8]
config file = /kubespray/ansible.cfg
configured module search path = ['/kubespray/library']
ansible python module location = /usr/local/lib/python3.10/dist-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (/usr/bin/python3)
jinja version = 3.1.2
libyaml = True

Version of Python

version in quay.io/kubespray/kubespray:v2.24.1

Version of Kubespray (commit)

v2.24.1

Network plugin used

cilium

Full inventory with variables

Command used to invoke ansible

ansible-playbook -i /inventory/prod/inventory.ini --diff --become cluster.yml -e kube_version=v1.28.6

Output of ansible run

Anything else we need to know

No response

@akoken akoken added the kind/bug Categorizes issue or PR as related to a bug. label Apr 29, 2024
@hcank

hcank commented Apr 29, 2024

I have the same error as well

@alperbasay

I have the same error in the same situation.

@wandersonlima

Would you please run the following commands:

kubectl get cm extension-apiserver-authentication -n kube-system
kubectl describe role extension-apiserver-authentication-reader -n kube-system
kubectl describe rolebindings.rbac.authorization.k8s.io system::extension-apiserver-authentication-reader -n kube-system

@akoken
Author

akoken commented Apr 29, 2024

Hi @wandersonlima

Sure! Here are the outputs:

>kubectl get cm extension-apiserver-authentication -n kube-system
NAME                                 DATA   AGE
extension-apiserver-authentication   6      25d
 
>kubectl describe role extension-apiserver-authentication-reader -n kube-system
Name:         extension-apiserver-authentication-reader
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources   Non-Resource URLs  Resource Names                        Verbs
  ---------   -----------------  --------------                        -----
  configmaps  []                 [extension-apiserver-authentication]  [get list watch]
 
>kubectl describe rolebindings.rbac.authorization.k8s.io system::extension-apiserver-authentication-reader -n kube-system
Name:         system::extension-apiserver-authentication-reader
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
Role:
  Kind:  Role
  Name:  extension-apiserver-authentication-reader
Subjects:
  Kind  Name                            Namespace
  ----  ----                            ---------
  User  system:kube-controller-manager
  User  system:kube-scheduler
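
The Role and RoleBinding above only cover the `extension-apiserver-authentication` ConfigMap in `kube-system`. The cluster-scope denials in the log (nodes, pods, storageclasses and so on) depend on ClusterRoleBindings instead. The sketch below assumes the default bootstrap binding names `system:kube-scheduler` and `system:volume-scheduler`; the function name and the `KUBECTL` override are hypothetical.

```shell
#!/bin/sh
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands
# instead of running them against a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Show the cluster-scoped bindings that are expected to list
# User system:kube-scheduler as a subject in a default bootstrap.
describe_scheduler_cluster_bindings() {
  for binding in system:kube-scheduler system:volume-scheduler; do
    "$KUBECTL" describe clusterrolebinding "$binding"
  done
}
```

If either binding is missing or does not list `User system:kube-scheduler` under Subjects, that would be consistent with the cluster-scope errors; if both look correct, the denials may instead be limited to the startup window.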
