k8s_info: wait_condition doesn't work if multiple nodes are not available #697

Open
geetikakay opened this issue Apr 15, 2024 · 0 comments
SUMMARY

The k8s_info module's wait_condition works as expected when all nodes are up, but when some nodes are down and we would like to wait for them to become Ready, it does not work as expected: the task fails without returning any node information.
$ oc get nodes
NAME             STATUS     ROLES                  AGE    VERSION
master-0         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
master-1         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
master-2         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
worker-0-5fpfk   NotReady   worker                 6d3h   v1.28.7+f1b5f6c
worker-0-c694j   Ready      worker                 6d3h   v1.28.7+f1b5f6c
worker-0-r99dx   NotReady   worker                 6d3h   v1.28.7+f1b5f6c

- name: Wait for each node to be ready
  kubernetes.core.k8s_info:
    kind: Node
    wait: yes
    wait_condition:
      reason: KubeletReady
      type: Ready
      status: True
    wait_timeout: 30
    wait_sleep: 10

The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_grhofqoz/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 211, in main
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_grhofqoz/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 173, in execute_module
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_grhofqoz/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/service.py", line 304, in find
    raise CoreException(
ansible_collections.kubernetes.core.plugins.module_utils.k8s.exceptions.CoreException: Failed to gather information about Node(s) even after waiting for 30 seconds
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "field_selectors": [],
            "host": null,
            "impersonate_groups": null,
            "impersonate_user": null,
            "kind": "Node",
            "kubeconfig": null,
            "label_selectors": [],
            "name": null,
            "namespace": null,
            "no_proxy": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "proxy_headers": null,
            "username": null,
            "validate_certs": null,
            "wait": true,
            "wait_condition": {
                "reason": "KubeletReady",
                "status": true,
                "type": "Ready"
            },
            "wait_sleep": 10,
            "wait_timeout": 30
        }
    },
    "msg": "Failed to gather information about Node(s) even after waiting for 30 seconds"
}
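
For comparison, here is a rough workaround sketch (my own illustration, not part of the module; the task names and the debug step are hypothetical): querying without wait does return results even when nodes are NotReady, and the Ready status can then be checked from status.conditions:

- name: Gather node info without waiting, so results come back even when nodes are NotReady
  kubernetes.core.k8s_info:
    kind: Node
  register: node_info

- name: Report nodes whose Ready condition is not True
  ansible.builtin.debug:
    msg: "{{ item.metadata.name }} is not Ready"
  loop: "{{ node_info.resources }}"
  loop_control:
    label: "{{ item.metadata.name }}"
  # A node counts as not Ready when no condition of type Ready has status "True"
  when: >-
    item.status.conditions | default([])
    | selectattr('type', 'equalto', 'Ready')
    | selectattr('status', 'equalto', 'True')
    | list | length == 0

This is exactly the kind of "useful information" the failing wait gives no access to.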

ISSUE TYPE
  • Bug Report
COMPONENT NAME

kubernetes.core.k8s_info

ANSIBLE VERSION
$ ansible --version
ansible [core 2.15.9]
  config file = None
  configured module search path = ['/home/cloud-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/cloud-user/.local/lib/python3.9/site-packages/ansible
  ansible collection location = /home/cloud-user/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/cloud-user/.local/bin/ansible
  python version = 3.9.18 (main, Jan  4 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] (/usr/bin/python)
  jinja version = 3.1.3
  libyaml = True
COLLECTION VERSION
$ ansible-galaxy --version
ansible-galaxy [core 2.15.9]
  config file = None
  configured module search path = ['/home/cloud-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/cloud-user/.local/lib/python3.9/site-packages/ansible
  ansible collection location = /home/cloud-user/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/cloud-user/.local/bin/ansible-galaxy
  python version = 3.9.18 (main, Jan  4 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] (/usr/bin/python)
  jinja version = 3.1.3
  libyaml = True

$ ansible-galaxy collection list  kubernetes.core

# /home/cloud-user/.ansible/collections/ansible_collections
Collection      Version
--------------- -------
kubernetes.core 3.0.1  

# /home/cloud-user/.local/lib/python3.9/site-packages/ansible_collections
Collection      Version
--------------- -------
kubernetes.core 2.4.0  

CONFIGURATION
$ ansible-config dump --only-changed
CONFIG_FILE() = None
OS / ENVIRONMENT

$ cat /etc/os-release
NAME="CentOS Stream"
VERSION="9"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="9"
PLATFORM_ID="platform:el9"
PRETTY_NAME="CentOS Stream 9"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:centos:centos:9"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

STEPS TO REPRODUCE
  1. Make sure two or more nodes are NotReady:
    $ oc get nodes
    NAME             STATUS     ROLES                  AGE    VERSION
    master-0         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
    master-1         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
    master-2         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
    worker-0-5fpfk   NotReady   worker                 6d3h   v1.28.7+f1b5f6c
    worker-0-c694j   Ready      worker                 6d3h   v1.28.7+f1b5f6c
    worker-0-r99dx   NotReady   worker                 6d3h   v1.28.7+f1b5f6c
  2. Run the following task (the same failure also occurs with wait_timeout: 30 and wait_sleep: 10; shown here with a short timeout and ignore_errors so the run continues):
- name: Wait for each node to be ready
  kubernetes.core.k8s_info:
    kind: Node
    wait: yes
    wait_condition:
      reason: KubeletReady
      type: Ready
      status: True
    wait_timeout: 3
    wait_sleep: 1
  ignore_errors: True
EXPECTED RESULTS

Expected the module to work and to return some useful information, for example which nodes are NotReady, instead of only showing "msg": "Failed to gather information about Node(s) even after waiting for 3 seconds".

ACTUAL RESULTS

I also tried a longer wait time, but the module never returns the node information; it never reports which node is not ready, nor any other useful data.

The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_5469sl6c/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 223, in main
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_5469sl6c/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 183, in execute_module
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_5469sl6c/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/service.py", line 326, in find
    raise CoreException(
ansible_collections.kubernetes.core.plugins.module_utils.k8s.exceptions.CoreException: Failed to gather information about Node(s) even after waiting for 3 seconds
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "field_selectors": [],
            "hidden_fields": null,
            "host": null,
            "impersonate_groups": null,
            "impersonate_user": null,
            "kind": "Node",
            "kubeconfig": null,
            "label_selectors": [],
            "name": null,
            "namespace": null,
            "no_proxy": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "proxy_headers": null,
            "username": null,
            "validate_certs": null,
            "wait": true,
            "wait_condition": {
                "reason": "KubeletReady",
                "status": true,
                "type": "Ready"
            },
            "wait_sleep": 1,
            "wait_timeout": 3
        }
    },
    "msg": "Failed to gather information about Node(s) even after waiting for 3 seconds"
}
...ignoring
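
A possible per-node variant (again just a sketch of my own; it assumes wait and wait_condition behave correctly when a single Node is requested by name) that would at least identify which node timed out:

- name: Gather current node names without waiting
  kubernetes.core.k8s_info:
    kind: Node
  register: all_nodes

- name: Wait for each node individually so a failure names the node
  kubernetes.core.k8s_info:
    kind: Node
    name: "{{ item.metadata.name }}"
    wait: yes
    wait_condition:
      type: Ready
      status: True
    wait_timeout: 30
    wait_sleep: 10
  loop: "{{ all_nodes.resources }}"
  loop_control:
    # Show the node name in task output instead of the whole resource
    label: "{{ item.metadata.name }}"
  ignore_errors: true

Even so, it would be far better if the aggregate wait itself returned the partial results on timeout.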
