k8s_info: wait_condition doesn't work if multiple nodes are not available #697

Open
geetikakay opened this issue Apr 15, 2024 · 0 comments
SUMMARY

The k8s_info module's wait_condition works as expected when all nodes are up, but when some nodes are down and we would like to wait for them to become Ready, it does not work as expected: the task fails without returning any node information.
$ oc get nodes
NAME             STATUS     ROLES                  AGE    VERSION
master-0         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
master-1         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
master-2         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
worker-0-5fpfk   NotReady   worker                 6d3h   v1.28.7+f1b5f6c
worker-0-c694j   Ready      worker                 6d3h   v1.28.7+f1b5f6c
worker-0-r99dx   NotReady   worker                 6d3h   v1.28.7+f1b5f6c

- name: Wait for each node to be ready
  kubernetes.core.k8s_info:
    kind: Node
    wait: yes
    wait_condition:
      reason: KubeletReady
      type: Ready
      status: True
    wait_timeout: 30
    wait_sleep: 10

The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_grhofqoz/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 211, in main
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_grhofqoz/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 173, in execute_module
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_grhofqoz/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/service.py", line 304, in find
    raise CoreException(
ansible_collections.kubernetes.core.plugins.module_utils.k8s.exceptions.CoreException: Failed to gather information about Node(s) even after waiting for 30 seconds
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "field_selectors": [],
            "host": null,
            "impersonate_groups": null,
            "impersonate_user": null,
            "kind": "Node",
            "kubeconfig": null,
            "label_selectors": [],
            "name": null,
            "namespace": null,
            "no_proxy": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "proxy_headers": null,
            "username": null,
            "validate_certs": null,
            "wait": true,
            "wait_condition": {
                "reason": "KubeletReady",
                "status": true,
                "type": "Ready"
            },
            "wait_sleep": 10,
            "wait_timeout": 30
        }
    },
    "msg": "Failed to gather information about Node(s) even after waiting for 30 seconds"
}
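
For comparison, here is a rough workaround sketch (my own illustration, not part of the module; the task names and the debug step are hypothetical): querying without wait does return results even when nodes are NotReady, and the Ready status can then be checked from status.conditions:

- name: Gather node info without waiting, so results come back even when nodes are NotReady
  kubernetes.core.k8s_info:
    kind: Node
  register: node_info

- name: Report nodes whose Ready condition is not True
  ansible.builtin.debug:
    msg: "{{ item.metadata.name }} is not Ready"
  loop: "{{ node_info.resources }}"
  loop_control:
    label: "{{ item.metadata.name }}"
  # A node counts as not Ready when no condition of type Ready has status "True"
  when: >-
    item.status.conditions | default([])
    | selectattr('type', 'equalto', 'Ready')
    | selectattr('status', 'equalto', 'True')
    | list | length == 0

This is exactly the kind of "useful information" the failing wait gives no access to.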

ISSUE TYPE
  • Bug Report
COMPONENT NAME

kubernetes.core.k8s_info

ANSIBLE VERSION
$ ansible --version
ansible [core 2.15.9]
  config file = None
  configured module search path = ['/home/cloud-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/cloud-user/.local/lib/python3.9/site-packages/ansible
  ansible collection location = /home/cloud-user/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/cloud-user/.local/bin/ansible
  python version = 3.9.18 (main, Jan  4 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] (/usr/bin/python)
  jinja version = 3.1.3
  libyaml = True
COLLECTION VERSION
$ ansible-galaxy --version
ansible-galaxy [core 2.15.9]
  config file = None
  configured module search path = ['/home/cloud-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/cloud-user/.local/lib/python3.9/site-packages/ansible
  ansible collection location = /home/cloud-user/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/cloud-user/.local/bin/ansible-galaxy
  python version = 3.9.18 (main, Jan  4 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] (/usr/bin/python)
  jinja version = 3.1.3
  libyaml = True

$ ansible-galaxy collection list  kubernetes.core

# /home/cloud-user/.ansible/collections/ansible_collections
Collection      Version
--------------- -------
kubernetes.core 3.0.1  

# /home/cloud-user/.local/lib/python3.9/site-packages/ansible_collections
Collection      Version
--------------- -------
kubernetes.core 2.4.0  

CONFIGURATION
$ ansible-config dump --only-changed
CONFIG_FILE() = None
OS / ENVIRONMENT

$ cat /etc/os-release
NAME="CentOS Stream"
VERSION="9"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="9"
PLATFORM_ID="platform:el9"
PRETTY_NAME="CentOS Stream 9"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:centos:centos:9"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

STEPS TO REPRODUCE
  1. Make sure two or more nodes are NotReady:
    $ oc get nodes
    NAME             STATUS     ROLES                  AGE    VERSION
    master-0         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
    master-1         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
    master-2         Ready      control-plane,master   6d4h   v1.28.7+f1b5f6c
    worker-0-5fpfk   NotReady   worker                 6d3h   v1.28.7+f1b5f6c
    worker-0-c694j   Ready      worker                 6d3h   v1.28.7+f1b5f6c
    worker-0-r99dx   NotReady   worker                 6d3h   v1.28.7+f1b5f6c
  2. Run the following task (the same failure also occurs with wait_timeout: 30 and wait_sleep: 10; shown here with a short timeout and ignore_errors so the run continues):
- name: Wait for each node to be ready
  kubernetes.core.k8s_info:
    kind: Node
    wait: yes
    wait_condition:
      reason: KubeletReady
      type: Ready
      status: True
    wait_timeout: 3
    wait_sleep: 1
  ignore_errors: True
EXPECTED RESULTS

Expected the module to work and to return some useful information, for example which nodes are NotReady, instead of only showing "msg": "Failed to gather information about Node(s) even after waiting for 3 seconds".

ACTUAL RESULTS

I also tried a longer wait time, but the module never returns the node information; it never reports which node is not ready, nor any other useful data.

The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_5469sl6c/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 223, in main
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_5469sl6c/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_info.py", line 183, in execute_module
  File "/tmp/ansible_kubernetes.core.k8s_info_payload_5469sl6c/ansible_kubernetes.core.k8s_info_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/service.py", line 326, in find
    raise CoreException(
ansible_collections.kubernetes.core.plugins.module_utils.k8s.exceptions.CoreException: Failed to gather information about Node(s) even after waiting for 3 seconds
fatal: [127.0.0.1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "field_selectors": [],
            "hidden_fields": null,
            "host": null,
            "impersonate_groups": null,
            "impersonate_user": null,
            "kind": "Node",
            "kubeconfig": null,
            "label_selectors": [],
            "name": null,
            "namespace": null,
            "no_proxy": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "proxy_headers": null,
            "username": null,
            "validate_certs": null,
            "wait": true,
            "wait_condition": {
                "reason": "KubeletReady",
                "status": true,
                "type": "Ready"
            },
            "wait_sleep": 1,
            "wait_timeout": 3
        }
    },
    "msg": "Failed to gather information about Node(s) even after waiting for 3 seconds"
}
...ignoring
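
A possible per-node variant (again just a sketch of my own; it assumes wait and wait_condition behave correctly when a single Node is requested by name) that would at least identify which node timed out:

- name: Gather current node names without waiting
  kubernetes.core.k8s_info:
    kind: Node
  register: all_nodes

- name: Wait for each node individually so a failure names the node
  kubernetes.core.k8s_info:
    kind: Node
    name: "{{ item.metadata.name }}"
    wait: yes
    wait_condition:
      type: Ready
      status: True
    wait_timeout: 30
    wait_sleep: 10
  loop: "{{ all_nodes.resources }}"
  loop_control:
    # Show the node name in task output instead of the whole resource
    label: "{{ item.metadata.name }}"
  ignore_errors: true

Even so, it would be far better if the aggregate wait itself returned the partial results on timeout.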
