can't reconnect after reboot #361

Open
olwins opened this issue Apr 8, 2024 · 2 comments

olwins commented Apr 8, 2024

Hi

I have a playbook that patches a remote server. It works without issue when started manually with ansible-playbook.

But when it is run from Rundeck on the same server, the playbook hangs in the reboot task every time.

Ansible playbook (this task alone is enough to reproduce the problem):

- name: Reboot the server 
  ansible.builtin.reboot:
    msg: "Reboot initiated by Ansible for linux patching"
    connect_timeout: 20 
    reboot_timeout: 900
    pre_reboot_delay: 10
    post_reboot_delay: 30
    test_command: uptime

It looks like it is not able to reconnect properly after the reboot:

ansible.builtin.reboot: attempting to get system boot time
sending connection check: [b'ssh', b'-C', b'-o', b'ControlMaster=auto', b'-o', b'ControlPersist=60s', b'-o', b'IdentityFile="/tmp/rundeck/ansible-runner1205336585210018520id_rsa"', b'-o', b'KbdInteractiveAuthentication=no', b'-o', b'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', b'-o', b'PasswordAuthentication=no', b'-o', b'User="itansible"', b'-o', b'ConnectTimeout=10', b'-o', b'StrictHostKeyChecking=accept-new', b'-o', b'ServerAliveInterval=30', b'-o', b'ControlPath="/var/lib/rundeck/.ansible/cp/f4829f47ff"', b'-O', b'check', b'testserver']
No connection to reset: Control socket connect(/var/lib/rundeck/.ansible/cp/f4829f47ff): No such file or directory
<testserver> ESTABLISH SSH CONNECTION FOR USER: itansible
<testserver> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/tmp/rundeck/ansible-runner1205336585210018520id_rsa"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="itansible"' -o ConnectTimeout=10 -o StrictHostKeyChecking=accept-new -o ServerAliveInterval=30 -o 'ControlPath="/var/lib/rundeck/.ansible/cp/f4829f47ff"' -tt testserver '/bin/sh -c '"'"'sudo -H -S -p "[sudo via ansible, key=ylstsxfvxidltvukzexjnjqasejqauun] password:" -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-ylstsxfvxidltvukzexjnjqasejqauun ; cat /proc/sys/kernel/random/boot_id'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<testserver> (255, b'', b'itansible@testserver: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).\r\n')

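The connection-check line in the log above is long and hard to scan. As a reading aid only, here is a minimal Python sketch that pulls the `-o` options out of an abbreviated copy of that `SSH: EXEC` line. The helper name `ssh_options` is hypothetical and the remote command has been left out of the string.

```python
import shlex

# Abbreviated copy of the "SSH: EXEC" line from the log above (remote command omitted).
SSH_EXEC = (
    "ssh -C -o ControlMaster=auto -o ControlPersist=60s "
    "-o 'IdentityFile=\"/tmp/rundeck/ansible-runner1205336585210018520id_rsa\"' "
    "-o KbdInteractiveAuthentication=no "
    "-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey "
    "-o PasswordAuthentication=no -o 'User=\"itansible\"' -o ConnectTimeout=10 "
    "-o StrictHostKeyChecking=accept-new -o ServerAliveInterval=30 "
    "-o 'ControlPath=\"/var/lib/rundeck/.ansible/cp/f4829f47ff\"' -tt testserver"
)


def ssh_options(command: str) -> dict:
    """Return the `-o key=value` options of an ssh command line as a dict."""
    tokens = shlex.split(command)
    options = {}
    # Walk over adjacent token pairs and keep the value that follows each "-o".
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-o" and "=" in value:
            key, _, val = value.partition("=")
            options[key] = val.strip('"')
    return options


opts = ssh_options(SSH_EXEC)
for key in ("IdentityFile", "PreferredAuthentications", "PasswordAuthentication"):
    print(f"{key} = {opts[key]}")
```

Read this way, the log shows a temporary identity file under /tmp/rundeck and password authentication switched off, so the reconnect depends on key-based (or GSSAPI/host-based) authentication succeeding.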
It retries every 30 to 40 seconds, always with the same error.
After a while, only one additional line is logged, and the control socket seems to have been removed as well:
No connection to reset: Control socket connect(/var/lib/rundeck/.ansible/cp/f4829f47ff): No such file or directory

In Rundeck, Ansible is configured to use an SSH key with a passphrase (stored in the vault), plus a root password that is also stored in the vault:

project.ansible-become-method=sudo
project.ansible-become-password-storage-path=keys/project/TEST_PATCHING_LINUX/password-itansible-root
project.ansible-become=true
project.ansible-binaries-dir-path=/opt/ansible/.venv/bin
project.ansible-config-file-path=/opt/ansible/ansible.cfg
project.ansible-executable=/bin/bash
project.ansible-generate-inventory=true
project.ansible-ssh-auth-type=privateKey
project.ansible-ssh-keypath=/var/lib/rundeck/.ssh/id_ed25519
project.ansible-ssh-passphrase-option=option.password
project.ansible-ssh-passphrase-storage-path=keys/project/TEST_PATCHING_LINUX/Pass_itmasteransible
project.ansible-ssh-use-agent=true
project.ansible-ssh-user=itansible

I tried modifying a few SSH settings, but it didn't change anything.

olwins commented Apr 8, 2024

Edit: It works if I redefine all the project variables at the job level.

Maybe something is lost during the retry?

Job that hung:

"configuration" : {
"ansible-base-dir-path" : "/opt/ansible",
"ansible-become" : "true",
"ansible-become-method" : "sudo",
"ansible-become-password-storage-path" : "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
"ansible-playbook" : "run_patching.yml",
"ansible-ssh-passphrase-option" : "option.password",
"ansible-ssh-use-agent" : "false"
},

Job that succeeded (basically, I manually set the same values as the ones defined at the project level):

"configuration" : {
"ansible-base-dir-path" : "/opt/ansible",
"ansible-become" : "true",
"ansible-become-method" : "sudo",
"ansible-become-password-storage-path" : "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
"ansible-playbook" : "run_patching.yml",
"ansible-ssh-auth-type" : "privateKey",
"ansible-ssh-keypath" : "/var/lib/rundeck/.ssh/id_ed25519",
"ansible-ssh-passphrase-option" : "option.password",
"ansible-ssh-passphrase-storage-path" : "keys/project/TEST_PATCHING_LINUX/Pass_itmasteransible",
"ansible-ssh-use-agent" : "true",
"ansible-ssh-user" : "itansible"
},
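
To narrow down which setting matters, the two job configurations above can be compared key by key. A minimal Python sketch, using the values exactly as pasted above; the function name `diff_config` is just for illustration:

```python
# Configuration of the job that hung, as pasted above.
hung = {
    "ansible-base-dir-path": "/opt/ansible",
    "ansible-become": "true",
    "ansible-become-method": "sudo",
    "ansible-become-password-storage-path": "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
    "ansible-playbook": "run_patching.yml",
    "ansible-ssh-passphrase-option": "option.password",
    "ansible-ssh-use-agent": "false",
}

# Configuration of the job that succeeded: the hung job plus the SSH settings.
succeeded = dict(
    hung,
    **{
        "ansible-ssh-auth-type": "privateKey",
        "ansible-ssh-keypath": "/var/lib/rundeck/.ssh/id_ed25519",
        "ansible-ssh-passphrase-storage-path": "keys/project/TEST_PATCHING_LINUX/Pass_itmasteransible",
        "ansible-ssh-use-agent": "true",
        "ansible-ssh-user": "itansible",
    },
)


def diff_config(a: dict, b: dict):
    """Return (keys only present in b, keys present in both with different values)."""
    missing = sorted(set(b) - set(a))
    changed = {k: (a[k], b[k]) for k in sorted(set(a) & set(b)) if a[k] != b[k]}
    return missing, changed


missing, changed = diff_config(hung, succeeded)
print("only in succeeded job:", missing)
print("different values:", changed)
```

Only one key that exists in both jobs has a different value: `ansible-ssh-use-agent` ("false" in the hung job, "true" in the one that succeeded). The other four differences are keys that the hung job does not set at all.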

olwins commented Apr 8, 2024

Root cause found.

I thought that ansible-ssh-use-agent would default to the value defined at the project level (true in my case).
But when I create a new job, it is automatically set to false:

    "ansible-ssh-use-agent" : "false"

Setting these values on the job is enough 👍

"ansible-base-dir-path" : "/opt/ansible",
"ansible-become" : "true",
"ansible-become-password-storage-path" : "keys/project/TEST_PATCHING_LINUX/password-itansible-root",
"ansible-playbook" : "test_patching.yaml",
"ansible-ssh-passphrase-option" : "option.password",
"ansible-ssh-use-agent" : "true"

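The behaviour described above is consistent with a job-level value silently overriding the project-level one. A small Python sketch of that idea follows. The precedence rule (job keys win over `project.`-prefixed keys) is an assumption inferred from this thread, not something confirmed from the plugin source, and `effective_config` and `shadowed` are hypothetical helper names.

```python
# Project-level values, taken from the project properties quoted earlier in this issue.
PROJECT = {
    "project.ansible-ssh-auth-type": "privateKey",
    "project.ansible-ssh-keypath": "/var/lib/rundeck/.ssh/id_ed25519",
    "project.ansible-ssh-use-agent": "true",
    "project.ansible-ssh-user": "itansible",
}

# Job-level values of a newly created job (note the explicit "false").
JOB = {
    "ansible-become": "true",
    "ansible-playbook": "run_patching.yml",
    "ansible-ssh-use-agent": "false",
}


def _strip_prefix(project: dict) -> dict:
    """Drop the "project." prefix so project and job keys can be compared."""
    return {k.removeprefix("project."): v for k, v in project.items()}


def effective_config(project: dict, job: dict) -> dict:
    """ASSUMPTION: job-level keys take precedence over project-level keys."""
    merged = _strip_prefix(project)
    merged.update(job)
    return merged


def shadowed(project: dict, job: dict) -> dict:
    """Job keys whose value differs from the project-level value they override."""
    base = _strip_prefix(project)
    return {k: (base[k], v) for k, v in job.items() if k in base and base[k] != v}


print(effective_config(PROJECT, JOB)["ansible-ssh-use-agent"])
print(shadowed(PROJECT, JOB))
```

Under that assumption, the key and user still come from the project, but the agent setting ends up as "false" for the job. Listing shadowed keys like this is one way to spot a job default that quietly disagrees with the project.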