[Satellite 6] Running an Ansible job on multiple hosts succeeds, while running it on a single host fails
Environment
- Satellite 6.5
- Satellite Capsule
- Ansible
Issue
In Satellite, a non-load-balanced Ansible job uses the first capsule in the list when no capsule is selected for remote execution in the subnet of the target host. The Ansible job is executed through that capsule.
The Ansible job fails with the following output:
PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
fatal: [ansible.host.test]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Could not create directory '/usr/share/foreman-proxy/.ssh'.\r\nFailed to add the host to the list of known hosts (/usr/share/foreman-proxy/.ssh/known_hosts).\r\nno such identity: /usr/share/foreman-proxy/.ssh/id_rsa_foreman_proxy: No such file or directory\r\nPermission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}
to retry, use: --limit @/tmp/foreman-playbook-5b88b4ed-5857-49e1-bdbc-d47c8a1fb411.retry

PLAY RECAP *********************************************************************
ansible.host.test : ok=0 changed=0 unreachable=1 failed=0
Exit status: 4
Resolution
Enable the remote execution feature on the capsule:
satellite-installer --scenario capsule --enable-foreman-proxy-plugin-remote-execution-ssh
Or select a capsule for remote execution in the subnet:
Infrastructure -> Subnets -> select the subnet -> Remote Execution tab -> choose one or more capsules from the list -> Submit
For more KB articles and solutions related to Red Hat Satellite 6.x remote execution issues, refer to the Red Hat Satellite Consolidated Troubleshooting Article for Red Hat Satellite 6.x Remote Execution Issues.
Root Cause
- Remote execution was not enabled on the capsule.
- No capsule was selected for remote execution in the subnet.
- No subnet was selected for the host.
- If no subnet is selected, or the subnet's list of selected capsules is empty, Ansible execution always takes the first capsule from the list of available capsules. It does not check which features are enabled on that capsule.
- If the search query matches multiple hosts, Ansible jobs are load-balanced between capsules. Again, when no capsule or subnet is selected, the job is load-balanced across all available capsules, and it may succeed if the remote execution feature is enabled on one of them (and that capsule has a network connection to the host).
- The fact that Ansible chooses a capsule that does not have remote execution enabled is a bug, tracked in Bugzilla.
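The selection behavior described in the list above can be modeled with a short Python sketch. This is a simplified illustration, not Satellite's actual code: the `Capsule` class, the `REX` feature label, both function names, and the round-robin distribution are hypothetical and chosen only to show why a single-host job can land on a capsule without remote execution while a multi-host job may still succeed.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Capsule:
    """Hypothetical stand-in for a capsule and its enabled features."""
    name: str
    features: set = field(default_factory=set)


REX = "remote_execution_ssh"  # hypothetical feature label, for illustration only


def pick_for_single_host(subnet_capsules, available_capsules):
    """Non-load-balanced path: no feature check is performed."""
    if subnet_capsules:
        # Assumption: a capsule selected in the subnet is preferred.
        return subnet_capsules[0]
    # No subnet, or an empty selection: take the first available capsule.
    return available_capsules[0]


def balance_across_hosts(hosts, subnet_capsules, available_capsules):
    """Load-balanced path, modeled here as a simple round robin (assumption)."""
    pool = subnet_capsules or available_capsules
    return dict(zip(hosts, cycle(pool)))


capsule = Capsule("capsule.ansible.test")             # remote execution not enabled
satellite = Capsule("satellite.ansible.test", {REX})  # remote execution enabled
available = [capsule, satellite]

chosen = pick_for_single_host([], available)
print(chosen.name, REX in chosen.features)

assignment = balance_across_hosts(["host1", "host2"], [], available)
print({host: c.name for host, c in assignment.items()})
```

In this model the single-host job is always routed to the first capsule in the list even though it lacks the feature, while the multi-host job reaches a capsule that has remote execution enabled for at least some hosts.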
Diagnostic Steps
Using a search query that matches multiple hosts -> load-balanced
|\n date\n register: out\n - debug: var=out", "execution_timeout_interval"=>nil, "connection_options"=>{"retry_interval"=>15, "retry_count"=>4}, "proxy_url"=>"https://satellite.ansible.test:9090", "proxy_action_name"=>"ForemanRemoteExecutionCore::Actions::RunScript", "locale"=>"en", "current_user_id"=>3}
Using a search query that matches a single host -> non-load-balanced
|\n date\n register: out\n - debug: var=out", "execution_timeout_interval"=>nil, "connection_options"=>{"retry_interval"=>15, "retry_count"=>4}, "proxy_url"=>"https://capsule.ansible.test:9090", "proxy_action_name"=>"ForemanRemoteExecutionCore::Actions::RunScript", "locale"=>"en", "current_user_id"=>3}
Notice that the proxy_url value for the failing host points to "capsule.ansible.test".
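When comparing several task inputs, it can be easier to pull out the proxy_url value programmatically. The following is a minimal Python sketch that assumes the task input has been copied as text in the Ruby hash format shown above; the function name `extract_proxy_url` and the sample string are illustrative and not part of any Satellite tooling.

```python
import re

# Matches the "proxy_url"=>"..." pair in a Ruby-style hash dump.
PROXY_URL_RE = re.compile(r'"proxy_url"=>"([^"]+)"')


def extract_proxy_url(task_input):
    """Return the proxy_url found in the task input text, or None if absent."""
    match = PROXY_URL_RE.search(task_input)
    return match.group(1) if match else None


# Fragment of the single-host task input from the diagnostic steps.
single_host_input = (
    '"connection_options"=>{"retry_interval"=>15, "retry_count"=>4}, '
    '"proxy_url"=>"https://capsule.ansible.test:9090", '
    '"proxy_action_name"=>"ForemanRemoteExecutionCore::Actions::RunScript"'
)

print(extract_proxy_url(single_host_input))
```

If the extracted URL points to a capsule on which remote execution is not enabled, the job was routed as described in the Root Cause section.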