Troubleshooting OpenShift Container Platform: OpenShift Ansible Playbooks
Environment
- OpenShift Container Platform (OCP) 3.9 and below
Issue
- OpenShift 3 installation failed. How do I recover?
- Error when installing:
FATAL: all hosts have already failed -- aborting
- Can I rerun the OpenShift installer?
Diagnostic Steps
-
Before rerunning the OpenShift installer, check your OpenShift version:
OCP 3.7 or below: The OpenShift installer can be rerun any time after a failed or clean install.
OCP 3.9 or greater: The OpenShift installer has some known issues:
- If you did not modify the SDN configuration or generate new certificates, run the deploy_cluster.yml playbook again.
- If you modified the SDN configuration, generated new certificates, or the installer fails again, you must either start over with a clean operating system installation or uninstall and install again.
- If you use virtual machines, start from a fresh image or uninstall and install again.
- If you use bare metal machines, uninstall and install again.
NOTE: Make sure that you are using the OpenShift packages from Red Hat repositories rather than pulling the playbooks from GitHub.
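The version check above can be sketched as a small shell function. This is only an illustration of the two branches described in this article; the function name `rerun_advice` and its messages are hypothetical, and versions this article does not mention are reported as not covered.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the rerun guidance from this article for an
# OCP 3.x version string such as "3.7" or "3.9.41".
rerun_advice() {
    local version="$1" minor
    case "$version" in
        3.[0-9]*) minor="${version#3.}"; minor="${minor%%.*}" ;;  # "3.9.41" -> "9"
        *) echo "not covered: $version"; return 1 ;;
    esac
    case "$minor" in
        *[!0-9]*) echo "not covered: $version"; return 1 ;;      # reject e.g. "3.9x"
    esac
    if [ "$minor" -le 7 ]; then
        echo "rerun the installer"
    elif [ "$minor" -ge 9 ]; then
        echo "rerun deploy_cluster.yml only if SDN and certificates are unchanged"
    else
        echo "not covered: $version"
        return 1
    fi
}
```

On an RPM-based installation, the installed playbook version can usually be read with `rpm -q openshift-ansible`; treat that package name as an assumption for your environment.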
-
When rerunning the installer, save the output to a file so that you can review any errors:
# ansible-playbook <PLAYBOOK> -vvv | tee ansible.logs
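One caveat when piping the installer through tee: the shell reports the exit status of tee, not of the playbook, and stderr is not captured unless it is redirected. A minimal sketch of a wrapper that addresses both, assuming bash is available; the name `run_logged` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command, copy its stdout and stderr to a log
# file, and return the command's own exit status instead of tee's.
run_logged() {
    local logfile="$1"
    shift
    "$@" 2>&1 | tee "$logfile"
    # PIPESTATUS[0] holds the exit status of the first command in the pipeline.
    return "${PIPESTATUS[0]}"
}
```

For example, `run_logged ansible.logs ansible-playbook <PLAYBOOK> -vvv` writes the same log as the command above while letting a script detect a failed run.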
-
The OpenShift installer skips installation steps that have already been completed or configured. To start over with a clean slate, remove all components from all hosts by running the following:
# ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
-
When installing OpenShift on AWS, OpenStack, or an environment with an external firewall, make sure that security groups are configured to allow network access from all hosts in your environment.
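A quick way to spot a blocked port is to try a TCP connection from one host to another. The following is a rough sketch using bash's built-in /dev/tcp redirection and the coreutils `timeout` command; the function name `port_open` is hypothetical, and it only tests TCP, so UDP traffic such as the SDN overlay is not covered.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 3 seconds, fail otherwise.
port_open() {
    local host="$1" port="$2"
    # Host and port are passed as arguments so they are never parsed as code.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

Run from a node, something like `port_open master.example.com 8443 || echo "8443 blocked"` would flag an unreachable master API port. The host name and the port are assumptions here; take the actual list of required ports from the installation guide for your version.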
-
Run the OpenShift facts playbook to ensure that the hostnames and IP addresses for your hosts are correct:
# ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift_facts.yml
-
If the hostnames and IP addresses are incorrect when the facts playbook is run, define the following variables for the masters and nodes in the /etc/ansible/hosts file (OpenShift 3.9 and below):
openshift_ip
openshift_public_ip
openshift_hostname
openshift_public_hostname
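As an illustration of where these variables go, an inventory excerpt might look like the following. Every host name and address is a placeholder (the IP addresses come from the ranges reserved for documentation), and the rest of the inventory is omitted.

```ini
# /etc/ansible/hosts (excerpt, placeholder values only)
[masters]
master1.example.com

[nodes]
# Host variables set here apply to the host in every group it belongs to.
master1.example.com openshift_ip=192.0.2.10 openshift_public_ip=203.0.113.10 openshift_hostname=master1.example.com openshift_public_hostname=master1.example.com
node1.example.com openshift_ip=192.0.2.11 openshift_public_ip=203.0.113.11 openshift_hostname=node1.example.com openshift_public_hostname=node1.example.com
```

After editing the inventory, rerunning the facts playbook should show whether the corrected values are picked up.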
-
If the installer fails on a certain task, check the GitHub repository for the installer to see what the task does:
TASK: [openshift_manage_node | Wait for Node Registration]
stderr: Error from server: node "node1.example.com" not found
msg: Task failed as maximum retries was encountered
FATAL: all hosts have already failed -- aborting
- Access the GitHub installer repository: openshift/openshift-ansible
- Search for the task that failed
- Look for the commands
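The same search can be done against the playbooks already installed on the host, which match the version you are actually running. A minimal sketch, assuming GNU grep; the function name `find_task` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the YAML files and line numbers that mention a
# task name. $1 = task name as printed in the Ansible output,
# $2 = directory to search.
find_task() {
    local task="$1" root="$2"
    # -F treats the task name as a literal string, not a regular expression.
    grep -rnF --include='*.yml' --include='*.yaml' -- "$task" "$root"
}
```

For the failure shown above, `find_task "Wait for Node Registration" /usr/share/ansible/openshift-ansible` would point at the file that defines the task, where the command it runs can be read directly.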
Log a case if you are unable to troubleshoot the issue
- Include the following in the case:
-
# sosreport -e docker -k docker.all=on
-
Copy of your /etc/ansible/hosts file
-
OpenShift installer complete output
-
OpenShift facts playbook output
# ansible all -m ping -vvvv
# ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift_facts.yml > openshift_facts.log
# ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml | tee openshift_install.log
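To attach these files to the case in one piece, they can be bundled into a single archive. A small sketch under the assumption that the logs were written with the file names used above; the function name `bundle_case_files` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pack the listed files into a gzip-compressed tar
# archive, refusing to continue if any of them is missing.
# $1 = archive to create, remaining arguments = files to include.
bundle_case_files() {
    local archive="$1" f
    shift
    for f in "$@"; do
        [ -e "$f" ] || { echo "missing: $f" >&2; return 1; }
    done
    tar -czf "$archive" "$@"
}
```

For example: `bundle_case_files case-files.tar.gz /etc/ansible/hosts openshift_facts.log openshift_install.log`. The sosreport archive is usually large and is better uploaded separately.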
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.