[RHV] The hosted-engine deploy (restore-from-file) fails if any non-management logical network is defined as required in the backup file.
Environment
- Red Hat Virtualization 4.x
Issue
- The hosted-engine deployment fails with the following error:
2019-03-07 20:33:50,711+0530 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
Resolution
- This issue was resolved with a fix in ovirt-ansible-hosted-engine-setup-1.0.21. Upgrade the RHV hypervisor so that the ovirt-ansible-hosted-engine-setup package is at version 1.0.21 or later; this version is included in 4.3.5 hypervisors. After upgrading, the setup asks the question below and pauses the deployment if the answer is yes:
    Pause the execution after adding this host to the engine? You will be able to iteratively connect to the restored engine in order to manually review and remediate its configuration before proceeding with the deployment:
    please ensure that all the datacenter hosts and storage domain are listed as up or in maintenance mode before proceeding. This is normally not required when restoring an up to date and coherent backup.
- While the deployment is paused, the GUI can be accessed manually, and the required networks can either be configured for this host or marked as not required.
- If upgrading is not possible, a workaround is to create the hook fix_network in enginevm_after_engine_setup before deploying the Self-Hosted Engine environment with a backup file:
  - For an RHV 4.2 host, the fix_network hook path is /usr/share/ovirt-hosted-engine-setup/ansible/hooks/enginevm_after_engine_setup/fix_network.yml.
  - For an RHV 4.3 host, the fix_network hook path is /usr/share/ansible/roles/ovirt.hosted-engine-setup/hooks/enginevm_after_engine_setup/fix_network.yml.
  - For an RHV 4.4 SP1 host, the fix_network hook path is /usr/share/ansible/collections/ansible_collections/redhat/rhv/roles/hosted_engine_setup/hooks/enginevm_after_engine_setup/fix_network.yml.
- Add the content below to fix_network.yml. Replace require_network_1 and require_network_2 with the actual required networks that are missing and causing the host to go non-operational, and replace the data_center and cluster names with the actual names provided in the deployment.
    - include_tasks: auth_sso.yml
    - name: Wait for the engine to reach a stable condition
      wait_for: timeout=300
    - name: fix network
      ovirt_network:
        auth: "{{ ovirt_auth }}"
        name: "{{ item }}"
        # Replace Default with the actual data center name.
        data_center: Default
        clusters:
          # Replace Default with the actual cluster name.
          - name: Default
            required: False
      with_items:
        # Replace with the actual required network names; keep the double quotes.
        - "require_network_1"
        - "require_network_2"
- The play runs after engine-setup, waits 5 minutes for the engine to initialize, and then disables the required parameter on the listed networks so that the host does not go non_operational.
  Note that each network name listed under with_items must be enclosed in double quotes (") to prevent Ansible from removing special characters such as _.
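When several networks have to be listed, the hook content described above can be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: render_fix_network_hook is a hypothetical helper written for this illustration and is not part of RHV or oVirt. It only builds the same YAML text shown in this solution, with every network name wrapped in double quotes, and does not write to any of the hook paths.

```python
import json


def render_fix_network_hook(networks, data_center="Default",
                            cluster="Default", wait_seconds=300):
    """Return the text of fix_network.yml for the given network names.

    Hypothetical helper for illustration only. json.dumps is used to
    emit double-quoted strings, which are also valid YAML scalars.
    """
    lines = [
        "- include_tasks: auth_sso.yml",
        "- name: Wait for the engine to reach a stable condition",
        f"  wait_for: timeout={wait_seconds}",
        "- name: fix network",
        "  ovirt_network:",
        '    auth: "{{ ovirt_auth }}"',
        '    name: "{{ item }}"',
        f"    data_center: {json.dumps(data_center)}",
        "    clusters:",
        f"      - name: {json.dumps(cluster)}",
        "        required: False",
        "  with_items:",
    ]
    # One double-quoted list entry per network that should stop being required.
    lines += [f"    - {json.dumps(name)}" for name in networks]
    return "\n".join(lines) + "\n"
```

Calling print(render_fix_network_hook(["require_network_1", "require_network_2"])) prints the hook text, which can then be reviewed and saved to the fix_network.yml path that matches the RHV version of the host.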
Root Cause
This issue is tracked in Bug 1686575.
Diagnostic Steps
- From the /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-*.log file, the error is as follows:
    2019-03-07 20:33:50,711+0530 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
- From the engine logs, in the /var/log/ovirt-hosted-engine-setup/engine-logs-* file, the error is as follows:
    2019-03-07 20:33:42,342+05 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engine-Thread-16) [6fad6d2a] Host '<hostname>' is set to Non-Operational, it is missing the following networks: '<network_name>'
    2019-03-07 20:33:42,397+05 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-16) [6fad6d2a] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host <hostname> does not comply with the cluster Default networks, the following networks are missing on host: '<network_name>'
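To find out which networks need attention, the missing network names can be pulled out of the engine log. The sketch below is a minimal Python example, not a supported tool: the function name missing_networks is hypothetical, the pattern is modelled only on the SetNonOperationalVdsCommand line quoted in this solution, and it assumes that several missing networks would appear comma-separated inside the quotes.

```python
import re

# Pattern modelled on the SetNonOperationalVdsCommand log line quoted above.
MISSING_RE = re.compile(
    r"Host '(?P<host>[^']*)' is set to Non-Operational, "
    r"it is missing the following networks: '(?P<networks>[^']*)'"
)


def missing_networks(log_lines):
    """Return {host: sorted list of missing network names} from log lines."""
    found = {}
    for line in log_lines:
        match = MISSING_RE.search(line)
        if not match:
            continue
        # Assumption: multiple networks are comma-separated inside the quotes.
        names = [n.strip() for n in match.group("networks").split(",") if n.strip()]
        found.setdefault(match.group("host"), set()).update(names)
    return {host: sorted(names) for host, names in found.items()}
```

The function accepts any iterable of lines, so an open log file can be passed directly, for example missing_networks(open(path)) where path points at the engine log collected under /var/log/ovirt-hosted-engine-setup/. The names it returns are the candidates to configure on the host or to mark as not required.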