RHV: Non-Operational Hosts after Restoring Backup or Migrating to Self-Hosted Engine

Solution Verified - Updated

Environment

  • Bare-metal Red Hat Virtualization 4.x or 3.6
    • Upgraded from 3.5 or earlier
    • Migrated to a Self-Hosted Engine (HE) environment

Issue

  • After a backup of an upgraded RHEV environment is restored into a brand-new RHEV 3.6 environment, the hosts are shown as non-operational because the rhevm logical network is missing.
  • After migrating to a Self-Hosted Engine from a bare-metal RHEV environment that originated on version 3.5 or earlier, the HE host is not listed as Up.
  • The management network is still called rhevm in the original environment.

Resolution

There are two options available:

Option 1:

*Use this option if this is not a Hosted Engine migration, or if it is an HE migration but the original environment being restored from is no longer available.*

Rename the management network bridge to `ovirtmgmt`. To do this:

  • Shut down all VMs and put all hosts into maintenance.
  • Change the Management Network name from `rhevm` to `ovirtmgmt`.
  • Add vNICs to the VMs.
  • Modify the hosts' networks to propagate the change to the host OS and create `rhevm` bridge on each host.

Option 2:

1. In the original environment, configure the `rhevm` logical network as a VM network, then run the `engine-backup` command again and use the new backup file for the restore. If the `rhevm` logical network is already a VM network, skip this step.

2. Re-deploy the 3.6 HE setup with the `rhevm` logical network by passing an answer file that sets the bridge name to `rhevm`:

# cat /tmp/answers-rhevm.conf
[environment:default]
OVEHOSTED_NETWORK/bridgeName=str:rhevm
# hosted-engine --deploy --config-append=/tmp/answers-rhevm.conf

On RHEV-H, open a console from the RHEV-H TUI (press F2) and deploy the HE VM from the CLI.
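
As a convenience, the answer file from step 2 can be generated by a small helper instead of being typed by hand. The following is a minimal sketch; the function name `write_rhevm_answers` is hypothetical, and the backup file names in the comments are placeholders, not values from this article.

```shell
# Hypothetical helper: write the answer file that pins the HE bridge name
# to "rhevm". $1 is the destination path; it defaults to the path used above.
write_rhevm_answers() {
    answers_dest="${1:-/tmp/answers-rhevm.conf}"
    cat > "$answers_dest" <<'EOF'
[environment:default]
OVEHOSTED_NETWORK/bridgeName=str:rhevm
EOF
}

# Typical sequence (run as root; file names are placeholders):
#   engine-backup --mode=backup --scope=all --file=engine.bck --log=backup.log
#   write_rhevm_answers
#   hosted-engine --deploy --config-append=/tmp/answers-rhevm.conf
```

The helper only writes the file; the backup and deploy commands are left as comments so that nothing is changed on the system until they are run deliberately.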

Note: There is an RFE (1231799) requesting that users be allowed to set the bridge name while installing the Self-Hosted Engine environment. If you would like to weigh in on this RFE, open a support case and request that it be added to this RFE.

Root Cause

Starting with RHEV 3.6, the management network is called ovirtmgmt; in previous releases it was called rhevm. When an environment that has the rhevm logical network is restored into a brand-new 3.6 environment, which is based on the ovirtmgmt logical network and has the ovirtmgmt bridge created on the hosts, the restored RHEV-Manager fails to find the rhevm network on the hosts and marks them as non-operational. The underlying networking itself was not modified and remains operational, using the ovirtmgmt bridge on the hosts and the same NIC on the Manager machine.
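
The name mismatch described above can be checked directly on a host by looking at which management bridge actually exists. The sketch below assumes the standard Linux sysfs layout, where a bridge device has a `bridge` subdirectory under `/sys/class/net/<name>`; the function name `list_mgmt_bridges` is hypothetical.

```shell
# Print each known management bridge name (rhevm, ovirtmgmt) that exists
# as a bridge device. $1 is the sysfs network directory to inspect
# (default: /sys/class/net).
list_mgmt_bridges() {
    net_dir="${1:-/sys/class/net}"
    for bridge_name in rhevm ovirtmgmt; do
        # A Linux bridge exposes a "bridge" subdirectory in sysfs.
        if [ -d "$net_dir/$bridge_name/bridge" ]; then
            echo "$bridge_name"
        fi
    done
}
```

On a freshly deployed 3.6 host this would be expected to print only `ovirtmgmt`, while the restored Manager database still refers to `rhevm`.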

A documentation bug is open to provide more accurate steps for environments such as this: "[Docs] Add some notes regarding HE networking to SHE Migration Guide".

Diagnostic Steps

On a recently deployed host (using the ovirtmgmt interface name), vdsm fails to find the rhevm interface:

Thread-4930425::ERROR::2016-08-17 06:03:36,125::migration::209::virt.vm::(_recover) vmId=`dccc3f8b-e65b-45bd-8354-16e1175f23b2`::Cannot get interface MTU on 'rhevm': No such device
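
To find this error without reading the whole log, the device names can be pulled out of matching lines. This is a minimal sketch that assumes the message format shown above and the usual vdsm log location `/var/log/vdsm/vdsm.log`; the function name `missing_mtu_devices` is hypothetical.

```shell
# Print the unique device names reported in
# "Cannot get interface MTU on '<dev>': No such device" log lines.
# $1 is the log file to scan (default: /var/log/vdsm/vdsm.log).
missing_mtu_devices() {
    vdsm_log="${1:-/var/log/vdsm/vdsm.log}"
    sed -n "s/.*Cannot get interface MTU on '\([^']*\)': No such device.*/\1/p" "$vdsm_log" | sort -u
}
```

If the output contains `rhevm` on a host that only has the `ovirtmgmt` bridge, the host matches the situation described in this article.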

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.