RHEL-OSP controllers are unexpectedly rebooted or fenced when using High Availability with DHCP addresses
Environment
- Red Hat Enterprise Linux OpenStack Platform (RHEL-OSP)
- Red Hat Enterprise Linux with the High Availability Add-On
- Providing High Availability for the RHEL-OSP Controllers
- RHEL-OSP Controllers configured to obtain addresses via DHCP on the network used for cluster communication
Issue
- HA controller nodes in my RHEL-OSP deployment are randomly fenced by Pacemaker. How can I make sure that my cluster is configured properly to avoid problems like this?
- The controller nodes in my RHEL-OSP cluster are occasionally fenced.
Resolution
- Configure the controller nodes with static IP addresses on the network used for cluster communication. To apply this change, stop the entire cluster, restart each node's network interface(s) so that the new addressing takes effect, and then start the cluster again.
- Make sure that the cluster interconnect network is not shared with the provisioning network. The provisioning network always uses DHCP for IP address assignment, whereas other network roles can use statically assigned IP addresses.
- If nodes continue to be fenced after switching to static IP addressing, further investigation or changes may be required to address other potential causes of fencing.
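As a quick check before and after making this change, the following minimal Python sketch reports whether an interface configuration requests a dynamic address. It assumes the cluster interface is defined in an ifcfg-style file (such as those under /etc/sysconfig/network-scripts/) and that the `BOOTPROTO` key controls address assignment. The device name `eth1`, the sample addresses, and the function names are hypothetical and only for illustration.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines from ifcfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().upper()] = value.strip().strip("\"'")
    return settings


def uses_dhcp(text):
    """Return True if BOOTPROTO asks for a dynamically assigned address."""
    bootproto = parse_ifcfg(text).get("BOOTPROTO", "").lower()
    return bootproto in ("dhcp", "bootp")


# Hypothetical sample configurations for a cluster interface named eth1.
DHCP_SAMPLE = "DEVICE=eth1\nBOOTPROTO=dhcp\nONBOOT=yes\n"
STATIC_SAMPLE = (
    "DEVICE=eth1\nBOOTPROTO=none\nIPADDR=192.0.2.11\nPREFIX=24\nONBOOT=yes\n"
)

for name, sample in (("dhcp sample", DHCP_SAMPLE), ("static sample", STATIC_SAMPLE)):
    status = "uses DHCP" if uses_dhcp(sample) else "does not use DHCP"
    print(f"{name}: {status}")
```

To check a real node, the text of the relevant ifcfg file could be read and passed to `uses_dhcp()`. Interfaces managed by other means are not covered by this sketch.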
Root Cause
This can happen if the cluster management network is configured to obtain its IP addresses via DHCP. For this reason, using DHCP on the cluster management network that carries corosync traffic is not supported. If a cluster node is being fenced frequently when using DHCP addressing, then switching to static addressing is the best step to try first.
If this does not help, there are a number of other reasons why High Availability cluster nodes may be fenced, so further investigation may be necessary.
Diagnostic Steps
- Look for signs in the logs that a node is receiving a new address or having a lease renewed by DHCP just before it is fenced.
- Review and carry out the standard diagnostic steps for investigating fence events.
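The first diagnostic step can be roughly automated. The sketch below is a minimal example under several assumptions: the log lines come from one merged, time-ordered syslog-style log using the traditional `Mon DD HH:MM:SS` timestamp, the node clocks are synchronized, DHCP activity is logged by `dhclient` with the keywords `DHCPREQUEST`, `DHCPACK`, or `bound to`, and fence events can be recognized by the substrings `stonith` or `fence`. The marker lists, the 60-second window, the function names, and the sample lines are illustrative and are not actual output from any cluster. Adjust them to match the messages in your own logs.

```python
from datetime import datetime, timedelta

# Assumed keywords; adjust to match the messages in your own logs.
DHCP_MARKERS = ("dhcprequest", "dhcpack", "bound to")
FENCE_MARKERS = ("stonith", "fence")


def parse_time(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' syslog timestamp (English month names)."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def dhcp_events_before_fencing(lines, window_seconds=60, year=2015):
    """Return (fence_line, [dhcp_lines]) pairs where DHCP activity
    was logged within window_seconds before the fence event."""
    window = timedelta(seconds=window_seconds)
    dhcp_events = []
    findings = []
    for line in lines:
        try:
            stamp = parse_time(line, year)
        except ValueError:
            continue  # Skip lines without a recognizable timestamp.
        lowered = line.lower()
        if "dhclient" in lowered and any(m in lowered for m in DHCP_MARKERS):
            dhcp_events.append((stamp, line))
        elif any(m in lowered for m in FENCE_MARKERS):
            recent = [
                text for seen, text in dhcp_events
                if timedelta(0) <= stamp - seen <= window
            ]
            if recent:
                findings.append((line, recent))
    return findings


# Illustrative sample lines only; real messages will differ.
SAMPLE_LOG = [
    "Mar 12 10:15:02 controller1 dhclient[1234]: DHCPREQUEST on eth1 to 192.0.2.1 port 67",
    "Mar 12 10:15:03 controller1 dhclient[1234]: DHCPACK from 192.0.2.1",
    "Mar 12 10:15:20 controller2 stonith-ng[2345]: fencing controller1 (illustrative message)",
    "Mar 12 11:40:00 controller2 stonith-ng[2345]: fencing controller3 (illustrative message)",
]

for fence_line, dhcp_lines in dhcp_events_before_fencing(SAMPLE_LOG):
    print("Fence event:", fence_line)
    for dhcp_line in dhcp_lines:
        print("  preceded by:", dhcp_line)
```

A match only shows that DHCP activity and a fence event occurred close together in time. It does not prove that one caused the other, so the standard fence-event diagnostics still apply.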