Available Fencing Types and Fencing Agents for a Red Hat High-Availability Cluster
Table of contents
This article provides an overview of fencing in a Red Hat high-availability cluster, covering the major types of fencing available. A Red Hat high-availability cluster requires that you configure fencing to maintain your services and protect your data. For a full description of how integral fencing is to a cluster, see the following articles:
- Support Policies for RHEL High Availability Clusters - General Requirements for Fencing/STONITH
- The Importance of Fencing in a Red Hat High Availability Cluster
The documentation below can also be referenced for supportability considerations specific to individual stonith devices:
Environment
- Red Hat Enterprise Linux 6, 7, 8 or 9 (with the High Availability Add-on)
- Pacemaker
- fence-agents* packages
Whole-node Fencing
Power Fencing
Power fencing removes a node from the cluster by connecting directly to the management module available to the node's hardware (e.g. IBM IMM, Dell iDRAC, HP iLO, IPMI, or the hypervisor for VMs) and performing a "hard" power off followed by a power on. A "hard" power off here indicates a non-graceful, immediate shutdown of the node, similar to what you would see if you pulled the power from a running server. The typical case connects directly to the system's management module, but the fencing device can be any physical entity (see Networked Power Distribution Unit fencing below) that is able to remove power from a device and give feedback after having done so.
While fencing is enabled, a cluster with a node requiring fencing will not perform any further recovery actions until the fencing is confirmed as successful (powered off) per the feedback from the configured stonith device. This immediate shutdown and wait for fencing confirmation is required to ensure that resources do not remain running on nodes requiring fencing, since resources running in multiple places when they are not supposed to can lead to data corruption, or to service outages due to unwanted interference between multiple instances. Further discussion of why fencing is important, and which kinds of events trigger fencing, is available in the Importance of Fencing documentation.
Power fencing stonith devices are generally recommended and favored over other fencing types, as they provide fully automated recovery for a required fence without the additional setup steps, and the caveats, that other fencing types tend to carry (see the "Storage Fencing" discussion below). The following tables list some of the available power fencing methods provided by Red Hat and the hardware each stonith device is associated with. This list is not exhaustive:
Physical Servers
| Hardware Type | Available Power Stonith devices |
|---|---|
| Dell Servers | fence_idrac - Fence agent for IPMI fence_drac5 - Fence agent for Dell DRAC CMC/5 |
| IBM Blade Servers | fence_bladecenter - Fence agent for IBM BladeCenter fence_ibmblade - Fence agent for IBM BladeCenter over SNMP |
| IBM Servers | fence_imm - Fence agent for IPMI |
| HP Blade Servers | fence_hpblade - Fence agent for HP BladeSystem |
| HP Servers | fence_ilo - Fence agent for HP iLO fence_ilo_ssh - Fence agent for HP iLO over SSH |
| Cisco UCS | fence_cisco_ucs - Fence agent for Cisco UCS |
| Management Modules using Redfish API | fence_redfish - I/O fencing agent for Redfish. Most newer servers have management modules that can be managed via the Redfish API, so this generic stonith device is usable with many hardware models. |
| Generic stonith for most hardware | fence_ipmilan - Fence agent for IPMI. Most of the hardware discussed above can also use this stonith device. |
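As an illustrative sketch, a generic IPMI-based stonith device might be created with `pcs` as below. The address, credentials, and node name are placeholders, and exact option names can vary between RHEL releases; check `pcs stonith describe fence_ipmilan` on your system:

```shell
# Hypothetical example: IPMI power fencing for node1.
# Replace the address, credentials, and host name with real values.
pcs stonith create fence-ipmi-node1 fence_ipmilan \
    ip=192.0.2.10 username=admin password=secret lanplus=1 \
    pcmk_host_list=node1

# Confirm the device starts and can reach the management module:
pcs stonith status
```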
Virtualized Servers
| Virtualization Type | Available Power Stonith devices |
|---|---|
| libvirt/KVM Virtual Machines | fence_xvm - Fence agent for virtual machines fence_virt - Fence agent for virtual machines |
| Red Hat Virtualization (RHV) | fence_rhevm - Fence agent for RHEV-M REST API |
| OpenStack Virtual Machines | fence_openstack - Fence agent for OpenStack's Nova service |
| OpenShift Virtual Machines | fence_kubevirt - Fence agent for KubeVirt |
| VMware | fence_vmware_rest - Fence agent for VMware REST API fence_vmware_soap - Fence agent for VMware over SOAP API |
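For virtualized nodes, the stonith device typically maps cluster node names to VM names on the hypervisor. A hedged sketch using `fence_vmware_rest` (all values are placeholders; see `pcs stonith describe fence_vmware_rest` for the authoritative option list):

```shell
# Hypothetical example: fence VMware-hosted cluster nodes via vCenter.
# pcmk_host_map maps cluster node names to VM names in the inventory.
pcs stonith create fence-vmware fence_vmware_rest \
    ip=vcenter.example.com username='vcuser@vsphere.local' password=secret \
    ssl=1 pcmk_host_map="node1:node1-vm;node2:node2-vm"
```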
Cloud Services
| Cloud Provider | Available Power Stonith devices |
|---|---|
| Amazon Web Services ( AWS ) | fence_aws - Fence agent for AWS (Amazon Web Services) |
| Microsoft Azure | fence_azure_arm - Fence agent for Azure Resource Manager |
| Google Cloud Platform | fence_gce - Fence agent for GCE (Google Cloud Engine) |
| Alibaba Cloud | fence_aliyun - Fence agent for Aliyun (Aliyun Web Services) |
| IBM Cloud & Power Virtual Server | fence_ibm_powervs - Fence agent for IBM PowerVS |
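Cloud agents generally map node names to provider instance IDs. A hedged sketch for `fence_aws` (region, keys, and instance IDs are placeholders; credentials may alternatively come from an attached IAM role, and option names should be verified with `pcs stonith describe fence_aws`):

```shell
# Hypothetical example: AWS power fencing.
# pcmk_host_map maps cluster node names to EC2 instance IDs.
pcs stonith create fence-aws fence_aws \
    region=us-east-1 access_key=AKIAEXAMPLE secret_key=EXAMPLEKEY \
    pcmk_host_map="node1:i-0123456789abcdef0;node2:i-0fedcba9876543210"
```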
Networked Power Distribution Unit ( PDU ) fencing
For servers attached to network-controlled power supplies, stonith devices are available that shut off power to the outlets the server uses. This is very similar to the traditional power fencing methods in that servers using PDU fencing are powered off and powered on. With these methods, though, the cluster connects directly to the power source to perform this action, instead of connecting to the server's management device.
| PDU Type | Available PDU Stonith devices |
|---|---|
| APC Networked Power Switch | fence_apc - Fence agent for APC over telnet/ssh fence_apc_snmp - Fence agent for APC, Tripplite PDU over SNMP |
| Eaton ePDU | fence_eaton_snmp - Fence agent for Eaton over SNMP |
| Emerson MPX and MPH2 | fence_emerson - Fence agent for Emerson over SNMP |
| ePowerSwitch | fence_eps - Fence agent for ePowerSwitch |
| IBM iPDU | fence_ipdu - Fence agent for iPDU over SNMP |
| WTI Network Power Switch | fence_wti - Fence agent for WTI |
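With PDU agents, the host map typically points node names at PDU outlet numbers rather than at management modules. A hedged sketch using `fence_apc_snmp` (address and outlet numbers are placeholders):

```shell
# Hypothetical example: PDU fencing over SNMP.
# pcmk_host_map maps cluster node names to PDU outlet numbers.
pcs stonith create fence-pdu fence_apc_snmp \
    ip=pdu.example.com pcmk_host_map="node1:1;node2:2"
```

If a server has redundant power supplies fed by multiple PDUs, each feed needs its own device so that all power is actually removed (see the Fencing Topologies section).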
Watchdog and Poison-Pill "Self-Fencing"
SBD fencing differs from the available "power fencing" and "resource fencing" methods. Unlike those stonith types, which require access to an external management device to perform power reboots or to remove access to cluster resources ("resource fencing"), fencing devices of this type instead self-fence (reboot) during a fencible event. Under the umbrella of SBD fencing there are two different configurations available, known as "watchdog fencing" and "poison-pill" fencing, and a user can choose either when configuring the stonith.
Components Common to both "Watchdog" and "Poison-pill" SBD Fencing
Both "watchdog fencing" and "poison-pill" fencing rely on the "STONITH Block Device", also called "Storage-Based Death" (SBD), service [1]. For both configuration types, this service is responsible for performing the actual reboot action (or for preventing hardware reboots, as discussed later). With both configurations, the sbd service starts a timer that counts down continuously after service initialization, and as the cluster runs it regularly resets this timer via the SBD service as long as the cluster does not indicate a need for self-fencing.
If for whatever reason the sbd service is not able to reset this timer, or in cases where the cluster indicates self-fencing is needed, then either the timer is allowed to expire or the node immediately self-reboots via an SBD service write to /proc/sysrq-trigger followed by a reboot system call via libc.
To raise reliability to a level comparable to the other fencing methods mentioned above, it is strongly recommended to configure the sbd service's software-implemented countdown to be backed by a hardware watchdog device. The hardware watchdog device functions similarly to the SBD service operating alone: the hardware device maintains a counter that is always counting down, and initiates a reboot if the hardware timer is allowed to reach zero, without any further software intervention needed. In this configuration the SBD service, in addition to maintaining the software-implemented countdown, is responsible for resetting the hardware watchdog timer at regular intervals. The hardware watchdog timer is then allowed to expire and perform a hardware-triggered reboot as redundancy for the software (SBD service) based reboot approach.
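The common settings can be sketched as below, assuming `/dev/watchdog` is the hardware watchdog device; on RHEL these live in `/etc/sysconfig/sbd`, and the values shown are illustrative:

```shell
# /etc/sysconfig/sbd (illustrative values)
SBD_WATCHDOG_DEV=/dev/watchdog   # hardware watchdog device backing sbd
SBD_WATCHDOG_TIMEOUT=5           # seconds before the hardware timer fires
SBD_DEVICE=""                    # empty for watchdog-only; shared disk(s) for poison-pill
```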
Watchdog Fencing
With the watchdog type of SBD fencing, a fence is considered successful by pacemaker (the healthy part of the cluster) solely by the expiration of the software timer configured with the cluster property 'stonith-watchdog-timeout'.
On the node to be fenced, self-fencing will reliably trigger a reboot within this timeout: by reading the need for fencing from the cluster status, by being unable to read the cluster status, or by simply being unresponsive and thus failing to reset the hardware watchdog timer.
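A hedged sketch of enabling watchdog-only SBD with `pcs` follows; the exact `pcs stonith sbd` syntax varies between RHEL releases, so treat these commands as an outline and consult `pcs stonith sbd --help` on your system:

```shell
# Hypothetical watchdog-only setup (no shared disk); run while the
# cluster is stopped, then restart it.
pcs stonith sbd enable --watchdog=/dev/watchdog
pcs cluster start --all

# The healthy side of the cluster considers a fence complete once this
# timeout has expired (it must exceed SBD_WATCHDOG_TIMEOUT):
pcs property set stonith-watchdog-timeout=10
```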
Poison-Pill Fencing
Poison-pill fencing replaces waiting for a configured timeout ( stonith-watchdog-timeout ) with a disk-based communication method that explicitly tells a node to self-fence. This is achieved by including one to three shared storage block devices that are used to share "messages" across cluster nodes. The block devices are structured into message slots, one per cluster node, and when more than one block device is configured this provides redundant node-status reporting on each device. During regular operation, the sbd service monitors these message slots for "fencing messages", performing a "self-fence" when such a message is observed.
This configuration introduces the fence_sbd fencing agent. The agent calls the SBD binary, which updates the message slots on the shared block devices ( listed under the devices= option ) and allows pacemaker to communicate with the sbd service on the target node. The 'fence_sbd' stonith device in turn declares a fence operation successful if it was able to write the request to a majority of the storage devices, and enough time has expired that the target node either picked up the fencing request or timed out and rebooted trying to do so [2].
The target node monitors the shared devices. If it loses contact with a majority of those devices, or reads a fencing request from one of the message slots, it will perform a self-fence. If the node becomes unresponsive or is blocked while trying to access the block devices, the hardware watchdog device again ensures the node is brought down within the timeout expected by the fencing side.
With the basic principle of poison-pill fencing as laid out above, a single shared disk would become a single point of failure. In the real implementation, seeing a peer node alive (via pacemaker messaging) prevents a node from self-fencing.
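A hedged outline of a poison-pill setup with a single shared disk follows; the device path is a placeholder, and the `pcs stonith sbd` subcommand syntax differs between releases, so verify each step against `pcs stonith sbd --help` before use:

```shell
# Hypothetical poison-pill setup with one shared block device.
# Initialize the message slots on the shared disk:
pcs stonith sbd device setup device=/dev/disk/by-id/shared-disk

# Enable sbd with both the watchdog and the shared device:
pcs stonith sbd enable --watchdog=/dev/watchdog device=/dev/disk/by-id/shared-disk

# Create the stonith resource that writes the "poison pill" messages:
pcs stonith create fence-sbd fence_sbd devices=/dev/disk/by-id/shared-disk
```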
Additional Considerations with SBD self-fencing
- SBD fencing leverages hardware watchdog devices to perform self-fencing for unresponsive or unhealthy nodes. With power fencing, if all nodes are unresponsive then fencing will not be possible, as there would be no nodes left to initiate the stonith action.
- For clusters spread over 3 sites, resource recovery when the network connection to one of the sites is lost may be impossible with power-fencing resources, since typically the connection to the fencing device is lost as well.
- Watchdog fencing requires at least 3 quorum voters to function. The primary use case for poison-pill fencing is to allow a two-node cluster configuration with a single shared block device, avoiding this SBD limitation of the watchdog type configuration.
  - As an alternative, you can add a qdevice to a cluster with a watchdog type configuration in order to avoid this minimum quorum vote limitation.
- With SBD fencing, a hardware watchdog device is recommended. There is a software watchdog solution ( the softdog kernel driver ) that offers similar capabilities but has additional limitations. While the software watchdog may be useful in certain situations, a hardware watchdog is highly recommended as it is not susceptible to the same issues and limitations.
Related Information:
- Exploring RHEL High Availability's Components - sbd and fence_sbd
- Administrative Procedures for RHEL High Availability clusters - Enabling sbd fencing in RHEL 7 and 8
Footnote:
[1]: "Storage" or "block" devices were historically required with the use of SBD service, but now are only used with "poison-pill" type fencing.
[2]: Behavior using multiple disks is not working as intended at the time of writing ( April 5th, 2024 ), due to bug RHEL-13088 / Bug 2033671. If further information is needed on this issue and its status, please open a support case with Red Hat.
Resource Fencing
Storage Fencing
Another way to protect your cluster during a fencing event is via a "Storage Fencing" method. These types of fencing protect cluster health by immediately removing a fenced node's access to any storage devices attached cluster-wide, thus preventing access from fenced nodes to any data available across the cluster. Storage methods of fencing have the following caveats:
- By default, most storage fencing methods do not additionally provide "reboot" options, and as a result require manual intervention for fenced nodes to rejoin the cluster following a fence with the default configuration ( a manual reboot is required ).
  - Watchdog services can additionally be added to allow for more automated reboots and recovery ( Reference Storage stonith devices table ).
- These storage types of stonith devices are focused on protecting the storage and data. Resources that do not depend on the storage used for fencing are not protected between the time the storage access is removed and the time the fence is confirmed.
  - For example, with these fencing methods resources dependent on storage availability ( e.g. `LVM`, `LVM-activate`, `Filesystem` ) will be protected by the removed storage access following a storage fence, but other resources not dependent on the storage ( e.g. systemd service or application resources not operating on fenced storage, `IPaddr2`, and many others ) will retain the same unrestricted access even after the fence is triggered and confirmed.
  - This means that these independent resources will remain active for some time after the actual fence: either until any configured `watchdog` service kicks in to reboot the fenced node, or until the cluster services shut down following the fence ( when `watchdog` services are not configured ). This unrestricted access from the unprotected resources can lead to additional undefined or unpredictable issues depending on the resources that remain running, although such issues are rare.
  - The addition of the `watchdog` service can help avoid the need for manual intervention during a fence, but it does not close this time gap where some resources remain unrestricted, meaning this split-brain potential still exists even with the `watchdog` service.
- In addition to the above, storage types of stonith can be more difficult to maintain over time, as any added or removed cluster-managed storage must be updated accordingly in the stonith configuration each time.
Due to this added maintenance complexity, the additional steps required during setup when automated reboot/recovery is wanted, and the fact that non-storage-dependent resources can remain unprotected briefly even after successful fencing, it is recommended to favor power fencing methods. Storage fencing methods should be used when power methods are not available, or as a secondary redundant fencing method, and only with an understanding of the protections this provides in your cluster layout. This helps to ensure a more reliable and consistently operational stonith device in your environment, without the additional considerations required when relying purely on storage fencing ( see additional discussion in the Choosing an appropriate stonith type section ).
The table below lists the available storage fencing methods and the watchdog services available for automated reboot with these fencing methods:
Storage stonith devices
| Storage Type | Available Storage Stonith devices | Does this allow automated reboot? |
|---|---|---|
| iSCSI Storage | fence_scsi - Fence agent for SCSI persistent reservation | By default, automated reboots and node recovery are not available via this stonith device. Additional reboot actions are possible through additional configuration of the watchdog service. |
| Multi-path Storage ( mpath ) | fence_mpath - Fence agent for multipath persistent reservation | By default, automated reboots and node recovery are not available via this stonith device. Additional reboot actions are possible through additional configuration of the watchdog service. |
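A hedged sketch of a `fence_scsi` device follows; the device path and node names are placeholders. The `meta provides=unfencing` setting tells pacemaker that fenced nodes must be "unfenced" (their reservation keys re-registered) before they can use the storage again:

```shell
# Hypothetical fence_scsi device using SCSI-3 persistent reservations.
pcs stonith create fence-scsi fence_scsi \
    devices=/dev/disk/by-id/shared-disk \
    pcmk_host_list="node1 node2" \
    meta provides=unfencing
```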
For information on the considerations of power fencing and storage fencing, see Planning Fence Configuration.
Network Fencing
There are several stonith devices that effectively work as a storage fencing method, but function by removing storage access through switching off the storage device's relevant network port on the network switch ( instead of removing storage reservations ). These types of storage methods do not provide power fencing and only remove storage access by severing the storage connection to the server. Following a fence of this type, a manual reboot is required to recover the fenced node.
| Switch Type | Available Fabric Stonith devices |
|---|---|
| HPE B-series Switches | fence_brocade - Fence agent for HP Brocade over telnet/ssh |
| Cisco MDS 9000 series | fence_cisco_mds - Fence agent for Cisco MDS |
| Any SNMP-capable switch | fence_ifmib - Fence agent for IF MIB |
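As a hedged sketch, `fence_ifmib` can disable the switch port a node's storage traffic traverses; the address, community string, and ifIndex values below are placeholders, and the option list should be verified with `pcs stonith describe fence_ifmib`:

```shell
# Hypothetical network-fabric fencing via SNMP (IF-MIB).
# pcmk_host_map maps cluster node names to switch port (ifIndex) numbers.
pcs stonith create fence-switch fence_ifmib \
    ip=switch.example.com community=private \
    pcmk_host_map="node1:10;node2:11"
```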
Fencing Helpers
In addition to the "power" and "resource" fencing types discussed above, there also exist various other fencing "helpers" that do not actually provide fencing in terms of removing resource access, but instead modify the behavior of existing fencing types. Several of these are discussed below.
Server panic and vmcore capture considerations in a clustered environment
The most common helper in this category is the [fence_kdump](https://access.redhat.com/solutions/2876971) stonith device, which provides the ability to capture full vmcores in a clustered environment. When a server crashes or panics within a pacemaker cluster, a vmcore will often not be captured even with kdump configured, since the vmcore dump process will be [interrupted by standard fencing](https://access.redhat.com/solutions/126863). The `fence_kdump` stonith agent introduces a method by which the cluster checks for a possible server panic and kdump activity before initiating a fence, so as not to interrupt the vmcore dump process.
For fence_kdump to trigger, the kdump crash recovery service executes the kdump crash kernel when a node fails. The fencing agent listens for a message from the failed node indicating that it is executing the kdump crash kernel. Once it is confirmed that the kdump crash kernel is running and dumping a vmcore, the stonith device allows the kdump crash recovery service to complete without being preempted by traditional fencing methods. The fence_kdump agent is not a replacement for fencing and must be used in conjunction with other fence methods.
It is not necessary to implement fence_kdump to fence a node, but if your system crashes or panics, kdump will provide a record you can use to determine the cause. Without kdump, you may not know that a system panicked, as it may have rebooted without dumping a core file. Note that kdump imposes an additional timeout in a fencing situation: you need to balance the amount of time it takes kdump to create a core file against the amount of time your system can tolerate before fencing a problematic node.
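Because fence_kdump must be combined with a real fencing method, it is typically placed on level 1 of a fencing topology with a power device behind it. A hedged sketch (the power-device name is an assumed pre-existing stonith resource):

```shell
# Hypothetical fence_kdump setup: level 1 waits for a kdump capture to
# start; level 2 falls back to a real power fence if none is detected.
pcs stonith create fence-kdump fence_kdump pcmk_host_list="node1 node2"
pcs stonith level add 1 node1 fence-kdump
pcs stonith level add 2 node1 fence-ipmi-node1   # assumed existing power stonith
```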
General Issues and Additional Considerations
Choosing an appropriate stonith type
The general recommendation is to implement a "power fencing" method where possible, and to use storage fencing methods ( or other types ) as a second level fencing method which provides redundancy. Power fencing methods provide the following advantages over storage fencing:
- More automated recovery from fences compared to storage fencing methods, which require `watchdog` services for reboot recovery, or manual reboots when those are not configured.
- Immediate confirmation of fencing once it occurs.
- All resources are protected due to the removal of power.
- Simpler to set up.
- Less complexity in maintenance and upkeep over time.
These power types are therefore better both for performing failover and fences, as well as for ensuring the fenced node recovers into the cluster to take over again when needed. If power stonith solutions do not exist for your environment, a storage fencing method can be used instead, or storage methods can be used as a redundant fence source.
Fencing Topologies
In each cluster it is possible to configure one stonith device per node, or multiple stonith devices ( usually of different stonith types ). When configuring multiple stonith devices of different types, "stonith levels" can additionally be configured to determine which stonith devices are run against a node requiring fencing, and in which order. When configuring these "fencing topologies", all lower-level ( level 1 ) stonith devices are attempted first, and if those fail, the stonith devices on the next-highest configured level are tried. It is possible to place one or multiple stonith devices on each stonith level.
Below are some possible scenarios where modifying the fencing topology may be useful or required:
- These fencing levels and topology configuration can be used for configuring redundant fencing devices ( primary and secondary stonith in case primary fails ).
- The topology configuration can be useful when multiple stonith devices or types are required to ensure fencing of all resources is achieved.
- For example, PDU power fencing of servers with redundant power supplies may require multiple stonith devices on level 1 to ensure that all power is removed.
- Configuring multiple stonith levels or multiple devices on a single level is often required with configuration of the available fence helpers.
The documentation below goes into further detail on configuring these levels and fencing topologies:
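A minimal sketch of a two-level topology follows; the device names are assumed pre-existing stonith resources, and the listing subcommand may differ by pcs release (`pcs stonith level` vs. `pcs stonith level config`):

```shell
# Hypothetical two-level topology for node1: try the iLO device first,
# then fall back to the PDU device if level 1 fails.
pcs stonith level add 1 node1 fence-ilo-node1
pcs stonith level add 2 node1 fence-pdu-node1

# Display the configured topology:
pcs stonith level config
```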
Redundant Fencing Layout
For a more robust cluster configuration, consider including a backup fencing device by adding a "level 2" ( or higher ) stonith device in the fencing topology. By placing a level 2 stonith device on a different network than the primary fencing device, the backup stonith device can kick in to perform the fencing in cases where the primary fencing device is failing, allowing for an increased safety margin.
For this redundancy, you can consider configuring a storage fencing stonith device as your backup device. Since stonith of this type does not depend on the same network connectivity as power fencing methods, this option may work at times when power fencing would otherwise fail. If networked PDUs are also available in the environment, these can also be a good secondary fence option, especially if they are configured on a separate network from the primary stonith device. For further information on updating the topology and adding secondary ( redundant ) stonith devices, reference the Fencing Topologies section.
Additional Power Fencing considerations:
- With many of the "power fencing" stonith devices, the `systemd-logind` service can intercept the power signal sent to the server from the system's management module. When this occurs, a "graceful shutdown" may be initiated instead of the expected reboot. When the "Power key pressed" message is observed on a fence, followed by a graceful shutdown, it is recommended to update the `systemd-logind` configuration so that these shutdown signals are no longer intercepted:
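One way to apply this, sketched below, is to set `HandlePowerKey=ignore` in `/etc/systemd/logind.conf` so logind no longer acts on the power-key event; verify against the logind.conf documentation for your release:

```shell
# Stop systemd-logind from reacting to the power-key event itself.
# (Uncomments or rewrites any existing HandlePowerKey line.)
sed -i 's/^#\?HandlePowerKey=.*/HandlePowerKey=ignore/' /etc/systemd/logind.conf

# Apply the change:
systemctl restart systemd-logind
```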
Additional Storage Fencing considerations:
- The storage device utilized with the `fence_scsi` and `fence_mpath` stonith solutions must be SPC-3 compliant in order to add and remove the reservations required for fencing. Using devices that are not SPC-3 compliant can lead to failures for this stonith type:
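One way to check a candidate device, sketched below, is to query its persistent-reservation capabilities with `sg_persist` from the sg3_utils package; the device path is a placeholder:

```shell
# Report the device's SCSI-3 persistent reservation capabilities.
# A device that fails this query is unlikely to work with
# fence_scsi / fence_mpath.
sg_persist --in --report-capabilities /dev/disk/by-id/shared-disk
```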