How to configure/manage STONITH 'levels' in RHEL cluster with pacemaker?
Environment
- Red Hat Enterprise Linux (RHEL) 6
- Red Hat Enterprise Linux (RHEL) 7
- Red Hat Enterprise Linux (RHEL) 8
- Red Hat Enterprise Linux (RHEL) 9
- Pacemaker STONITH levels
Issue
- How to configure more than one fence device in a cluster?
- How to configure/manage STONITH levels in RHEL cluster with pacemaker?
- How to configure fence agent fence_ipmilan in RHEL cluster with pacemaker?
- How to configure fence agent fence_apc in RHEL cluster with pacemaker?
Resolution
In an HA cluster with a single STONITH device, that device is a single point of failure. To address this, "fencing-topology" was added to pacemaker so that multiple fence devices and more complex fencing configurations can be set up. In pcs, this is done by configuring stonith levels for the fence devices.
A very popular fencing method is to use an internal fence device such as IPMI (fence_ipmilan). The IPMI draws its power from the host's power supply. One drawback of such fence devices is that, should the host lose power, the IPMI will not be able to respond to fence requests and the fence action will fail.
If the IPMI's network connection uses a single network interface, then a broken or disconnected network cable, a failed switch or switch port, or a failure in the NIC itself would also leave the IPMI interface inaccessible, and the cluster node would not be fenced off.
The simple solution to this issue is to use a second fence method. When a cluster node needs to be fenced off (STONITH), all the operations in level 1 are attempted. If they are successful, no other levels are executed; if that level fails, fencing proceeds to the next level. If the operations of every level have been tried and all have failed, fencing loops back to level 1 and starts over. This loop continues until one of the following occurs:
- The cluster node rejoins the cluster after a reboot has occurred.
- The cluster node was successfully fenced off from the cluster with the fence agent.
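The fall-through behaviour described above can be illustrated with a small shell simulation. This is only a sketch and not how pacemaker is implemented: it never calls pcs or a fence agent, `try_device` and `WORKING_DEVICES` are hypothetical stand-ins for a real fence attempt, and the `max_passes` cap is an assumption added so that the example terminates.

```shell
#!/bin/sh
# Toy simulation of fencing-level fall-through; no pcs or fence agent is called.
# try_device is a hypothetical stand-in for a real fence attempt: it "succeeds"
# only for device names listed in WORKING_DEVICES.
WORKING_DEVICES="apc-fencing1"

try_device() {
    case " $WORKING_DEVICES " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# fence_node MAX_PASSES LEVEL1_DEVICES LEVEL2_DEVICES ...
# Each level argument is a comma-separated device list; every device in a
# level must succeed for that level to succeed.
fence_node() {
    max_passes=$1
    shift
    pass=1
    while [ "$pass" -le "$max_passes" ]; do
        level=1
        for devices in "$@"; do
            level_ok=yes
            for dev in $(printf '%s' "$devices" | tr ',' ' '); do
                if ! try_device "$dev"; then
                    level_ok=no
                    break
                fi
            done
            if [ "$level_ok" = yes ]; then
                echo "pass $pass: level $level succeeded"
                return 0
            fi
            echo "pass $pass: level $level failed"
            level=$((level + 1))
        done
        # Every level failed: loop back to level 1.
        pass=$((pass + 1))
    done
    echo "gave up after $max_passes passes"
    return 1
}

# Level 1 (simulated IPMI) is unreachable, level 2 (simulated APC) works.
fence_node 3 ipmi-fencing1 apc-fencing1
```

With the simulated IPMI device unavailable, the run reports a failure at level 1 and a success at level 2, which is the situation a second fence method is meant to cover.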
In the STONITH configuration below, two fence devices, fence_ipmilan and fence_apc, are configured so that:
- The fence_ipmilan fencing agent is tried first.
- If the fence_ipmilan fencing agent does not succeed, then the fence_apc agent will be tried.
Steps to implement STONITH levels
1. Configure the two desired fence devices. In this case, we are using fence_ipmilan and fence_apc fence agents.
# pcs stonith create ipmi-fencing1 fence_ipmilan pcmk_host_list="node1.example.com" ipaddr="10.65.208.102" login=root passwd=xxx op monitor interval=30s
# pcs stonith create ipmi-fencing2 fence_ipmilan pcmk_host_list="node2.example.com" ipaddr="10.65.208.103" login=root passwd=xxx op monitor interval=30s
# pcs stonith create apc-fencing1 fence_apc pcmk_host_list="node1.example.com" ipaddr="10.65.208.31" login=root passwd=xxx port=14 action=reboot op monitor interval=30s
# pcs stonith create apc-fencing2 fence_apc pcmk_host_list="node2.example.com" ipaddr="10.65.208.31" login=root passwd=xxx port=12 action=reboot op monitor interval=30s
- The resulting configuration should look similar to the following:
# pcs status
Cluster name: rhel7testcluster
Last updated: Sun May 18 09:30:40 2014
Last change: Mon Apr 28 10:02:59 2014 via cibadmin on node2.example.com
Stack: corosync
Current DC: node1.example.com (1) - partition with quorum
Version: 1.1.10-29.el7-368c726
2 Nodes configured
6 Resources configured
Online: [ node1.example.com node2.example.com ]
Full list of resources:
ipmi-fencing1 (stonith:fence_ipmilan): Started node1.example.com
ipmi-fencing2 (stonith:fence_ipmilan): Started node2.example.com
apc-fencing1 (stonith:fence_apc): Started node1.example.com
apc-fencing2 (stonith:fence_apc): Started node2.example.com
PCSD Status:
node1.example.com: Online
node2.example.com: Online
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
2. Create STONITH levels
[root@node1 ~]# pcs stonith level add 1 node1.example.com ipmi-fencing1
[root@node1 ~]# pcs stonith level add 1 node2.example.com ipmi-fencing2
[root@node1 ~]# pcs stonith level add 2 node1.example.com apc-fencing1
[root@node1 ~]# pcs stonith level add 2 node2.example.com apc-fencing2
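For clusters with more nodes, the same commands can be generated in a loop. The following is a minimal dry-run sketch that only prints the commands; the helper name `print_level_commands` is hypothetical, and it assumes the device naming used in this article (ipmi-fencingN and apc-fencingN for the Nth node given).

```shell
#!/bin/sh
# Dry-run helper: prints the pcs commands instead of running them.
# Assumes the naming used in this article: the Nth node passed in is fenced
# by ipmi-fencingN at level 1 and apc-fencingN at level 2.
print_level_commands() {
    n=1
    for node in "$@"; do
        echo "pcs stonith level add 1 $node ipmi-fencing$n"
        echo "pcs stonith level add 2 $node apc-fencing$n"
        n=$((n + 1))
    done
}

print_level_commands node1.example.com node2.example.com
```

The printed commands are the same four as above, grouped per node. After reviewing them, they can be run by hand on a cluster node as root.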
3. Check the configuration
# pcs stonith level --- Lists all of the fencing levels currently configured
Node: node1.example.com
Level 1 - ipmi-fencing1
Level 2 - apc-fencing1
Node: node2.example.com
Level 1 - ipmi-fencing2
Level 2 - apc-fencing2
Note:
# pcs stonith level clear
--- Clears the fence levels on the specified node (or stonith id), or clears
all fence levels if no node/stonith id is specified. If more than
one stonith id is specified, they must be separated by commas with no
spaces. Example: pcs stonith level clear dev_a,dev_b
# pcs stonith level
To remove a particular device from a STONITH level:
# pcs stonith level
Node: node1.example.com
Level 1 - ipmi-fencing1
Level 2 - apc-fencing1
Node: node2.example.com
Level 1 - ipmi-fencing2
Level 2 - apc-fencing2
# pcs stonith level remove 2 node1.example.com apc-fencing1 <-- removes second fencing method for node1
# pcs stonith level
Node: node1.example.com
Level 1 - ipmi-fencing1
Node: node2.example.com
Level 1 - ipmi-fencing2
Level 2 - apc-fencing2
The fence device itself remains configured in the cluster, though:
# pcs status
[.... ]
Full list of resources:
apc-fencing1 (stonith:fence_apc): Started node1.example.com
apc-fencing2 (stonith:fence_apc): Started node2.example.com
ipmi-fencing1 (stonith:fence_ipmilan): Started node1.example.com
ipmi-fencing2 (stonith:fence_ipmilan): Started node2.example.com
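After a level has been removed, it can be useful to check which nodes are left with only one fencing level. The sketch below reads a "pcs stonith level" style listing on standard input and prints such nodes. The function name `single_level_nodes` is hypothetical, and the parsing assumes the "Node:" / "Level N - device" layout shown in this article; other pcs versions may format the listing differently.

```shell
#!/bin/sh
# Reads a "pcs stonith level" style listing on stdin and prints every node
# that has fewer than two levels. Assumes the "Node:" / "Level N - device"
# layout shown in this article.
single_level_nodes() {
    awk '
        $1 == "Node:" {
            node = $2
            if (!(node in count)) { count[node] = 0; order[++n] = node }
        }
        $1 == "Level" && node != "" { count[node]++ }
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] < 2) print order[i]
        }
    '
}

# Sample input copied from the listing after the level removal above.
single_level_nodes <<'EOF'
Node: node1.example.com
Level 1 - ipmi-fencing1
Node: node2.example.com
Level 1 - ipmi-fencing2
Level 2 - apc-fencing2
EOF
```

On a cluster node the live listing could be piped in instead, for example `pcs stonith level | single_level_nodes`.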