Running a fence agent from the command line works, but fence_node fails to fence a node in a RHEL High Availability cluster
Environment
- Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add On
Issue
- Running fence_ipmilan from the command line works, but fence_node, or having the cluster try to fence a node automatically, fails.
- Using the fence agent directly on one node to fence the other succeeds, but using the fence_node command to fence the other node fails.
Resolution
- Determine if the parameters passed on the command line exactly match those in /etc/cluster/cluster.conf
  - NOTE: Extra attention may be needed to ensure special characters are escaped on the command line, but not in the configuration
- Ensure there is a valid fencedevice configured in /etc/cluster/cluster.conf and that the node in question has a valid device reference. For example:

    <clusternodes>
      <clusternode name="node1.example.com" nodeid="1" votes="1">
        <fence>
          <method name="1">
            <device name="node1-iLO"/>
          </method>
        </fence>
      </clusternode>
      [...]
    </clusternodes>
    <fencedevices>
      <fencedevice agent="fence_ipmilan" power_wait="10" auth="password" ipaddr="192.168.2.10" lanplus="1" login="Administrator" name="node1-iLO" passwd="myPasswd"/>
      [...]
    </fencedevices>
    [...]

  Note: The "name" attribute in the device element must match a fencedevice with the same "name" specified. These values are case-sensitive.
- Ensure the attributes specified in the fencedevice and device elements in /etc/cluster/cluster.conf are valid. See the manpage for that specific fence agent for details on valid attributes.
- Check /var/log/audit/audit.log or setroubleshootd to determine if SELinux may have blocked execution of the agent in any way.
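The name-matching check described above can be automated. The following is a minimal sketch, not a Red Hat tool: it assumes Python 3 is available (for example on a workstation holding a copy of cluster.conf), and the function name find_unmatched_devices and the two-node SAMPLE configuration are hypothetical, made up for illustration. It reports every device reference whose "name" has no exactly matching fencedevice, using a case-sensitive comparison.

```python
import xml.etree.ElementTree as ET


def find_unmatched_devices(conf_xml):
    """Return (node name, device name) pairs with no matching fencedevice.

    Hypothetical helper for illustration; it is not part of the cluster suite.
    """
    root = ET.fromstring(conf_xml)
    # Names are compared exactly, so "Node2-iLO" does not match "node2-iLO".
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    unmatched = []
    for node in root.iter("clusternode"):
        for device in node.iter("device"):
            if device.get("name") not in defined:
                unmatched.append((node.get("name"), device.get("name")))
    return unmatched


# Made-up sample: node2 references "Node2-iLO", but only "node2-iLO" is defined.
SAMPLE = """<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1">
      <fence><method name="1"><device name="node1-iLO"/></method></fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2" votes="1">
      <fence><method name="1"><device name="Node2-iLO"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.2.10" login="Administrator" name="node1-iLO" passwd="myPasswd"/>
    <fencedevice agent="fence_ipmilan" ipaddr="192.168.2.11" login="Administrator" name="node2-iLO" passwd="myPasswd"/>
  </fencedevices>
</cluster>"""

if __name__ == "__main__":
    for node, device in find_unmatched_devices(SAMPLE):
        print("%s: device '%s' has no matching fencedevice" % (node, device))
```

To check a real configuration, the contents of a copy of /etc/cluster/cluster.conf could be read from disk and passed to the function in place of SAMPLE. The sketch only checks name references; it does not validate agent-specific attributes, which still need to be compared against the agent's manpage.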
Root Cause
The most common reason a fence agent works when run directly from the command line while fence_node fails is a misconfiguration in /etc/cluster/cluster.conf. When this happens, compare the settings in the configuration closely against the working command line and look for any mismatch. The one exception is special characters, which may need escaping on the command line but should not be escaped in the configuration.
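The difference between the two forms of a value with special characters can be illustrated with a short sketch. It assumes Python 3, and the helper build_agent_command and the password value are hypothetical. The fence_ipmilan options used here (-a, -l, -p, -P, -o) are given as commonly documented for that agent and should be confirmed against the manpage on the system in question. The value stored in cluster.conf stays as-is, while the command-line form is quoted for the shell.

```python
import shlex


def build_agent_command(agent, options):
    """Build a shell-safe command line from (flag, value) pairs.

    Hypothetical helper; a value of None means the flag takes no argument.
    """
    parts = [agent]
    for flag, value in options:
        parts.append(flag)
        if value is not None:
            # shlex.quote adds quoting only when the shell would need it.
            parts.append(shlex.quote(value))
    return " ".join(parts)


# Made-up password, exactly as it would appear in the passwd attribute.
passwd_in_conf = "my$Passwd!"

cmd = build_agent_command(
    "fence_ipmilan",
    [
        ("-a", "192.168.2.10"),
        ("-l", "Administrator"),
        ("-p", passwd_in_conf),
        ("-P", None),
        ("-o", "status"),
    ],
)

if __name__ == "__main__":
    print(cmd)
```

Parsing the resulting string back with shlex.split recovers the unquoted password, which is the value the configuration should contain.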
If the issue persists, in some cases where SELinux is in enforcing mode, an operation carried out by the agent may be getting blocked.
Nearly all issues of this nature boil down to one of the above conditions, so it is strongly recommended that a focus be placed on investigating those two items.