Running a fence agent from the command line works, but fence_node fails to fence a node in a RHEL High Availability cluster

Solution Unverified - Updated 7 Aug 2024

Environment

Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add-On

Issue

Running fence_ipmilan from the command line works, but fence_node, or the cluster attempting to fence a node automatically, fails.
Using the fence agent directly on one node to fence the other node succeeds, but using the fence_node command to fence that same node fails.

Resolution

Determine if the parameters passed on the command line exactly match those in /etc/cluster/cluster.conf
- NOTE: Extra attention may be needed to ensure special characters are escaped on the command line, but not in the configuration
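
To illustrate the note above, here is a minimal Python 3 sketch of the difference between the two forms of the same value. The password `myPa$$wd!` and the helper name `quote_for_shell` are hypothetical and chosen only for illustration. The helper uses the standard library's `shlex.quote` to show what a POSIX shell would need, while the value placed in /etc/cluster/cluster.conf stays as the literal string.

```python
import shlex


def quote_for_shell(value):
    """Return the value quoted so a POSIX shell passes it through unchanged."""
    return shlex.quote(value)


if __name__ == "__main__":
    password = "myPa$$wd!"  # hypothetical password with shell metacharacters
    # Form to type on the command line:
    print("command line:", quote_for_shell(password))
    # Form to place in the passwd attribute (no shell escaping):
    print("cluster.conf:", password)
```

If the two forms are mixed up, for example by copying shell quotes or backslashes into the configuration, the agent receives a different password from fence_node than it did on the command line.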

Ensure there is a valid fencedevice configured in /etc/cluster/cluster.conf and that the node in question has a valid device reference. For example:

  <clusternodes>
          <clusternode name="node1.example.com" nodeid="1" votes="1">
                  <fence>
                          <method name="1">
                                  <device name="node1-iLO"/>
                          </method>
                  </fence>
          </clusternode>
          [...]
  </clusternodes>
  <fencedevices>
          <fencedevice agent="fence_ipmilan" power_wait="10" auth="password" ipaddr="192.168.2.10" lanplus="1" login="Administrator" name="node1-iLO" passwd="myPasswd"/>
          [...]
  </fencedevices>
  [...]

Note: The "name" attribute of the device element must exactly match the "name" attribute of a fencedevice element. These values are case-sensitive.
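
The name check described in this note can be sketched as a small Python 3 script. This is only an illustration, not a Red Hat tool: the function name `find_unmatched_devices` and the sample configuration are hypothetical, and the sample deliberately contains a case mismatch ("node1-ilo" versus "node1-iLO"). It assumes Python 3 is available, for example on a workstation holding a copy of cluster.conf.

```python
import xml.etree.ElementTree as ET


def find_unmatched_devices(conf_xml):
    """Return (node name, device name) pairs with no matching fencedevice."""
    root = ET.fromstring(conf_xml)
    # Names of all defined fence devices; the comparison is case-sensitive.
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    unmatched = []
    for node in root.iter("clusternode"):
        for dev in node.iter("device"):
            if dev.get("name") not in defined:
                unmatched.append((node.get("name"), dev.get("name")))
    return unmatched


# Hypothetical configuration with a deliberate case mismatch.
SAMPLE = """<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1">
      <fence><method name="1"><device name="node1-ilo"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" name="node1-iLO" ipaddr="192.168.2.10"/>
  </fencedevices>
</cluster>"""

if __name__ == "__main__":
    for node, dev in find_unmatched_devices(SAMPLE):
        print("%s: device '%s' has no matching fencedevice" % (node, dev))
```

To check a real file, read its contents and pass the text to `find_unmatched_devices`. An empty result means every device reference has a matching fencedevice name.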

Ensure the attributes specified in the fencedevice and device elements in /etc/cluster/cluster.conf are valid. See the manpage for that specific fence agent for details on valid attributes.
Check /var/log/audit/audit.log or setroubleshootd to determine if SELinux may have blocked execution of the agent in any way.
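
As a rough aid for the audit log check, the following Python 3 sketch filters log lines for AVC denials that mention a fence agent. The function name `find_fence_denials` is hypothetical, and the sample lines are synthetic, written only to exercise the filter. Real audit records carry more fields, so a match is a starting point for investigation and not proof of the cause.

```python
def find_fence_denials(lines, agent="fence_"):
    """Return lines that record an AVC denial and mention the given agent."""
    hits = []
    for line in lines:
        if "avc:" in line and "denied" in line and agent in line:
            hits.append(line.rstrip("\n"))
    return hits


# Synthetic sample lines for illustration only, not real log output.
SAMPLE_LOG = [
    'type=AVC msg=audit(0.000:1): avc:  denied  { execute } for  pid=1 comm="fence_ipmilan"\n',
    'type=USER_LOGIN msg=audit(0.000:2): user pid=2 uid=0\n',
]

if __name__ == "__main__":
    for hit in find_fence_denials(SAMPLE_LOG):
        print(hit)
```

To scan the real log, open /var/log/audit/audit.log as root and pass the file object to `find_fence_denials`. Passing a specific agent name such as "fence_ipmilan" narrows the match.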

Root Cause

When fencing succeeds with the agent run directly from the command line but fails with fence_node, the most common cause is a misconfiguration in /etc/cluster/cluster.conf. When this happens, compare the settings in the configuration closely against the working command-line parameters and look for any mismatch. The one expected difference is special characters, which may need escaping on the command line but should not be escaped in the configuration.

If the issue persists and SELinux is in enforcing mode, SELinux may be blocking an operation carried out by the agent.

Nearly all issues of this nature come down to one of these two conditions, so it is strongly recommended to focus the investigation on them.

SBR

Clusterha

Product(s)

Red Hat Enterprise Linux

Components

cluster

Category

Troubleshoot
