A stonith resource attempts to fence a cluster node while it is in stopped state on a pacemaker cluster.


Environment

  • Red Hat Enterprise Linux Server 6, 7, 8 or 9 (with the High Availability and Resilient Storage Add-Ons)
  • pacemaker

Issue

  • A stonith resource attempts to fence a cluster node while it is in stopped state on a pacemaker cluster.

Resolution

To prevent a stonith device from being used to fence a cluster node, the stonith device must be Disabled or Banned.

For example, the first command below disables the stonith device, and the second bans it for a specific cluster node.

# pcs resource disable <stonith device> 
# pcs resource ban <stonith device> <cluster node>

If the stonith device is in a Stopped state (and not Stopped (Disabled)), it is not removed from the list of valid stonith devices that can be used on a cluster node, so fencing will still be attempted with that stonith device. The stonith device is removed from the list of possible stonith devices only when it is Disabled or Banned.
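
The choice between disabling and banning can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names `run` and `exclude_stonith_device` and the `DRY_RUN` variable are hypothetical and not part of pcs. By default the helper only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical helper: print (default) or execute a command.
# DRY_RUN is an assumed variable for this sketch; set DRY_RUN=0 to execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: with only a device name, disable the stonith device;
# with a device name and a node name, ban the device for that node.
exclude_stonith_device() {
    device=${1:-}
    node=${2:-}
    if [ -z "$device" ]; then
        echo "usage: exclude_stonith_device <stonith device> [cluster node]" >&2
        return 1
    fi
    if [ -n "$node" ]; then
        run pcs resource ban "$device" "$node"
    else
        run pcs resource disable "$device"
    fi
}
```

For example, `exclude_stonith_device fence_node1` prints `+ pcs resource disable fence_node1` in the default dry-run mode, where `fence_node1` is a placeholder device name.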

Root Cause

The stonith device was only in a "Stopped" state and was not disabled. As a result, it was still listed as a possible stonith device for the cluster node, and a fence attempt was executed with that stonith device.

Note that in older versions of pacemaker, disabling or banning the stonith device still resulted in fencing. That issue is resolved in later releases (RHEL 6.8, RHEL 7.1.z).

Diagnostic Steps

Check whether the stonith device is listed as "Stopped" (and not as disabled) in the output of pcs status.
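
This check could be scripted roughly as follows. The sketch assumes that each stonith device appears on a single line of the pcs status output in the form `<name> (stonith:<agent>): Stopped`, optionally preceded by a `*` bullet and optionally followed by a disabled marker. The exact layout varies between releases, so treat the pattern as an assumption to verify against real output. The function name `stopped_not_disabled` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` text on stdin and print the names of
# stonith devices that are Stopped but not marked as disabled.
# Assumed line format: [* ]<name> (stonith:<agent>): Stopped [(disabled)]
stopped_not_disabled() {
    awk '
        /\(stonith:/ && /Stopped/ && tolower($0) !~ /disabled/ {
            # Skip a leading "*" bullet if present, then print the device name.
            name = ($1 == "*") ? $2 : $1
            print name
        }
    '
}

# Usage on a cluster node (requires pcs):
#   pcs status | stopped_not_disabled
```

Any device name printed by this filter is Stopped without being disabled, which is the state described in the Root Cause section.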

