When to use "maintenance-mode" in a Pacemaker-based RHEL High Availability Add-On cluster?

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux (RHEL) 6, 7, 8 or 9 with the High Availability Add-On
  • Pacemaker

Issue

  • Need to know the recommended use of maintenance-mode in a RHEL High Availability Add-On cluster on RHEL 7, 8, or 9.
  • Can I use maintenance-mode=true while performing patching?
  • Can I set the cluster to maintenance-mode while patching SAP HANA environments?

Resolution

Maintenance mode (for a pacemaker cluster) is a property that can be enabled when you want to safely perform operations that could otherwise trigger resource migration or fencing. Once the maintenance activity is completed, the property can be unset.

Examples of when to use maintenance mode:

  • You need to perform disk maintenance on node1, which could temporarily cause the application resource to stop or fail.
  • You are changing a resource's configuration (e.g., IP address, port) and want to test it without Pacemaker intervening.

What happens when you enable maintenance-mode in a pacemaker environment?

  • Pacemaker stops monitoring resources, so failures won’t trigger recovery
  • No start, stop, promote, or demote actions will be taken by the cluster
  • Fencing (STONITH) will not trigger due to resource failures*

When should you NOT use maintenance-mode:

*Please NOTE that maintenance mode only prevents resource-related events from causing fencing. It does NOT prevent node-level events, such as network loss, a hung or panicked node, or an ungraceful reboot/shutdown, from being resolved by fencing. Maintenance mode is therefore not sufficient protection for activities that may cause such node-level events.

To set maintenance mode on the cluster:

# pcs property set maintenance-mode=true

To unset maintenance mode on the cluster:

# pcs property unset maintenance-mode
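
One way to confirm that the property took effect is to look for the maintenance banner in the cluster status. The following is a minimal sketch, not an official procedure. The helper name status_shows_maintenance is hypothetical, and it assumes that `pcs status` prints a line containing "Resource management is DISABLED" while maintenance-mode is true; the exact wording may differ between Pacemaker versions.

```shell
# Hypothetical helper: reads cluster status text on stdin and returns 0
# if it contains the maintenance banner.
# Assumption: the status output includes the text
# "Resource management is DISABLED" while maintenance-mode=true.
status_shows_maintenance() {
    grep -q 'Resource management is DISABLED'
}

# Usage on a cluster node (as root):
#   pcs status | status_shows_maintenance && echo "maintenance mode is active"
```

If the banner is not found, check the output of `pcs status` manually before starting any maintenance work.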

If only one particular resource managed by pacemaker is affected, then in some cases setting that resource to unmanaged might be an option instead of putting the whole cluster into maintenance mode.
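
Setting a single resource to unmanaged can be sketched as follows. The function names and the `PCS` override variable are hypothetical conveniences for this example; the underlying commands are `pcs resource unmanage` and `pcs resource manage`. The resource ID my_resource is a placeholder for an ID shown by `pcs status`.

```shell
# PCS defaults to the real pcs binary. Setting PCS=echo prints the
# arguments instead of running them (an assumption made so the commands
# can be previewed on a machine without a cluster).
PCS="${PCS:-pcs}"

# "$1" is a resource ID as listed by `pcs status`.
# Stop Pacemaker from managing this one resource only.
unmanage_resource() { "$PCS" resource unmanage "$1"; }

# Hand the resource back to Pacemaker once the work is done.
manage_resource() { "$PCS" resource manage "$1"; }

# Usage (as root on a cluster node):
#   unmanage_resource my_resource
#   ... perform the change on my_resource ...
#   manage_resource my_resource
```

The rest of the cluster stays under Pacemaker's control while the one resource is unmanaged.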

Root Cause

When the cluster is in maintenance mode, pacemaker stops managing resources and stops all monitor operations on the running resources. This means that you can manually stop and start resources without causing pacemaker to take action.

Below is a snippet from a man page for reference:

Maintenance Mode tells the cluster to go to a `hands off` mode, and not start or stop any services until told otherwise. When maintenance-mode is completed, the cluster does a sanity check of the current state of any services, and then stops or starts any that need it. 

If you modify a resource while the cluster is in maintenance mode, the resource will have to be reloaded once you leave maintenance mode. This could cause resources that depend on it through constraints (or that are part of the same group) to be stopped first.
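
Because the cluster re-checks and may stop or start resources as soon as maintenance mode ends, it can help to leave maintenance mode only after the maintenance task has succeeded. The following is a rough sketch under that assumption, not a Red Hat supplied procedure. The function with_maintenance_mode and the `PCS` override variable are hypothetical; only the two `pcs property` commands come from this article.

```shell
# PCS defaults to the real pcs binary. Setting PCS=echo prints the
# arguments instead of running them (an assumption for previewing).
PCS="${PCS:-pcs}"

# Hypothetical wrapper: enable maintenance mode, run the given command,
# and unset maintenance mode only if that command succeeded.
with_maintenance_mode() {
    "$PCS" property set maintenance-mode=true || return 1
    if "$@"; then
        "$PCS" property unset maintenance-mode
    else
        rc=$?
        # Design choice in this sketch: stay in maintenance mode on failure
        # so an administrator can inspect the state before the cluster acts.
        echo "maintenance command failed (exit $rc); cluster left in maintenance mode" >&2
        return "$rc"
    fi
}

# Usage (as root on a cluster node; ./apply_patches.sh is a placeholder):
#   with_maintenance_mode ./apply_patches.sh
```

Before leaving maintenance mode, review `pcs status` and make sure the resources are in the state the cluster expects. As noted above, this wrapper does not protect against fencing caused by node-level events.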


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.