A volume group managed by an LVM-activate resource is active on multiple nodes in a Pacemaker cluster
Environment
- Red Hat Enterprise Linux 8, 9 (with the High Availability Add-on)
Issue
- A volume group managed by an LVM-activate resource is active on node1 even though it has node2's system ID attached to it. It should not be allowed to activate on node1.
- A cluster-managed volume group gets activated at boot time by the lvm2-pvscan service.
- After Pacemaker started on a node that had just been fenced, the scheduler logged a message like the following for an LVM-activate resource with vg_access_mode=system_id:
Mar 18 14:56:13 node1 pacemaker-schedulerd[4757]: error: Resource halvm is active on 2 nodes (attempting recovery)
Resolution
Add all local (i.e., non-cluster-managed) volume groups to the auto_activation_volume_list parameter in the activation section of /etc/lvm/lvm.conf. For example, if there are two local volume groups named rhel and local_vg, and there are two cluster-managed volume groups named clus_vg1 and clus_vg2, then the auto_activation_volume_list parameter should look like this:
auto_activation_volume_list = [ "rhel", "local_vg" ]
If there are no non-cluster-managed volume groups, then auto_activation_volume_list should be explicitly set to an empty list, as shown below.
auto_activation_volume_list = [ ]
Then rebuild the initramfs. It is recommended that the cluster nodes are rebooted after rebuilding the initramfs to verify that only local volumes are activated.
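Before rebuilding the initramfs, it can be worth confirming that no cluster-managed volume group slipped into the list. The sketch below is a minimal, self-contained check: it uses a sample file under /tmp and the example VG names from above (clus_vg1, clus_vg2); on a real node, point it at /etc/lvm/lvm.conf instead.

```shell
# Minimal sketch: confirm cluster-managed VGs are excluded from
# auto_activation_volume_list. A sample fragment is used here; on a
# real node, check /etc/lvm/lvm.conf itself.
cat > /tmp/lvm.conf.sample <<'EOF'
activation {
    auto_activation_volume_list = [ "rhel", "local_vg" ]
}
EOF

# Extract the uncommented setting line, then verify that none of the
# cluster-managed VG names (clus_vg1, clus_vg2 in this example) appear.
setting=$(grep -E '^[[:space:]]*auto_activation_volume_list' /tmp/lvm.conf.sample)
for vg in clus_vg1 clus_vg2; do
    if printf '%s\n' "$setting" | grep -q "\"$vg\""; then
        echo "WARNING: cluster-managed VG $vg would auto-activate at boot"
    fi
done
echo "check complete"
```

If the check prints a warning, remove the offending VG from the list before rebuilding the initramfs.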
For more information, see: Configuring and managing logical volumes, Red Hat Enterprise Linux 8 | 14.1. Controlling autoactivation of logical volumes.
Related articles
- What is a Highly Available LVM (HA-LVM) configuration and how do I implement it?
- How to configure HA-LVM Cluster using system_id in RHEL 8?
- How to configure HA-LVM Cluster using tagging & volume_list in RHEL 8?
- How to configure auto_activation_volume_list in lvm.conf file to activate the specific VGs
- Configuring and managing logical volumes Red Hat Enterprise Linux 8 | 14.1. Controlling autoactivation of logical volumes
- Chapter 5. Configuring an active/passive Apache HTTP server in a Red Hat High Availability cluster Red Hat Enterprise Linux 8
- Chapter 5. Configuring an active/passive Apache HTTP server in a Red Hat High Availability cluster Red Hat Enterprise Linux 9
Root Cause
When a RHEL system boots, the lvm2-pvscan systemd service runs pvscan --cache -aay. This automatically activates volume groups. From the pvscan(8) man page:
When the --cache and -aay options are used, pvscan records which PVs are available on the system, and activates LVs in completed VGs. A VG is complete when pvscan sees that the final PV in the VG has appeared. This is used by event-based system startup (systemd, udev) to activate LVs.
...
pvscan --cache
This first clears all existing PV online records, then scans all devices on the system, adding PV online records for any PVs that are found.
pvscan --cache -aay
This begins by performing the same steps as pvscan --cache. Afterward, it activates LVs in any complete VGs.
...
Auto-activation of VGs or LVs can be enabled/disabled using:
lvm.conf(5) activation/auto_activation_volume_list
For more information, see:
lvmconfig --withcomments activation/auto_activation_volume_list
To disable auto-activation, explicitly set this list to an empty list, i.e. auto_activation_volume_list = [ ].
When this setting is undefined (e.g., commented out), all LVs are auto-activated.
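The distinction above can be sketched with two sample config fragments (used here in place of /etc/lvm/lvm.conf so the check is self-contained): an explicitly empty list disables autoactivation entirely, while a commented-out line leaves it fully enabled.

```shell
# Minimal sketch: an explicitly empty auto_activation_volume_list
# disables autoactivation; a commented-out line means all LVs
# auto-activate. Sample files stand in for /etc/lvm/lvm.conf.
cat > /tmp/lvm-disabled.conf <<'EOF'
activation {
    auto_activation_volume_list = [ ]
}
EOF
cat > /tmp/lvm-default.conf <<'EOF'
activation {
    # auto_activation_volume_list = [ "vg1" ]
}
EOF

check() {
    # An uncommented setting line restricts autoactivation to the
    # listed VGs; no match means the setting is undefined.
    if grep -Eq '^[[:space:]]*auto_activation_volume_list' "$1"; then
        echo "$1: autoactivation restricted to the listed VGs"
    else
        echo "$1: setting undefined, all LVs auto-activate"
    fi
}
check /tmp/lvm-disabled.conf
check /tmp/lvm-default.conf
```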
An LVM-activate resource, when configured with the option vg_access_mode=system_id, manages a shared volume group or logical volume in an active/passive manner by setting the volume's LVM systemid to the systemid of the node where that volume should be active.
When a node that's running an LVM-activate resource reboots, it runs pvscan --cache -aay as described above. If no other node has recovered the resource and assigned its own systemid to the volume, then the volume still belongs to the rebooted node, and the rebooted node activates it.
This is logically incorrect behavior, since only Pacemaker should activate or deactivate the volume. However, it doesn't cause a problem in and of itself.
The problem arises if another node (e.g., node2) starts the LVM-activate resource that manages the volume, after the rebooted node (e.g., node1) has already activated the volume. At that point, the volume is active on both nodes but has node2's systemid attached to it. This places the volume in a position such that its data is vulnerable to corruption.
The issue has been observed when fencing takes a long time to return a successful result. In this scenario, node2 fenced node1, which was running the LVM-activate resource. node1 rebooted and activated the managed volume during boot-up, while node1's systemid was still attached to the volume. A few seconds later, the fence action was declared a success, and node2 was allowed to recover the LVM-activate resource. node2 then started the resource and assigned its own systemid to the volume. At that point, the volume was active on both nodes with node2's systemid.
The solution is to configure the auto_activation_volume_list parameter in /etc/lvm/lvm.conf, and to exclude from the list all volumes that are managed by LVM-activate resources with vg_access_mode=system_id. That way, only Pacemaker activates the volumes; they cannot be activated automatically at boot or by LVM commands with the -aay option.
Diagnostic Steps
- Run lvs on all cluster nodes and check whether the cluster-managed active/passive volume is active on multiple nodes.
- Check the auto_activation_volume_list parameter in the activation section of /etc/lvm/lvm.conf and confirm that either the parameter is not configured or it contains the cluster-managed volume.
- Check /var/log/messages and determine that the volume was activated automatically during boot:

Mar 18 18:34:49 node1 lvm[1678]: pvscan[1678] PV /dev/sdb1 online, VG KBP is complete.
Mar 18 18:34:49 node1 lvm[1678]: pvscan[1678] VG KBP run autoactivation.
...
Mar 18 18:34:50 node1 lvm[1678]: 1 logical volume(s) in volume group "KBP" now active
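The log check above can be sketched as a simple grep. The excerpt below is a sample file matching the messages shown, used so the sketch is self-contained; on a real node, search /var/log/messages directly (or the systemd journal with journalctl -b).

```shell
# Minimal sketch: search boot-time log lines for pvscan autoactivation
# of a cluster-managed VG. A sample excerpt stands in for
# /var/log/messages here.
cat > /tmp/messages.sample <<'EOF'
Mar 18 18:34:49 node1 lvm[1678]: pvscan[1678] PV /dev/sdb1 online, VG KBP is complete.
Mar 18 18:34:49 node1 lvm[1678]: pvscan[1678] VG KBP run autoactivation.
Mar 18 18:34:50 node1 lvm[1678]: 1 logical volume(s) in volume group "KBP" now active
EOF

# A "run autoactivation" line for the cluster-managed VG during boot
# indicates lvm2-pvscan activated it outside of Pacemaker's control.
grep 'run autoactivation' /tmp/messages.sample
```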
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.