Using SCSI Persistent Reservation Fencing (fence_scsi) with pacemaker in a Red Hat High Availability cluster

Issue

  • How do I configure fence_scsi in Pacemaker clusters?
  • Are any special settings required for fence_scsi when using Pacemaker?

Environment

  • Red Hat Enterprise Linux (RHEL) 6, 7, 8, or 9 with the High Availability Add-On
  • A Pacemaker cluster using fence_scsi for fencing

Resolution

The general principles behind SCSI Persistent Reservation fencing in Pacemaker are the same as for standard cman clusters, but the specific method for implementing fence_scsi fencing with Pacemaker's stonith-ng is slightly different. Review the previously linked article first for an overview of the general concepts before proceeding with the following instructions.

RHEL 6, 7, 8, or 9 Pacemaker cluster

Configure the device in stonith-ng
After starting the pacemaker service, configure a new fence_scsi device using the pcs command below:

# pcs stonith create scsi fence_scsi pcmk_host_list="node1.example.com node2.example.com" pcmk_reboot_action="off" meta provides="unfencing" --force
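The resulting device and its meta attributes can then be checked with pcs; a sketch (note that on RHEL 8 and 9, `pcs stonith config` replaces `pcs stonith show`):

```shell
# Display the configured fence device and confirm that the
# provides=unfencing meta attribute is present
# (use `pcs stonith config scsi` on RHEL 8/9).
pcs stonith show scsi
pcs status
```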

When using CLVMD in the cluster, fence_scsi will automatically detect which devices to manage by checking which physical volumes belong to volume groups marked with the "clustered" attribute. If using the tagging variant of HA-LVM, or if there is a need to specify a different set of devices that should be fenced by fence_scsi, use the command below to specify the device list manually. When specifying multiple devices, separate them with a comma ,:

# pcs stonith create scsi fence_scsi pcmk_host_list="node1.example.com node2.example.com" pcmk_reboot_action="off" devices="/dev/mapper/mpathb,/dev/mapper/mpathc" meta provides="unfencing" --force

It is also possible to change the device list after the device is created, using "update":

# pcs stonith update scsi devices="/dev/mapper/mpathb,/dev/mapper/mpathc"

NOTE: When specifying values for the devices attribute, the only acceptable values are paths to the physical devices that comprise your shared volume groups. Do not specify local physical volumes (such as those comprising your root volume group), LVM logical volumes, or LVM volume groups. In addition, if a multipathing solution (device-mapper-multipath, emcpower, etc.) is in use, the paths to the multipath devices should be used.

When using lvmlockd in the cluster, fence_scsi must be configured with the devices parameter, unlike a RHEL 7 cluster utilizing CLVMD, which can automatically detect which devices to manage.
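A hypothetical sketch of building the devices value from the physical volumes of a shared volume group; the volume group name "shared_vg" is an assumption for this example:

```shell
# List the physical volumes backing the shared VG "shared_vg"
# (an assumed name), join them with commas, and update the device list.
DEVICES=$(pvs --noheadings -o pv_name --select vgname=shared_vg | tr -d ' ' | paste -sd, -)
pcs stonith update scsi devices="$DEVICES"
```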

RHEL 6 cman based cluster

Configure fence_pcmk device in /etc/cluster/cluster.conf

NOTE: RHEL 7 and newer do not use cman, so this section can be skipped for such environments.

Because cman and fenced are still in use in pacemaker clusters in RHEL 6, fenced will need to know what devices to use to fence a removed node. The primary device in /etc/cluster/cluster.conf should use the fence_pcmk agent, which is an interface for fenced to interact with stonith-ng. There is no need for an "unfence" device, as stonith-ng will automatically handle the "on" action when a node starts.

pcs offers a quick and easy way to configure cluster.conf for pacemaker, and should automatically configure the devices correctly:

# pcs cluster setup --name example node1.example.com node2.example.com

This would result in a configuration such as:

<?xml version="1.0"?>
<cluster config_version="201311011439" name="example">
	<clusternodes>
		<clusternode name="node1.example.com" nodeid="1">
			<fence>
				<method name="pcmk-method">
					<device name="pcmk-redirect" port="node1.example.com"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="node2.example.com" nodeid="2">
			<fence>
				<method name="pcmk-method">
					<device name="pcmk-redirect" port="node2.example.com"/>
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<cman expected_votes="1" two_node="1"/>
	<fencedevices>
		<fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
	</fencedevices>
	<rm>
		<failoverdomains/>
		<resources/>
	</rm>
</cluster>

Nothing else is needed in cluster.conf to enable fence_scsi. The remaining configuration will be done through pcs. The above configuration can be synced to all nodes using ccs_sync.
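A sketch of the sync step (ccs_sync ships with ricci on RHEL 6 and requires ricci to be running on all nodes):

```shell
# Push /etc/cluster/cluster.conf from this node to the other
# cluster nodes listed in it.
ccs_sync
```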

Test fencing

NOTE: This will cut off access to storage from the affected nodes. This should be done during an outage window only.

Check the reservation/registration status of a device managed by fence_scsi using sg_persist. For example:

# sg_persist -n -i -r -d /dev/mapper/mpathb
  PR generation=0xa0, Reservation follows:
    Key=0xf1070001
    scope: LU_SCOPE,  type: Write Exclusive, registrants only
# sg_persist -n -i -k -d /dev/mapper/mpathb
  PR generation=0xa0, 8 registered reservation keys follow:
    0xf1070002
    0xf1070002
    0xf1070001
    0xf1070001
    0xf1070001
    0xf1070001
    0xf1070002
    0xf1070002

We should see registration keys for both nodes, with one key per path to the device (or a single key per node if it is not a multipath device), and a reservation corresponding to one of those keys. If there are no reservations or registrations, or they are not set up correctly, the unfence operation failed or some part of the configuration is incorrect. Check /var/log/cluster/corosync.log for messages from stonith-ng for clues.
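As a sanity check, the per-key registration counts can be tallied with standard tools; a minimal sketch fed with the sample keys from the output above (in practice, pipe the live `sg_persist -n -i -k -d <device>` output through the same filter):

```shell
# Count occurrences of each 8-hex-digit registration key.
# Each key should appear once per path; with the 8 registrations
# above split across 2 nodes, expect a count of 4 for each key.
count_keys() {
    grep -o '0x[0-9a-f]\{8\}' | sort | uniq -c
}
count_keys <<'EOF'
    0xf1070002
    0xf1070002
    0xf1070001
    0xf1070001
    0xf1070001
    0xf1070001
    0xf1070002
    0xf1070002
EOF
```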

If unfencing succeeded and all nodes are registered, create a situation that simulates a node failure, such as panicking the system with SysRq-C or using one of the methods to simulate cluster network failure. Watch /var/log/messages for signs of success or failure:
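For example, a kernel panic can be triggered via SysRq (destructive; run only on the node that should be fenced, during a test window):

```shell
# Enable the SysRq interface, then crash this node immediately
# to simulate a sudden failure. Run ONLY on the node to be fenced.
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
```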

Nov 21 12:28:52 node1 corosync[3128]:   [TOTEM ] A processor failed, forming new configuration.
Nov 21 12:28:54 node1 corosync[3128]:   [QUORUM] Members[1]: 1
Nov 21 12:28:54 node1 corosync[3128]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 21 12:28:54 node1 crmd[3490]:   notice: crm_update_peer_state: cman_event_callback: Node node2.example.com[2] - state is now lost (was member)
Nov 21 12:28:54 node1 crmd[3490]:  warning: reap_dead_nodes: Our DC node (node2.example.com) left the cluster
Nov 21 12:28:54 node1 crmd[3490]:   notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=reap_dead_nodes ]
Nov 21 12:28:54 node1 corosync[3128]:   [CPG   ] chosen downlist: sender r(0) ip(xxx.xxx.xxx.xxx) ; members(old:2 left:1)
Nov 21 12:28:54 node1 corosync[3128]:   [MAIN  ] Completed service synchronization, ready to provide service.
Nov 21 12:28:54 node1 fenced[3194]: fencing node node2.example.com
Nov 21 12:28:54 node1 kernel: dlm: closing connection to node 2
Nov 21 12:28:54 node1 fence_pcmk[4552]: Requesting Pacemaker fence node2.example.com (reset)
Nov 21 12:28:54 node1 stonith_admin[4553]:   notice: crm_log_args: Invoked: stonith_admin --reboot node2.example.com --tolerance 5s --tag cman 
Nov 21 12:28:54 node1 stonith-ng[3486]:   notice: handle_request: Client stonith_admin.cman.4553.d7e6e17b wants to fence (reboot) 'node2.example.com' with device '(any)'
Nov 21 12:28:54 node1 stonith-ng[3486]:   notice: initiate_remote_stonith_op: Initiating remote operation reboot for node2.example.com: f503618f-d79e-429a-88eb-f9868aec24f9 (0)
Nov 21 12:28:54 node1 stonith-ng[3486]:   notice: can_fence_host_with_device: scsi can fence node2.example.com: static-list
Nov 21 12:28:54 node1 stonith-ng[3486]:   notice: can_fence_host_with_device: scsi can fence node2.example.com: static-list
Nov 21 12:28:54 node1 crmd[3490]:   notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Nov 21 12:28:54 node1 stonith-ng[3486]:   notice: log_operation: Operation 'reboot' [4555] (call 2 from stonith_admin.cman.4553) for host 'node2.example.com' with device 'scsi' returned: 0 (OK)
Nov 21 12:28:54 node1 stonith-ng[3486]:   notice: remote_op_done: Operation reboot of node2.example.com by node1.example.com for stonith_admin.cman.4553@node1.example.com.f503618f: OK
Nov 21 12:28:54 node1 fenced[3194]: fence node2.example.com success
Nov 21 12:28:55 node1 crmd[3490]:   notice: tengine_stonith_notify: Peer node2.example.com was terminated (reboot) by node1.example.com for node1.example.com: OK (ref=f503618f-d79e-429a-88eb-f9868aec24f9) by client stonith_admin.cman.4553
Nov 21 12:28:55 node1 crmd[3490]:   notice: tengine_stonith_notify: Notified CMAN that 'node2.example.com' is now fenced

Check the reservations and registrations on the devices again, confirming that those corresponding to the fenced node are now gone:

# sg_persist -n -i -r -d /dev/mapper/mpathb
  PR generation=0xa0, Reservation follows:
    Key=0xf1070002
    scope: LU_SCOPE,  type: Write Exclusive, registrants only
# sg_persist -n -i -k -d /dev/mapper/mpathb
  PR generation=0xa0, 8 registered reservation keys follow:
    0xf1070002
    0xf1070002
    0xf1070002
    0xf1070002
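To script this check, one can grep for the fenced node's key; a sketch assuming the fenced node's key is 0xf1070001, as in the sample output above:

```shell
# Verify the fenced node's registration key (0xf1070001 in this
# example) is no longer present on the device.
if sg_persist -n -i -k -d /dev/mapper/mpathb | grep -q 0xf1070001; then
    echo "key still registered - fencing did not remove it"
else
    echo "fenced node's key removed"
fi
```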