What format should I use to specify node mappings to stonith devices in pcmk_host_list and pcmk_host_map in a RHEL High Availability pacemaker cluster?

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 6, 7, 8, or 9 (with the High Availability Add-on)
  • pacemaker
  • One or more stonith devices

Issue

  • Should I specify the hostname or IP address in pcmk_host_list (or pcmk_host_map)?
  • Should I specify my nodes using FQDN or shortname in pcmk_host_list (or pcmk_host_map)?
  • What format should the nodes be listed in for the pcmk_host_list or pcmk_host_map stonith device attribute?
  • Does stonith expect nodes to be listed by nodename or hostname or IP in pcmk_host_list and pcmk_host_map?

Resolution

The name(s) provided to pcmk_host_list and pcmk_host_map should match the node names exactly as they appear in pcs status output.

[root@cs-rh7-1 ~]# pcs status
Cluster name: cs-rh7-cluster
Last updated: Mon Sep 12 12:32:03 2016          Last change: Thu Sep  8 17:07:02 2016 by root via cibadmin on cs-rh7-1-clust.examplerh.com
Stack: corosync
Current DC: cs-rh7-2-clust.examplerh.com (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 17 resources configured

Online: [ cs-rh7-1-clust.examplerh.com cs-rh7-2-clust.examplerh.com ]  
[...]

In this example, the names used should be cs-rh7-1-clust.examplerh.com and cs-rh7-2-clust.examplerh.com.
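
To avoid copy-and-paste mistakes, the node names can be pulled out of the `pcs status` output with a small filter. The following is a minimal sketch, not an official tool: the function name `online_nodes` is hypothetical, and it assumes the "Online: [ ... ]" line format shown above (optionally prefixed with "  * ", as newer releases print it). Nodes that are currently offline are not on that line and would need to be checked separately.

```shell
# Hypothetical helper: print each online node name from `pcs status`
# output read on stdin, one name per line.
online_nodes() {
    # Match the "Online: [ ... ]" line and keep only the text between
    # the brackets, then put each name on its own line.
    sed -n 's/^[ *]*Online: \[ \(.*\) ].*$/\1/p' | tr -s ' ' '\n'
}

# Usage on a cluster node:
#   pcs status | online_nodes
```

The names printed this way are the strings to use in pcmk_host_list and as the left-hand side of each pcmk_host_map entry.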

pcmk_host_list format


This attribute takes a list of node names separated by spaces, commas, or semi-colons. Each name should exactly match the name `pacemaker` uses for the node. That name is derived from the base configuration in `/etc/corosync/corosync.conf` (RHEL 7 and later) or `/etc/cluster/cluster.conf` (RHEL 6), and it is also shown in `pcs status` output.

Single-node-per-device example:

# pcs stonith create node1-ipmi fence_ipmilan ipaddr=192.168.10.11 login=Admin passwd='myPassword' lanplus=1 pcmk_host_list="cs-rh7-1-clust.examplerh.com"

Multiple-nodes-per-device example:

# pcs stonith create scsifence fence_scsi pcmk_host_list="cs-rh7-1-clust.examplerh.com,cs-rh7-2-clust.examplerh.com"
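
Before creating the device, it can help to check a candidate pcmk_host_list value against the cluster's node names. The sketch below is illustrative only: `check_host_list` is a hypothetical helper, not part of pcs or pacemaker, and it simply splits the value on the three delimiters described above and requires an exact string match for every name.

```shell
# Hypothetical helper: report names in a pcmk_host_list value that are
# not cluster node names.
#   $1 = pcmk_host_list value
#   $2 = space-separated cluster node names (for example, from pcs status)
# Returns 0 when every name matches exactly, 1 otherwise.
check_host_list() {
    rc=0
    # Split the list on comma, semi-colon, or space.
    for name in $(printf '%s' "$1" | tr ',;' '  '); do
        case " $2 " in
            *" $name "*) ;;   # exact match found
            *) echo "not a cluster node name: $name"; rc=1 ;;
        esac
    done
    return $rc
}

# Example: a shortname where the cluster uses the FQDN is reported.
#   check_host_list "cs-rh7-1-clust" "cs-rh7-1-clust.examplerh.com cs-rh7-2-clust.examplerh.com"
```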

pcmk_host_map format


This attribute takes a list of map entries separated by a space or semi-colon. Each entry should be of the form "<node_name>:<port/vm/list_name>", so a two-node map looks like "<node_name>:<port/vm/list_name>;<node_name>:<port/vm/list_name>". The "node name" is the name reported by `pcs status`, which is derived from the base cluster configuration (see "`pcmk_host_list` format" above). The "port/vm/list name" is the string by which the device refers to that node, as seen in the agent's `-o list` output. Each entry thus maps the cluster's name for a node (the node name) to the device's name for it (the list name). Commas do not separate entries; they separate multiple ports for a single node, as described below.

This map can also accommodate setups where a node name maps to multiple "ports" or entries on a single device. For example, if a node is connected to two separate ports on an APC power switch, the fence_apc stonith device needs to be told which labels are applied to that node's ports so that it can find those ports and interact with them. To convey this, separate the individual ports for a given node with commas.

Single-port-per-node example where the APC has these nodes' ports labeled "rh7-nodeX":

# pcs stonith create apcfence fence_apc ipaddr=192.168.10.12 login=Admin passwd='myPassword' secure=1  pcmk_host_map="cs-rh7-1-clust.examplerh.com:rh7-node1;cs-rh7-2-clust.examplerh.com:rh7-node2"

Multiple-ports-per-node example where the APC has these nodes' ports labeled "rh7-nodeX-portY":

# pcs stonith create apcfence fence_apc ipaddr=192.168.10.12 login=Admin passwd='myPassword' secure=1  pcmk_host_map="cs-rh7-1-clust.examplerh.com:rh7-node1-port1,rh7-node1-port2;cs-rh7-2-clust.examplerh.com:rh7-node2-port1,rh7-node2-port2"
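
To see how such a value breaks down, the following sketch prints one "node -> port" line per mapping. It is only an illustration of the format described above, not how pacemaker parses the attribute: `show_host_map` is a hypothetical helper, and it assumes the value contains no backslash-escaped characters.

```shell
# Hypothetical helper: print "node -> port" lines for a pcmk_host_map
# value passed as $1. Assumes no backslash-escaped characters.
show_host_map() {
    # Entries are separated by semi-colon or space.
    for entry in $(printf '%s' "$1" | tr ';' ' '); do
        node=${entry%%:*}    # text before the first colon: cluster node name
        ports=${entry#*:}    # text after the first colon: the device's name(s)
        # Multiple ports for one node are separated by commas.
        for port in $(printf '%s' "$ports" | tr ',' ' '); do
            echo "$node -> $port"
        done
    done
}

# Example:
#   show_host_map "cs-rh7-1-clust.examplerh.com:rh7-node1-port1,rh7-node1-port2;cs-rh7-2-clust.examplerh.com:rh7-node2-port1,rh7-node2-port2"
```

Each left-hand name should be a cluster node name from `pcs status`, and each right-hand name should appear in the fence agent's `-o list` output.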

pcmk_host_map format using special characters


With the release of erratum [RHBA-2022:1885](/errata/RHBA-2022:1885), which ships `pacemaker-2.1.2-4.el8` or later, the pacemaker property `pcmk_host_map` [allows](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/8.6_release_notes/index#enhancement_high-availability-and-clusters) *"special ASCII characters"* in the value of each key/value pair. For example, you can specify `pcmk_host_map="node3:plug\ 1"` to include a space character in the host alias, so that the space is not treated as a delimiter. For more information, see [Space in the VM names on VMware Hypervisor causes `fence_vmware_soap` to fail with "Unable to obtain correct plug status or plug is not available".](/solutions/5277821)
  • The "special characters" that can be escaped in pcmk_host_map are limited to ASCII characters. For example, UTF-8 characters cannot be escaped properly.
  • Only the value in each key/value pair of pcmk_host_map can contain escaped ASCII characters. The key cannot contain escaped characters.

Single-port-per-node example where the APC has these nodes' ports labeled with names that include a space, "rh7 nodeX":

# pcs stonith create apcfence fence_apc ipaddr=192.168.10.12 login=Admin passwd='myPassword' secure=1  pcmk_host_map="cs-rh7-1-clust.examplerh.com:rh7\ node1;cs-rh7-2-clust.examplerh.com:rh7\ node2"
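
When device-side names are generated by a script, the backslash can be added automatically. This is a minimal sketch that handles only the space character, since that is the case shown above; the helper names `escape_plug_name` and `host_map_entry` are hypothetical.

```shell
# Hypothetical helper: backslash-escape spaces in a device-side name so
# that they are not treated as entry delimiters in pcmk_host_map.
escape_plug_name() {
    printf '%s\n' "$1" | sed 's/ /\\ /g'
}

# Hypothetical helper: build one "<node_name>:<escaped name>" entry.
# The node name ($1) is left untouched because only the value may be escaped.
host_map_entry() {
    printf '%s:%s\n' "$1" "$(escape_plug_name "$2")"
}

# Example:
#   host_map_entry cs-rh7-1-clust.examplerh.com "rh7 node1"
```

Entries built this way can then be joined with semi-colons and passed as the pcmk_host_map value, on pacemaker versions that support the escaping.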

Updating pcmk_host_list and pcmk_host_map


To update the value of pcmk_host_map or pcmk_host_list on an already configured fence resource:

 - For pcmk_host_map:
        # pcs stonith update <stonith ID> pcmk_host_map="<cluster-nodename1>:<VM-Name>;<cluster-nodename2>:<VM-Name>"

 - For pcmk_host_list:
        # pcs stonith update <stonith ID> pcmk_host_list="<cluster-nodename1>"

Root Cause

When the cluster triggers a fence action, it asks the stonith daemon to fence the node. The stonith daemon looks for a fence resource that is capable of fencing that cluster node. To find one, it checks the pcmk_host_list or pcmk_host_map attribute of each fence resource for the cluster node name. If the cluster node name does not match any configured fence resource, the cluster logs a failure and the fence operation does not reboot the node.

The first field of each pcmk_host_map entry is the cluster node name as seen in the /etc/corosync/corosync.conf file. The second field, after the colon, is the name by which the fence device knows the node, for example the server name as seen in the hypervisor.

    # cat /etc/corosync/corosync.conf
    [...]
    nodelist {
        node {
            ring0_addr: node1  <=== Cluster node name
            nodeid: 1
        }

        node {
            ring0_addr: node2  <=== Cluster node name
            nodeid: 2
        }
    }

Similarly, the pcmk_host_list attribute should contain only cluster node names as shown above.
The cluster node name can also be verified from the CIB. In the output of the following command, the value of the uname attribute is the cluster node name:

# pcs cluster cib | grep "node id"
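
To print only the names rather than the full XML lines, the output can be filtered further. The sketch below assumes each node element looks like `<node id="1" uname="node1"/>`, which is an illustrative sample rather than output captured from this cluster; `cib_node_names` is a hypothetical helper name.

```shell
# Hypothetical helper: print the uname value of every <node ...> element
# in CIB XML read on stdin, one name per line.
cib_node_names() {
    sed -n 's/.*<node [^>]*uname="\([^"]*\)".*/\1/p'
}

# Usage on a cluster node:
#   pcs cluster cib | cib_node_names
```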
