How do I configure a stonith device using agent fence_vmware_soap in a Red Hat High Availability cluster with pacemaker?
Environment
- Red Hat Enterprise Linux 6, 7, 8 or 9 (with the High Availability Add-on)
- One or more nodes running as VMware guests
- Pacemaker
Issue
- How to configure stonith agent fence_vmware_soap in a RHEL cluster with pacemaker.
Resolution
For configuration of fence_vmware_soap in rgmanager-based clusters, see Solution 68064.
Assume the following about the cluster architecture:
- Pacemaker node names are node1 and node2.
- VM names of the cluster nodes are node1-vm and node2-vm.
- <vCenter/ESXi IP address> is the IP address of the hypervisor or vCenter that is managing the cluster node VMs.
Manually verify that the fence agent is able to communicate with the fence device
First verify that the cluster node is able to reach the vCenter or hypervisor and list the VMs on it. The following command will try to connect to the vCenter with the provided credentials and list all machines.
# fence_vmware_soap -a <vCenter/ESXi IP address> -l <vCenter/ESXi username> -p <vCenter/ESXi password> --ssl-insecure -o list | egrep '(node1-vm|node2-vm)'
node1-vm,11111111-aaaa-bbbb-cccc-111111111111
node2-vm,22222222-dddd-eeee-ffff-222222222222
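The egrep filter above matches substrings, so a VM named node1-vm-old would also be listed. As a minimal sketch, assuming the list output keeps the name,UUID format shown above, an exact-match lookup could look like this. The vm_uuid function is a hypothetical helper and is not part of the fence-agents package.

```shell
# vm_uuid NAME: read "name,uuid" lines on stdin and print the UUID of the VM
# whose name equals NAME exactly. Returns non-zero if no such VM is listed.
# (Hypothetical helper for illustration only.)
vm_uuid() {
    awk -F',' -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}

# Demonstration with the sample output from this solution. In practice, pipe
# the output of "fence_vmware_soap ... -o list" into vm_uuid instead.
printf '%s\n' \
    'node1-vm,11111111-aaaa-bbbb-cccc-111111111111' \
    'node2-vm,22222222-dddd-eeee-ffff-222222222222' | vm_uuid node2-vm
```

A non-zero exit status indicates that the VM name was not found in the list, which usually means the name used in pcmk_host_map would not match either.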
If the fence_vmware_soap list action fails, make sure the following are true:
- The node is able to communicate with the vCenter/ESXi host on port 443/tcp (when using SSL) or on port 80/tcp (without SSL).
- The user has the required permissions for fencing.
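The port requirement can be checked from a cluster node before involving the fence agent at all. The following is a rough sketch that assumes bash and the coreutils timeout command are installed. The port_open function and the 5-second limit are illustrative choices, and a successful connection only shows that the port is reachable, not that the user has fencing permissions.

```shell
# port_open HOST PORT: return 0 if a TCP connection to HOST:PORT can be
# opened within 5 seconds, non-zero otherwise. Uses bash's /dev/tcp
# pseudo-device, so it needs bash but no extra packages.
# (Hypothetical helper for illustration only.)
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage (replace the placeholder with the real address):
#   port_open "<vCenter/ESXi IP address>" 443 && echo "443/tcp reachable"
#   port_open "<vCenter/ESXi IP address>" 80  && echo "80/tcp reachable"
```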
Verify that Pacemaker is able to fence each node with the stonith/fence agent.
- If the manual command succeeded in listing the VMs, getting the status, or turning off/on/rebooting a node, then the node is able to communicate with the hypervisor. Configure the stonith device with the same options that were tested manually. Some arguments of the fence_vmware_soap command and of the fence_vmware_soap fencing agent in Pacemaker have slightly different names. For this reason, check the help output of both the fence_vmware_soap command and the fence_vmware_soap fencing agent (the Diagnostic Steps section contains a shortened listing of the options used by this solution).
- Create the stonith device using the command below. The pcmk_host_map attribute maps each node name as seen by Pacemaker to the name of the virtual machine as seen by the VMware hypervisor. (For more information on the correct format for pcmk_host_map, refer to the following solution: What format should I use to specify node mappings to stonith devices in pcmk_host_list and pcmk_host_map in a RHEL High Availability pacemaker cluster?)
RHEL7:
# pcs stonith create vmfence fence_vmware_soap pcmk_host_map="<pacemaker_nodename1>:<node1-vm>;<pacemaker_nodename2>:<node2-vm>" ipaddr=<ESXi/vCenter IP address> ssl=1 login=<esxi_username> passwd=<esxi_password>
RHEL8 or later:
# pcs stonith create vmfence fence_vmware_soap pcmk_host_map="<pacemaker_nodename1>:<node1-vm>;<pacemaker_nodename2>:<node2-vm>" ip=<ESXi/vCenter IP address> ssl=1 username=<esxi_username> password=<esxi_password>
- To check the status of the stonith device and its configuration, use the commands below. (Prior to RHEL 8, replace pcs stonith status with pcs stonith show, and replace pcs stonith config vmfence with pcs stonith show vmfence --full.)
# pcs stonith status
Full list of resources:
vmfence (stonith:fence_vmware_soap): Started node1
# pcs stonith config vmfence
Resource: vmfence (class=stonith type=fence_vmware_soap)
Attributes: pcmk_host_map=node1:node1-vm;node2:node2-vm ipaddr=<ESXi/vCenter IP address> ssl=1 login=<esxi_username> passwd=<esxi_password>
- Once the stonith device is started, proceed with proper testing of fencing in the cluster.
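The pcmk_host_map value used in the create command is a list of <pacemaker_nodename>:<vm_name> pairs separated by semicolons. As a small sketch of how that string could be assembled for clusters with more than two nodes, the following uses a hypothetical build_host_map helper. The node and VM names are the ones assumed earlier in this solution.

```shell
# build_host_map "node:vm" ...: join the given pairs with ';' to form a
# pcmk_host_map value. (Hypothetical helper for illustration only.)
build_host_map() {
    local IFS=';'
    # "$*" joins all arguments using the first character of IFS.
    printf '%s\n' "$*"
}

HOST_MAP=$(build_host_map node1:node1-vm node2:node2-vm)
echo "pcmk_host_map=\"$HOST_MAP\""
# prints: pcmk_host_map="node1:node1-vm;node2:node2-vm"
```

The value must be quoted when it is passed to pcs stonith create, because the shell would otherwise treat each semicolon as a command separator.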
Additional notes and recommendations
- Using two or more separate fence devices for a shared fence_vmware_soap device makes you susceptible to the following issue:
- stonith-timeout doesn't work as expected in a RHEL 6 or 7 High Availability cluster with pacemaker.
- If you have a two-node cluster, consider setting a 'pcmk_delay_max' parameter to prevent fence race scenarios.
- If you suspect that the connection to the hypervisor takes too long and the stonith device times out, run the fence agent manually under the time command to measure how long it takes.
Errors
* vmfence_start_0 on node1 'unknown error' (1): call=26, status=Timed Out, exitreason='none',
last-rc-change='DATE TIME', queued=0ms, exec=20026ms
- To avoid timing out after the default timeout when running the fence agent manually, use one or more of the following options: --login-timeout=[seconds], --power-timeout=[seconds], --shell-timeout=[seconds]
# time fence_vmware_soap -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl -z -o list |egrep "(node1-vm|node2-vm)"
# time fence_vmware_soap -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl --ssl-insecure -o list |egrep "(node1-vm|node2-vm)"
# time fence_vmware_soap -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl --power-timeout=120 -o list |egrep "(node1-vm|node2-vm)"
# time fence_vmware_soap -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl --ssl-insecure --power-timeout=120 -o list |egrep "(node1-vm|node2-vm)"
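When the timing needs to be compared against a limit in a script rather than read from the time output, something like the following sketch may help. The elapsed_seconds function is a hypothetical helper, and the 20-second limit is only an assumption taken from the exec=20026ms value in the error shown above.

```shell
# elapsed_seconds CMD [ARGS...]: run the command, discard its output, and
# print how many whole seconds it took. (Hypothetical helper for
# illustration only.)
elapsed_seconds() {
    local start end
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example usage (placeholders must be replaced with real values):
#   secs=$(elapsed_seconds fence_vmware_soap -a <ESXi/vCenter IP address> \
#          -l <esxi_username> -p <esxi_password> --ssl --ssl-insecure -o list)
#   [ "$secs" -ge 20 ] && echo "list action took ${secs}s, close to or over the assumed limit"

# Harmless demonstration that does not contact any hypervisor:
echo "sleep 1 took $(elapsed_seconds sleep 1) second(s)"
```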
Diagnostic Steps
# fence_vmware_soap -h
...
Options:
-a, --ip=[ip] IP address or hostname of fencing device
-l, --username=[name] Login name
-p, --password=[password] Login password or passphrase
-S, --password-script=[script] Script to run to retrieve password
-z, --ssl Use ssl connection
--ssl-secure Use ssl connection with verifying certificate
--ssl-insecure Use ssl connection without verifying certificate
-o, --action=[action] Action: status, reboot (default), off or on
-v, --verbose Verbose mode
...
# pcs stonith describe fence_vmware_soap
...
Stonith options:
ipaddr (required): IP Address or Hostname
login (required): Login Name
passwd: Login password or passphrase
passwd_script: Script to retrieve password
ssl: SSL connection
ssl_secure: SSL connection with verifying fence device's certificate
ssl_insecure: SSL connection without verifying fence device's certificate
action (required): Fencing Action
verbose: Verbose mode
pcmk_host_map: A mapping of host names to ports numbers for devices that do
not support host names.
...
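Comparing the two listings above shows how the command-line options correspond to the stonith device attributes. The sketch below encodes only the pairs that appear in this solution: the ipaddr, login and passwd names from the RHEL 7 command and the describe output, and the ip, username and password names from the RHEL 8 command. The stonith_param function is a hypothetical helper, and the output of pcs stonith describe fence_vmware_soap on the actual system remains the authoritative source.

```shell
# stonith_param FLAG [RHEL_MAJOR]: print the stonith attribute name that
# corresponds to a fence_vmware_soap command-line flag. RHEL_MAJOR defaults
# to 8. Returns non-zero for flags not covered here.
# (Hypothetical helper for illustration only.)
stonith_param() {
    flag=$1
    rhel=${2:-8}
    case "$flag" in
        -a|--ip)        if [ "$rhel" -ge 8 ]; then echo ip;       else echo ipaddr; fi ;;
        -l|--username)  if [ "$rhel" -ge 8 ]; then echo username; else echo login;  fi ;;
        -p|--password)  if [ "$rhel" -ge 8 ]; then echo password; else echo passwd; fi ;;
        -z|--ssl)       echo ssl ;;
        --ssl-secure)   echo ssl_secure ;;
        --ssl-insecure) echo ssl_insecure ;;
        *)              return 1 ;;
    esac
}

stonith_param --ip 7        # prints: ipaddr
stonith_param --ip 8        # prints: ip
stonith_param --ssl-insecure
```

For example, a manual test that only succeeded with --ssl-insecure suggests setting ssl_insecure=1 on the stonith device as well.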