How to configure the stonith agent `fence_xvm` in a pacemaker cluster when the cluster nodes are KVM guests running on different KVM hosts.

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux Server 6 (with the High Availability Add-On)
  • Red Hat Enterprise Linux Server 7 or 8 (with the High Availability Add-On)
    • pacemaker cluster

Issue

  • How to configure the stonith agent fence_xvm in a pacemaker cluster when the cluster nodes are KVM guests on different KVM hosts.
  • I have cluster nodes running as VMs on different KVM hosts. How do I configure the fence agent fence_xvm for this setup?

Resolution

Configuration on KVM hosts
1. Install packages

# yum install fence-virt fence-virtd fence-virtd-libvirt fence-virtd-multicast

2. Create fence key on each KVM host

# mkdir -p /etc/cluster
# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4k count=1
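
The dd command above produces a 4096-byte random key. A quick sanity check can confirm the key was written with the expected size; this sketch uses a temporary directory rather than /etc/cluster for illustration:

```shell
# Generate a 4k random key as in step 2, but into a temp dir for illustration
tmpdir=$(mktemp -d)
dd if=/dev/urandom of="$tmpdir/fence_xvm.key" bs=4k count=1 2>/dev/null

# The key should be exactly 4096 bytes
size=$(wc -c < "$tmpdir/fence_xvm.key")
echo "key size: $size"

rm -rf "$tmpdir"
```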

3. Rename the fence key on each KVM host
For example on host1, the command would be

 # mv /etc/cluster/fence_xvm.key /etc/cluster/fence_xvm_host1.key

For host2, the command would be

 # mv /etc/cluster/fence_xvm.key /etc/cluster/fence_xvm_host2.key

4. Copy the fence key from each KVM host to all the nodes (VMs) of the cluster.
For example, in a 7-node cluster with 7 VMs (cluster nodes on 7 different hosts), each VM (each node of the cluster) will end up with 7 keys, one per host.

 # scp /etc/cluster/fence_xvm_host$i.key node$i:/etc/cluster

It is important that each physical host has its own distinct key.
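
The key distribution in step 4 can be scripted from a machine with SSH access to all nodes. A sketch, assuming hypothetical names host1/host2 and node1/node2 (adjust the lists to your environment); the scp commands are only echoed here so you can review them before running them for real:

```shell
# Hypothetical host and node names -- replace with your own
hosts="host1 host2"
nodes="node1 node2"

for h in $hosts; do
    for n in $nodes; do
        # Every node needs the key of every host; drop the "echo"
        # once the generated command list looks right
        echo "scp /etc/cluster/fence_xvm_${h}.key ${n}:/etc/cluster/"
    done
done
```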

5. Configure fence_virtd daemon on each KVM host

 # fence_virtd -c

At the prompts use the following values:

    accept the default search path
    accept multicast as the default listener
    accept the default multicast address
    accept the default multicast port
    set the interface to br0 (replace the bridge name with the one configured on your hosts)
    accept the default fence_xvm.key path
    set the backend module to libvirt
    accept the default URI
    enter "y" to write the config

This creates an /etc/fence_virt.conf file with the following content:

fence_virtd {
    module_path = "/usr/lib64/fence-virt";
    listener = "multicast";
    backend = "libvirt";
}

listeners {
    multicast {
        key_file = "/etc/cluster/fence_xvm.key";
        interface = "br0";
        port = "1229";
        address = "225.0.0.12";
        family = "ipv4";
    }
}

backends {
    libvirt {
        uri = "qemu:///system";
    }

}

The assumption here is that br0 is the bridge used for communication between the host and the VMs. If you plan to use a different bridge, update it in the file above.

Change the key_file = "/etc/cluster/fence_xvm.key" line on each host as per step 3 above.
For example, on host1, it would be

listeners {
    multicast {
        key_file = "/etc/cluster/fence_xvm_host1.key";

and so on...
NOTE: The default multicast address (225.0.0.12) is kept for one of the hosts in the file above. This can be changed from the default, and it is recommended that each host use a different multicast address.
Change the multicast address for one KVM host as in the example below:

 address = "225.0.0.12"

to

 address = "225.0.$i.12"
where,
      replace $i with a value based on the subnet being used

Also, change the multicast address for the other KVM host as in the example below:

 address = "225.0.0.12"

to

 address = "224.0.$i.1"
where,
      replace $i with a value based on the subnet being used
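
The two per-host edits (key_file from step 3 and the multicast address above) can be applied with sed. A sketch that works on a throwaway copy of the listeners section, assuming host1 and a chosen address of 225.0.1.12 (pick your own per-host values); on a real host, back up and edit /etc/fence_virt.conf directly:

```shell
# Work on a throwaway copy for illustration
conf=$(mktemp)
cat > "$conf" <<'EOF'
listeners {
    multicast {
        key_file = "/etc/cluster/fence_xvm.key";
        interface = "br0";
        port = "1229";
        address = "225.0.0.12";
        family = "ipv4";
    }
}
EOF

# Point key_file at this host's renamed key (step 3)
sed -i 's|fence_xvm\.key|fence_xvm_host1.key|' "$conf"
# Give this host its own multicast address (example value)
sed -i 's|225\.0\.0\.12|225.0.1.12|' "$conf"

grep -E 'key_file|address' "$conf"
```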

6. Start the fence_virtd daemon on all the hosts

 # service fence_virtd start                     <------------- RHEL6
 # systemctl start fence_virtd.service           <------------- RHEL7/8

To have the service start automatically across reboots

 # chkconfig fence_virtd on                      <------------- RHEL6
 # systemctl enable fence_virtd.service          <------------- RHEL7/8

7. On virtual machine cluster nodes, install fence-virt package

 # yum install fence-virt

8. Get the list of VMs and the status of each node, from the other nodes and from the KVM hosts:
NOTE: If the fence_xvm command is unable to list VMs (running on separate KVM hosts) and reports a "Timed out waiting for response" message, refer to the solution linked for that error.

 # fence_xvm -a 225.0.$i.12 -k /etc/cluster/fence_xvm_host1.key -o list
 # fence_xvm -a 224.0.$i.1 -k /etc/cluster/fence_xvm_host2.key -o list

 # fence_xvm -a 225.0.$i.12 -k /etc/cluster/fence_xvm_host1.key -H guest$x -o status
 # fence_xvm -a 224.0.$i.1 -k /etc/cluster/fence_xvm_host2.key -H guest$y -o status
where,
      replace $i with a value based on the subnet being used
      replace $x with the name of the first VM
      replace $y with the name of the second VM

9. To test fencing, use the example commands below.

From guest VM 'y', run
 # fence_xvm -o reboot -a 225.0.$i.12 -k /etc/cluster/fence_xvm_host1.key -H guest$x

From guest VM 'x', run
 # fence_xvm -o reboot -a 224.0.$i.1 -k /etc/cluster/fence_xvm_host2.key -H guest$y

The targeted guest VM should reboot in each case.
10. The "domain" name is the name of the virtual machine in KVM, not the hostname or DNS domain name of the cluster node. While they may happen to be the same, the distinction is important: if your machine is named differently in KVM from its hostname and you attempt to use the hostname in fencing, fencing will not work.

"virsh list" command can be used to list the virtual machine names:

[on host1]# virsh list 
 Id Name                 State 
---------------------------------- 
 1  node1               running 

11. Assuming the node rebooted immediately in the test above, proceed with configuring the fencing devices

 # pcs stonith create xvmfence1 fence_xvm pcmk_host_list="node$x" multicast_address="225.0.$i.12" key_file="/etc/cluster/fence_xvm_host1.key"

 # pcs stonith create xvmfence2 fence_xvm pcmk_host_list="node$y" multicast_address="224.0.$i.1" key_file="/etc/cluster/fence_xvm_host2.key"
where,
      replace $i with a value based on the subnet being used
      replace $x with the pacemaker node name of the first VM
      replace $y with the pacemaker node name of the second VM
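
If the pacemaker node name differs from the KVM domain name (see step 10), the standard pcmk_host_map fencing parameter can map one to the other. A sketch with hypothetical names, where node1 is the pacemaker node name and vm-node1 is the virsh domain name:

```shell
# pcmk_host_map entries are "pacemaker-node-name:kvm-domain-name";
# node1 and vm-node1 are placeholder names -- substitute your own
pcs stonith create xvmfence1 fence_xvm \
    pcmk_host_map="node1:vm-node1" \
    multicast_address="225.0.1.12" \
    key_file="/etc/cluster/fence_xvm_host1.key"
```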

12. Before testing the fencing configuration, check whether the cluster property stonith-enabled is set to true.

Run below command from any one cluster node
# pcs property list --all | grep "stonith-enabled"

If this cluster property is not set to true, set it using the command below:

Run below command from any one cluster node
# pcs property set stonith-enabled=true

Then, test the fencing configuration

# fence_node <node_name>
 or
# pcs stonith fence <node_name>
 or
# stonith_admin --reboot <node_name>

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.