Stonith device fence_cisco_ucs fails while communicating with UCS Blade
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
- pacemaker
- Cisco UCS Blade
Issue
- Stonith device fence_cisco_ucs agent fails during testing.
- pcs status shows "unknown error".
- The command pcs stonith fence <nodename> shows "Command failed: No route to host". When fence_cisco_ucs is tested manually, the actions status and list work successfully.
Resolution
Ensure the following procedure has been performed in UCS Manager:
- Log in to UCS Manager as a user that has Admin privileges.
- Click on the admin tab and select user management from the drop-down menu.
- Expand the user service option in the left column and click on local authenticated users.
- Select the user you created for UCS fencing.
- In the general tab for the user, make sure the user has the following roles assigned:
- Admin
- Server-equipment
Then manually test that the fence_cisco_ucs fencing agent can perform the following actions: status, list, on, off, reboot.
After making the changes described in this "Resolution" section, verify that they work by fencing the opposite node using two different methods. First, call the fence agent directly:
# fence_cisco_ucs --ip="X.X.X.X" --username="<username>" --passwd="<password>" -z 1 --plug="UCSPROFILE2" --suborg="/org-RHEL/" -o reboot -vvv
If those actions are successful, verify that the fencing agent is properly configured in pacemaker, then manually call stonith from a cluster node to see whether the target node is successfully fenced:
# pcs stonith fence <nodename>
Root Cause
The user configured to perform fencing on the UCS fence device did not have the correct privileges.
Diagnostic Steps
The command pcs status displays the below errors in "Failed Actions":
Failed Actions:
* fence_ucs_start_0 on <nodename> 'unknown error' (1): call=38, status=Error, exitreason='none',
last-rc-change='Wed Dec 7 15:34:00 2016', queued=0ms, exec=1098ms
* fence_ucs_start_0 on <nodename> 'unknown error' (1): call=102, status=Error, exitreason='none',
last-rc-change='Wed Dec 7 15:33:58 2016', queued=0ms, exec=1093ms
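On a cluster with many resources it can help to reduce this output to just the failed operation and the node it failed on. The following is a minimal sketch, assuming the "Failed Actions" layout shown above; the function name list_failed_actions is hypothetical and not part of pcs.

```shell
# Hypothetical helper (not part of pcs): read "pcs status" output on stdin
# and print "<operation> <node>" for every entry under "Failed Actions:".
# Assumes each entry starts with "* <operation> on <node> ..." as shown above.
list_failed_actions() {
    awk '
        /^Failed Actions:/ { in_section = 1; next }
        # Entry lines start with "*"; continuation lines (last-rc-change=...) do not.
        in_section && /^[[:space:]]*\*/ { print $2, $4 }
    '
}
```

It would be used as pcs status | list_failed_actions. A fence resource that repeatedly appears with a _start_0 operation suggests the agent cannot complete its start-time check against the fence device.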
An attempt to fence a node prints the following error: "No route to host"
# pcs stonith fence <nodename>
Error: unable to fence <nodename>
Command failed: No route to host
Manually test the fencing agent fence_cisco_ucs to verify that the following actions work: status, list, on, off, reboot. If they do, the problem is likely a configuration issue within pacemaker. If they do not, the problem is likely a configuration issue on the fence device or in one of the agent's parameters.
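The manual test can be repeated for each action with a small wrapper. This is only a sketch: the function name check_fence_actions and the UCS_* variables are hypothetical, the default values are the placeholders from the example command in this article, and the options are copied from that command, so they may need adjusting for the installed fence-agents version.

```shell
# Hypothetical wrapper around the agent call shown in the Resolution section.
# All defaults are placeholders; replace them with real values before use.
FENCE_AGENT="${FENCE_AGENT:-fence_cisco_ucs}"
UCS_IP="${UCS_IP:-X.X.X.X}"
UCS_USER="${UCS_USER:-<username>}"
UCS_PASS="${UCS_PASS:-<password>}"
UCS_PLUG="${UCS_PLUG:-UCSPROFILE2}"
UCS_SUBORG="${UCS_SUBORG:-/org-RHEL/}"

# Run each action passed as an argument and report OK or FAILED per action.
# Returns non-zero if at least one action failed.
check_fence_actions() {
    failed=0
    for action in "$@"; do
        if "$FENCE_AGENT" --ip="$UCS_IP" --username="$UCS_USER" \
               --passwd="$UCS_PASS" -z 1 --plug="$UCS_PLUG" \
               --suborg="$UCS_SUBORG" -o "$action" >/dev/null 2>&1; then
            echo "$action: OK"
        else
            echo "$action: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling check_fence_actions status list exercises only the non-disruptive actions. The on, off and reboot actions change the power state of the blade, so add them only when the target node can safely go down. If status and list succeed but the power actions fail, that matches the missing-privileges cause described in this article.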