How can I avoid a loss of quorum on remaining nodes in my cluster when I stop cman on one or more nodes in RHEL 6?
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add On
- NOTE: The desired behavior described in this solution is achieved automatically in RHEL 5 when issuing service cman stop, so nothing more is required.
- A cluster containing more than two nodes
Issue
- If I stop cluster services on several nodes in the cluster, the remaining nodes lose quorum because they don't have enough votes left. How can I stop those nodes but allow the remaining nodes to continue operating?
- Is there a way to tell cman to decrease quorum for the remaining nodes when one node leaves "gracefully"?
- How can I stop a node to perform maintenance on it or reboot without causing a loss of quorum on other nodes?
Resolution
When stopping the cluster daemons on a node that is leaving the cluster, stop cman with the following command:
# service cman stop remove
This will cause the other nodes to automatically adjust their quorum value downwards by the number of votes this node represents, so that they do not lose quorum.
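To confirm that the remaining nodes adjusted their quorum, the vote counters reported by cman_tool status can be inspected on one of them. The helper below is a minimal sketch: the function name quorum_summary is hypothetical, and the field labels it matches (Nodes, Expected votes, Total votes, Quorum) are assumed to match what cman_tool status prints on your release, so adjust the pattern if your output differs.

```shell
# Hypothetical helper: keep only the quorum-related lines from
# `cman_tool status` output supplied on standard input.
# Assumes the labels "Nodes", "Expected votes", "Total votes" and "Quorum".
quorum_summary() {
    grep -E '^(Nodes|Expected votes|Total votes|Quorum):'
}

# Usage on a node that remains in the cluster (requires a running cman):
#   cman_tool status | quorum_summary
```

After a node leaves with 'stop remove', the expected votes and quorum values shown on the remaining nodes should be lower than before the node left.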
Root Cause
When a node's cman service is stopped, that node leaves the membership, which leaves the remaining members of the cluster with fewer votes. If the loss of that node's votes would cause the other nodes to lose quorum, then those nodes will stop functioning until enough votes are present to regain quorum.
For example, in a 4-node cluster where each node has one vote, if maintenance were to be performed on nodes 3 and 4, you might stop all services, including cman, on both. When cman is stopped on node 3, nodes 1, 2, and 4 would still have 3 votes, which is enough to maintain quorum because quorum requires 3 votes. When node 4 then stops cman, nodes 1 and 2 would have only 2 votes, which is not enough for quorum.
This behavior is often undesirable in situations where nodes are being stopped for maintenance purposes, but the remaining nodes still need to be functional. As such, the cman init script offers a method to have a node leave the cluster, but to also have the remaining nodes adjust their quorum count downwards to account for the leaving node. This is done by issuing the command 'stop remove' to the cman init script. Those nodes left in the cluster will recalculate quorum to not incorporate the leaving node's votes, which should enable them to remain quorate.
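The arithmetic behind the 4-node example can be sketched as follows. This is an illustration only, not cman's own code: it assumes one vote per node and the usual majority rule of (expected votes / 2) + 1 using integer division, which matches the 3-vote quorum quoted above for four nodes. The function names quorum_for and check_quorate are hypothetical.

```shell
#!/bin/bash
# Majority quorum for a given number of expected votes.
# Assumed rule: integer (expected_votes / 2) + 1, one vote per node.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# Report whether the surviving nodes stay quorate.
#   $1 = expected votes the survivors use for the calculation
#   $2 = votes still present in the cluster
check_quorate() {
    local expected=$1 present=$2
    local needed
    needed=$(quorum_for "$expected")
    if [ "$present" -ge "$needed" ]; then
        echo "expected=$expected present=$present quorum=$needed -> quorate"
    else
        echo "expected=$expected present=$present quorum=$needed -> NOT quorate"
    fi
}

# Nodes 3 and 4 stopped with plain "service cman stop":
# expected votes stay at 4, but only 2 votes remain.
check_quorate 4 2

# Nodes 3 and 4 stopped with "service cman stop remove":
# expected votes drop to 2 along with the departing votes.
check_quorate 2 2
```

In the first case the survivors still need 3 votes and hold only 2, so they lose quorum. In the second case the requirement falls to 2 votes, so nodes 1 and 2 remain quorate.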