Why is UDP unicast (UDPU) not recommended for use in large clusters with GFS2 in a RHEL 6 or 7 Resilient Storage cluster?
Environment
- Red Hat Enterprise Linux (RHEL) 6 Update 2 or higher or RHEL 7 with the Resilient Storage Add-on
- GFS2
- UDPU transport in use
  - RHEL 6: `<cman transport="udpu"/>` in `/etc/cluster/cluster.conf`
  - RHEL 7: `totem { transport: udpu }` in `/etc/corosync/corosync.conf`
Issue
- IP multicast is disabled in my environment, but I want to create a GFS2 cluster.
- I want to use UDPU because it is an available alternative, but I have read that it is not recommended in a cluster that uses GFS2.
Root Cause
As of the Red Hat Enterprise Linux 6 Update 2 release and the release of RHEL 7, Red Hat High-Availability Add-On nodes can communicate with each other using the UDP Unicast transport mechanism. It is recommended, however, that you use IP multicast for the cluster network. UDP unicast is an alternative that can be used when IP multicast is not available.
The recommendation to avoid UDPU applies especially when using GFS2. UDPU sends each group message to every node separately, rather than sending a single packet to the multicast address and letting the network infrastructure deliver it to subscribed members. This method has higher overhead, and thus does not scale as well as multicast with very large numbers of messages. Because GFS2-based workloads can often generate these increased levels of communication, they are prone to worse performance over UDP unicast than over UDP multicast. As such, Red Hat does not recommend using UDPU with GFS2-based workloads on larger clusters.
The performance penalty of UDPU can also be expected to grow as the cluster gets larger.
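To make the scaling argument concrete, the following is a minimal sketch that counts how many datagrams a sender emits to deliver one group message under each transport. It is a simplified illustration, not a model of corosync itself: it ignores totem token passing, acknowledgments, retransmits, and fragmentation, and the function name `datagrams_per_message` is hypothetical. The transport names `udpu` (unicast) and `udp` (multicast) follow the values used in the cluster configuration.

```python
def datagrams_per_message(node_count: int, transport: str) -> int:
    """Return how many datagrams the sender emits for one group message.

    Simplified assumption: unicast sends one copy per peer, while
    multicast sends a single datagram that the network replicates.
    """
    if node_count < 2:
        raise ValueError("a cluster needs at least two nodes")
    if transport == "udpu":
        # One separate copy for every other node in the cluster.
        return node_count - 1
    if transport == "udp":
        # One datagram to the multicast address, regardless of cluster size.
        return 1
    raise ValueError(f"unknown transport: {transport!r}")


# Compare sender-side cost per message for a few cluster sizes.
for nodes in (2, 4, 8, 16):
    unicast = datagrams_per_message(nodes, "udpu")
    multicast = datagrams_per_message(nodes, "udp")
    print(f"{nodes:2d} nodes: udpu={unicast:2d} datagrams, udp={multicast} datagram")
```

Under these assumptions the sender-side cost of unicast grows linearly with the number of nodes, while multicast stays constant, which is why the difference matters most for large clusters with heavy group messaging.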
NOTE: In some cases, workloads other than GFS2 that generate large amounts of group-messaging traffic, such as those that use cmirror or messaging applications, could suffer similar degradation when using UDPU. It is recommended that each deployment with UDPU be thoroughly evaluated and tested to determine which transport is optimal.
NOTE: In practice, using multicast instead of unicast is unlikely to make much of a difference except in larger clusters where plock use is heavy. GFS2 network traffic is nearly all DLM, which uses TCP and is independent of corosync's transport protocol. If plocks (POSIX locks) are being used, they go over CPG and can be slowed down if there are a lot of them and unicast is used. cmirror also uses CPG heavily and may perform better with multicast, but multicast has not been compared against unicast for that situation.