Diagnostic Procedures for RHEL High Availability Clusters - Measuring Inter-Node Network Latency

Applicable Environments

  • Red Hat Enterprise Linux (RHEL) 5 to 9 with the High Availability Add-On

Procedures

Determine Node Names or Addresses

Consult the cluster configuration to determine how the membership list is defined and collect the names or addresses of nodes as they exist in the configuration.

  • In RHEL 7 to 9, look in /etc/corosync/corosync.conf for the ringX_addr definitions within each node block; these define each node.
    • In some environments, the cluster membership may instead be defined more loosely by an interface block, in which case bindnetaddr defines the network or interface this node communicates over. The address of each node would then need to be determined individually on those hosts.
  • In RHEL 5 or 6, look in /etc/cluster/cluster.conf for <clusternode name/> definitions. Optionally, there may also be <altname name/> definitions.
  • In any release, if multiple names or addresses exist for each node - as in RRP or KNET configurations - these tests should be performed for all given addresses.
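As a sketch, the ringX_addr values can be pulled out of a RHEL 7 to 9 nodelist with a short awk command. The corosync.conf fragment and node names below are illustrative sample data; on a real node, point the command at /etc/corosync/corosync.conf instead.

```shell
# Sample nodelist in corosync.conf format (illustrative data only).
cat > /tmp/sample-corosync.conf <<'EOF'
nodelist {
    node {
        ring0_addr: node1.example.com
        nodeid: 1
    }
    node {
        ring0_addr: node2.example.com
        nodeid: 2
    }
}
EOF

# Print every ringX_addr value, one per line - these are the names or
# addresses to use, unaltered, in the latency tests below.
awk -F': *' '/ring[0-9]+_addr/ { print $2 }' /tmp/sample-corosync.conf
```

The same approach extends to RRP or KNET configurations, where each node block carries additional ring1_addr (and higher) entries that the pattern also matches.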

Measure Direct Node-to-Node Communication Latency

Use ping from every node to every other node in the cluster to measure average round-trip communication latency. Use the names or addresses identified in the configuration in the previous section, exactly as they are specified there - do not shorten names, translate names to addresses manually, or otherwise alter the way the nodes are identified.

For example, in a two-node cluster:

[root@node1]# ping -c 100 node2.example.com
PING node2.example.com (192.168.2.172) 56(84) bytes of data.
64 bytes from node2.example.com (192.168.2.172): icmp_seq=1 ttl=64 time=0.703 ms
[...]
64 bytes from node2.example.com (192.168.2.172): icmp_seq=100 ttl=64 time=0.488 ms

--- node2.example.com ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 90007ms
rtt min/avg/max/mdev = 0.488/0.652/0.785/0.089 ms
[root@node2]# ping -c 100 node1.example.com
PING node1.example.com (192.168.2.171) 56(84) bytes of data.
64 bytes from node1.example.com (192.168.2.171): icmp_seq=1 ttl=64 time=0.674 ms
[...]
64 bytes from node1.example.com (192.168.2.171): icmp_seq=100 ttl=64 time=0.687 ms

--- node1.example.com ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 90004ms
rtt min/avg/max/mdev = 0.560/0.647/0.687/0.053 ms

The values to consider are the rtt avg and rtt max. If either falls outside the applicable limit, corrective measures may be required to reduce the latency, or a different communication network may need to be chosen.
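When checking many node pairs, the avg and max fields can be extracted from the ping summary line programmatically. The rtt line and the 2 ms limit below are illustrative placeholders, not official Red Hat values; substitute the limit that applies to your environment, and capture the real rtt line from the ping output.

```shell
# Sample rtt summary line as printed by ping (illustrative data).
rtt_line='rtt min/avg/max/mdev = 0.488/0.652/0.785/0.089 ms'
# Hypothetical latency limit in ms - replace with your applicable limit.
limit_ms=2.0

# Fields split on spaces and slashes: avg is field 8, max is field 9.
avg=$(echo "$rtt_line" | awk -F'[ /]' '{ print $8 }')
max=$(echo "$rtt_line" | awk -F'[ /]' '{ print $9 }')

# Flag each value against the limit.
for v in "$avg" "$max"; do
    if awk -v v="$v" -v lim="$limit_ms" 'BEGIN { exit !(v+0 <= lim+0) }'; then
        echo "$v ms: OK"
    else
        echo "$v ms: over limit"
    fi
done
```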


Measure Message Latency

Clusters using the udp transport protocol - the default protocol in RHEL 5 and 6 clusters - transmit messages using either multicast or broadcast facilities. Latency limits apply to these communications as well.

NOTE: This guidance does not apply to environments using the udpu transport protocol: in such clusters, messages are transmitted directly between nodes, so the previous tests apply equally to message traffic.
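To confirm which transport a cluster is configured with, the totem block of the configuration can be inspected. The inline sample below is illustrative data; on a real RHEL 7 to 9 node, point the command at /etc/corosync/corosync.conf. If no transport key is present, corosync falls back to its version-specific default.

```shell
# Sample totem block in corosync.conf format (illustrative data only).
cat > /tmp/sample-totem.conf <<'EOF'
totem {
    version: 2
    cluster_name: mycluster
    transport: udpu
}
EOF

# Extract the configured transport, if any.
transport=$(awk -F': *' '/^[[:space:]]*transport:/ { print $2 }' /tmp/sample-totem.conf)
echo "configured transport: ${transport:-default}"
```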

Use omping to test messaging latency, with either multicast or broadcast depending on the configuration of the cluster.

Using omping to test multicast latency

To test multicast latency, execute the same omping command from all nodes at the same time and let it run to completion. The general syntax is:

# omping -m <multicast address> -p <port> -c <count> <node address> <node address> [...]

Notes:

  • To ensure accurate results, run this test against the specific multicast address that the cluster will use. This address is either defined in the configuration - /etc/corosync/corosync.conf in RHEL 7, /etc/cluster/cluster.conf in RHEL 5 or 6 - or is auto-configured by the cluster otherwise. In the latter case, check corosync-cmapctl for totem.interface.X.mcastaddr in RHEL 7, or check cman_tool status in RHEL 6, to determine the address.

  • If specifying the same port that the cluster uses - 5405 - the cluster cannot be running during this test. Another port can be specified instead if needed. In either case, that port must not be blocked by any firewall.

  • As in the previous test, the node names should be specified exactly as they occur in the configuration.

  • In RHEL 5, omping is only available from the EPEL repositories; it is not available from Red Hat-provided RHEL repositories.
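For the auto-configured case, the address can be recovered by filtering corosync-cmapctl output for the mcastaddr key. The line below is sample data in the "key (type) = value" shape that corosync-cmapctl prints; on a live RHEL 7 cluster you would filter the real output, for example with corosync-cmapctl | grep mcastaddr.

```shell
# Sample corosync-cmapctl output line (illustrative data only).
cmap_line='totem.interface.0.mcastaddr (str) = 239.192.77.2'

# The multicast address is the last field of the line.
mcast=$(echo "$cmap_line" | awk '{ print $NF }')
echo "multicast address: $mcast"
```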

For example, in a two-node cluster:

[root@node1 ~]# omping -m 239.192.77.2 -p 5405 -c 100 node1.example.com node2.example.com
node2.example.com : waiting for response msg
node2.example.com : joined (S,G) = (*, 239.192.77.2), pinging
node2.example.com :   unicast, seq=1, size=69 bytes, dist=0, time=0.706ms
node2.example.com : multicast, seq=1, size=69 bytes, dist=0, time=0.735ms
node2.example.com :   unicast, seq=2, size=69 bytes, dist=0, time=0.719ms
node2.example.com : multicast, seq=2, size=69 bytes, dist=0, time=0.849ms
[...]
node2.example.com : given amount of query messages was sent

node2.example.com :   unicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.399/0.719/1.316/0.093
node2.example.com : multicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.500/0.758/1.350/0.086
[root@node2 ~]# omping -m 239.192.77.2 -p 5405 -c 100 node1.example.com node2.example.com
node1.example.com : waiting for response msg
node1.example.com : joined (S,G) = (*, 239.192.77.2), pinging
node1.example.com :   unicast, seq=1, size=69 bytes, dist=0, time=0.754ms
node1.example.com : multicast, seq=1, size=69 bytes, dist=0, time=0.833ms
node1.example.com :   unicast, seq=2, size=69 bytes, dist=0, time=0.713ms
node1.example.com : multicast, seq=2, size=69 bytes, dist=0, time=0.849ms
[...]
node1.example.com : given amount of query messages was sent

node1.example.com :   unicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.555/0.772/1.491/0.104
node1.example.com : multicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.650/0.821/1.523/0.101

In this test, both the unicast and multicast avg and max values indicate the latency of interest. If any of these values exceeds the relevant limit, corrective action or a change in the cluster architecture may be needed.
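Across repeated runs, the avg and max values can be pulled out of omping's summary lines with awk. The line below is sample data in the same format as the output above; the field positions follow that min/avg/max/std-dev layout.

```shell
# Sample omping summary line (illustrative data, same format as above).
line='node2.example.com : multicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.500/0.758/1.350/0.086'

# The last field holds min/avg/max/std-dev; split it on "/".
avg=$(echo "$line" | awk '{ split($NF, t, "/"); print t[2] }')
max=$(echo "$line" | awk '{ split($NF, t, "/"); print t[3] }')
echo "avg=${avg}ms max=${max}ms"
```

The same extraction works for the unicast and broadcast summary lines, which share this format.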

Using omping to test broadcast latency

To test broadcast latency, execute the same omping command from all nodes at the same time and let it run to completion. The general syntax is:

# omping -M ipbc -p <port> -c <count> <node address> <node address> [...]

Notes:

  • If specifying the same port that the cluster uses - 5405 - the cluster cannot be running during this test. Another port can be specified instead if needed. In either case, that port must not be blocked by any firewall.

  • As in the previous test, the node names should be specified exactly as they occur in the configuration.

  • In RHEL 5, omping is only available from the EPEL repositories; it is not available from Red Hat-provided RHEL repositories.

[root@node1]# omping -M ipbc -p 5405 -c 100 node1.example.com node2.example.com
node2.example.com : waiting for response msg
node2.example.com : joined (S,G) = (*, 192.168.2.255), pinging
node2.example.com :   unicast, seq=1, size=69 bytes, dist=0, time=0.695ms
node2.example.com : broadcast, seq=1, size=69 bytes, dist=0, time=0.817ms
node2.example.com :   unicast, seq=2, size=69 bytes, dist=0, time=0.782ms
node2.example.com : broadcast, seq=2, size=69 bytes, dist=0, time=0.748ms
[...]
node2.example.com : given amount of query messages was sent

node2.example.com :   unicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.399/0.719/1.316/0.093
node2.example.com : broadcast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.500/0.758/1.350/0.086
[root@node2 ~]# omping -M ipbc -p 5405 -c 100 node1.example.com node2.example.com
node1.example.com : waiting for response msg
node1.example.com : joined (S,G) = (*, 192.168.2.255), pinging
node1.example.com :   unicast, seq=1, size=69 bytes, dist=0, time=0.661ms
node1.example.com : broadcast, seq=1, size=69 bytes, dist=0, time=0.722ms
node1.example.com :   unicast, seq=2, size=69 bytes, dist=0, time=0.655ms
node1.example.com : broadcast, seq=2, size=69 bytes, dist=0, time=0.696ms
[...]
node1.example.com : given amount of query messages was sent

node1.example.com :   unicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.555/0.772/1.491/0.104
node1.example.com : broadcast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.650/0.821/1.523/0.101

In this test, both the unicast and broadcast avg and max values indicate the latency of interest. If any of these values exceeds the relevant limit, corrective action or a change in the cluster architecture may be needed.
