Why are all interfaces not used in bonding Mode 2 or Mode 4?

Environment

  • Red Hat Enterprise Linux
  • Bonding or Teaming
  • Bonding Mode 2 (balance-xor)
  • Bonding Mode 4 (802.3ad) also known as LACP (Link Aggregation Control Protocol)
  • Bonding xmit_hash_policy is NOT layer3+4
  • Teaming loadbalance Runner
  • Teaming lacp Runner

Issue

  • Why are not all interfaces used in bonding mode=2 or mode=4? Why does only one interface carry traffic?
  • In bonding mode 4 (802.3ad LACP), only a specific NIC is used for receive (rx) and transmit (tx), and the rest of the NICs are nearly unused:
eth0 - RX bytes:   58970058 (56.2 MiB)  TX bytes:20342326161 (18.9 GiB)    <-- most TX
eth1 - RX bytes:49296441110 (45.9 GiB)  TX bytes:    1872543 (1.7 MiB)    <-- most RX
eth2 - RX bytes:   66462573 (63.3 MiB)  TX bytes:    2495475 (2.3 MiB)
eth3 - RX bytes:   39914810 (38.0 MiB)  TX bytes:    4014612 (3.8 MiB)
  • Most traffic on Bonding Mode 2 is passed via one interface ethX and nearly nothing is passed via the other interfaces. It is expected that all interfaces would be used so as to provide more throughput.
  • Why does the outgoing network traffic only go to a specific interface in Red Hat Enterprise Linux when using bonding?
  • Traffic doesn't balance correctly across multiple interfaces when using load balancing bonding modes like 802.3ad LACP.
  • Why doesn't the Link Aggregation bonding mode utilize the bandwidth of multiple NICs?
  • I have configured 4x 1Gbps interfaces in a bond, why do I not get 4Gbps in iperf and other bandwidth tests?
  • Bonding performance is not as expected, it goes at the speed of a single interface, not at the speed of multiple interfaces

Resolution

If traffic is a single stream

If the traffic is a single stream, such as NFS, CIFS, iSCSI, or iperf, then that stream will only go as fast as a single interface.

It is not possible to load balance a single stream of traffic over multiple interfaces.

If speed faster than a single interface is required, faster network interfaces and faster network infrastructure must be used.
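To verify aggregate bandwidth across a bond, the test tool must generate multiple distinct flows so the hash policy has something to distribute. A sketch using iperf3 (the server address below is a placeholder, not from this article):

```shell
# Sketch: generate multiple parallel TCP streams so the bonding hash
# policy has distinct flows to distribute. The server address is a
# placeholder for this example.
iperf3 -c 192.168.1.100 -P 8 -t 30   # 8 parallel streams for 30 seconds
```

With xmit_hash_policy=layer3+4, parallel streams even to a single host can spread across interfaces because each stream uses a different source port; with the default layer2 policy, all streams to the same MAC address still hash to one interface.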

If traffic is not a single stream

If traffic is between multiple IP addresses and/or multiple TCP/UDP ports, better load balancing may be obtained by changing the xmit_hash_policy bonding option or the tx_hash Teaming runner parameter.

Add the parameter xmit_hash_policy to the BONDING_OPTS in the bonding configuration file (ifcfg-bondX):

BONDING_OPTS="mode=X miimon=100 xmit_hash_policy=<algorithm_name>"

For example:

BONDING_OPTS="mode=4 miimon=100 xmit_hash_policy=layer2+3"

Restart the network service to apply the changes:

service network restart
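On systems managed by NetworkManager, the same option can be set with nmcli instead of editing ifcfg files. A sketch assuming the bond connection is named bond0:

```shell
# Sketch assuming a NetworkManager connection named bond0:
nmcli connection modify bond0 +bond.options "xmit_hash_policy=layer2+3"
nmcli connection up bond0   # re-activate the connection to apply
```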

Confirm the change in the policy is reflected:

cat /sys/class/net/bond*/bonding/xmit_hash_policy
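For Teaming, the equivalent knob is the tx_hash list in the runner configuration. A sketch assuming a team connection named team0 managed by NetworkManager:

```shell
# Sketch assuming a NetworkManager team connection named team0.
# tx_hash selects which packet fields feed the balancing hash
# (IP addresses plus TCP/UDP ports here, analogous to layer3+4):
nmcli connection modify team0 team.config \
  '{"runner": {"name": "lacp", "tx_hash": ["ipv4", "ipv6", "tcp", "udp"]}}'
nmcli connection up team0
```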

Root Cause

For transmit, outbound from the RHEL system

  • The selection of the transmit interface in Mode 2 and Mode 4 is done using an algorithm determined by the xmit_hash_policy.

  • The default value of this parameter is layer2. This algorithm uses an XOR of hardware MAC addresses to generate the hash. The formula is (source MAC XOR destination MAC) modulo interface count. This algorithm will place all traffic to a particular MAC address on the same interface.

  • If all traffic is generated to a particular MAC address, such as a single host on the same LAN or a default gateway, then the algorithm will choose the same network interface to transmit out of.

  • For better load balancing, the layer2+3 policy can be used. This uses the network layer information (IP address) in the balancing calculation, allowing traffic to multiple hosts beyond a default gateway to be load balanced.

  • There is also a layer3+4 policy in the Linux bonding driver, which uses transport layer information (TCP/UDP port) in the calculation. This algorithm is not compliant with the 802.3ad specification; however, in practice almost all network equipment supports it without problem.

  • If the bonding policy is already layer3+4 and traffic is still unevenly balanced, the traffic is likely a single stream, which cannot be balanced any further.

  • The transmit modes are described further in How are the values for different policies in "xmit_hash_policy" bonding parameter calculated? and in bonding.txt in the kernel-doc package.

  • It is not possible to balance a single stream over multiple interfaces, as nothing in a switch guarantees that traffic arriving on multiple switch ports is delivered in the order it was transmitted. This re-ordering of traffic significantly harms TCP's ability to ramp up the TCP Window for good performance, and will instead cause TCP congestion control to activate, shrinking the TCP Window and resulting in poor performance. Re-ordering also causes UDP traffic to arrive out-of-order at the receiver. If the receiving application does not expect this and/or the application protocol does not implement ordering of data, errors in data may occur.

  • Whilst it may be possible to "brute force" a slightly faster data connection with the Mode 0 (round-robin) or Mode 5 (balance-tlb) or Mode 6 (balance-alb) bonding modes or the round-robin Teaming runner, this will result in variable performance, constant TCP congestion control, risk of UDP data corruption due to re-ordering, and will make a network impossible to troubleshoot using packet captures or by inspecting TCP error statistics. Due to the complete lack of troubleshooting transparency in such a configuration, its usage is strongly recommended against.
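The layer2 interface selection described above can be illustrated with a small calculation. Per the formula, the hash effectively reduces to an XOR of the last octets of the source and destination MAC addresses, modulo the number of slave interfaces (the octet values below are arbitrary examples, not from this article):

```shell
# Illustration of the layer2 hash: (source MAC XOR destination MAC)
# modulo slave count, which for the final octet reduces to:
src_last=0x1a    # last octet of the source MAC (example value)
dst_last=0x01    # last octet of the destination MAC, e.g. a gateway
slaves=4         # number of interfaces in the bond
echo $(( (src_last ^ dst_last) % slaves ))   # prints 3: use interface 3
```

Because the destination MAC is constant for all traffic toward a single peer or gateway, the result is the same index every time, which is why all transmit traffic leaves via one interface.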

For receive, inbound to the RHEL system

  • The sender of any LACP connection decides which interface each frame is transmitted on. This means the Linux system has no control over the load balancing of the traffic it receives.

  • The remote bonding peer (usually a switch) will likely have a similar load balancing policy. Different brands and models of switch have different capabilities. Some can balance by MAC address only, some can balance on IP address, some can balance on other criteria.


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.