What is the best bonding or teaming mode for TCP traffic such as NFS, ISCSI, CIFS, etc?


Environment

  • Red Hat Enterprise Linux (all versions)
  • Bonding or Teaming
  • Large streaming TCP traffic such as NFS, Samba/CIFS, iSCSI, rsync over SSH/SCP, and backups

Issue

  • What is the best bonding mode for TCP traffic such as NFS and Samba/CIFS?
  • NFS repeatedly logs nfs: server not responding, still trying when no network issue is present
  • A packet capture shows many TCP retransmissions, TCP out-of-order segments, and RPC retransmissions when there is no apparent reason for them

Resolution

If using bonding, use a mode which guarantees in-order delivery of TCP traffic, such as:

  • Bonding Mode 1 (active-backup)
  • Bonding Mode 2 (balance-xor)
  • Bonding Mode 4 (802.3ad aka LACP)
  • Bonding Mode 5 (balance-tlb) with tlb_dynamic_lb=0
  • Bonding Mode 6 (balance-alb) with tlb_dynamic_lb=0
    • Note: tlb_dynamic_lb is only available in RHEL 7.2 and later
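As a sketch of how one of these modes might be configured with NetworkManager, the following creates an active-backup (Mode 1) bond. The connection and interface names (bond0, eth0, eth1) are examples only; substitute your own:

```shell
# Create an active-backup (mode 1) bond; no switch configuration is required.
nmcli connection add type bond ifname bond0 con-name bond0 \
    bond.options "mode=active-backup,miimon=100"

# Enslave two example interfaces to the bond.
nmcli connection add type ethernet ifname eth0 con-name bond0-port1 master bond0
nmcli connection add type ethernet ifname eth1 con-name bond0-port2 master bond0

# For mode 5 or 6 on RHEL 7.2 and later, disable dynamic load balancing instead:
#   bond.options "mode=balance-tlb,miimon=100,tlb_dynamic_lb=0"
```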

If using teaming, use a runner which guarantees in-order delivery of TCP traffic such as:

  • activebackup
  • loadbalance
  • lacp
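Similarly, a team using one of these runners might be created as follows. The names (team0, eth0, eth1) are illustrative:

```shell
# Create a team using the activebackup runner (no switch configuration needed).
nmcli connection add type team ifname team0 con-name team0 \
    team.config '{"runner": {"name": "activebackup"}}'

# Add two example ports to the team.
nmcli connection add type ethernet ifname eth0 con-name team0-port1 master team0
nmcli connection add type ethernet ifname eth1 con-name team0-port2 master team0

# For lacp or loadbalance, change the runner name and configure the switch side:
#   team.config '{"runner": {"name": "lacp"}}'
```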

Note that Bonding Mode 2 (balance-xor) and the Teaming loadbalance runner require an EtherChannel or similar static link aggregation configured on the switch. Bonding Mode 4 (802.3ad) and the Teaming lacp runner require an EtherChannel with LACP on the switch. Bonding Mode 1 (active-backup) and the Teaming activebackup runner require no switch configuration.

Bonding Modes 5 (balance-tlb) and 6 (balance-alb) do not require switch configuration. Mode 5 balances transmitted traffic only; incoming traffic is not balanced back into the bond. Mode 6 additionally balances received traffic by intercepting ARP replies, so it may not be suitable for all situations, such as where traffic mostly passes through a default gateway.

For advice on configuring bonding, refer to How do I configure a bonding device on Red Hat Enterprise Linux (RHEL)?

For advice on configuring teaming, refer to RHEL 7 Networking Guide - 8.13. Configure teamd Runners

For advice on picking a specific hash policy for your traffic, refer to Why are all interfaces not used in bonding Mode 2 or Mode 4?
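As an example of selecting a hash policy, the following modifies a hypothetical 802.3ad bond (bond0) to hash on layer 3 and 4 headers, so that different TCP streams can be spread across slaves while any single stream still stays on one slave:

```shell
# Example only: use a layer3+4 transmit hash so separate TCP streams
# can use separate slaves; a single stream always stays on one slave.
nmcli connection modify bond0 bond.options \
    "mode=802.3ad,miimon=100,xmit_hash_policy=layer3+4"
nmcli connection up bond0
```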

Root Cause

The following bonding modes:

  • Bonding Mode 0 (round-robin)
  • Bonding Mode 3 (broadcast)
  • Bonding Mode 5 (balance-tlb) with tlb_dynamic_lb=1
  • Bonding Mode 6 (balance-alb) with tlb_dynamic_lb=1
    • Note: tlb_dynamic_lb=1 is also the default behaviour before RHEL 7.2

And the following teaming runners:

  • random
  • roundrobin
  • broadcast

Do not guarantee in-order delivery of TCP streams, as each packet of a stream may be transmitted down a different slave, and no switch guarantees that packets received on different switchports will be forwarded in their original order.

Given the following example configuration:

.---------------------------.
| bond mode 0 (round-robin) |
'---------------------------'
| eth0 | eth1 | eth2 | eth3 |
'--=---'--=---'---=--'---=--'
   |      |       |      |
   |      |       |      |
.--=------=-------=------=--.
|          switch           |
'---------------------------'

The bond system may send traffic out each slave in a correct order, like ABCD ABCD ABCD, but the switch may forward this traffic in any random order, like CADB BDCA DACB.

Because TCP on the receiver expects the stream to arrive in order, the receiver believes packets have been lost and requests retransmissions, spends a great deal of time reassembling out-of-order traffic into the correct order, and the sender wastes bandwidth sending retransmissions which are not actually required.

The "good" bonding modes and teaming runners listed in the Resolution section avoid this issue by transmitting all traffic for a given destination down a single slave. Bonding's load-balancing algorithm can be altered with the xmit_hash_policy bonding option, and Teaming's with the tx_hash option, but neither will ever balance a single TCP stream across different ports, so both avoid the problematic re-ordering behaviour discussed above.
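For teaming, the transmit hash can be expressed in the teamd configuration. The following sketch, with an illustrative file path and device name, configures the loadbalance runner to hash on layer 3 and layer 4 headers:

```shell
# Sketch of a teamd loadbalance runner hashing on L3 and L4 headers;
# /etc/teamd/team0.conf and team0 are example names.
cat > /etc/teamd/team0.conf <<'EOF'
{
    "device": "team0",
    "runner": {
        "name": "loadbalance",
        "tx_hash": ["l3", "l4"]
    }
}
EOF
```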

It is not possible to effectively balance a single TCP stream across multiple bonding or teaming devices. If higher speed is required for a single stream, then faster interfaces (and possibly faster network infrastructure) must be used.

This theory applies to all TCP and UDP streams. The most common occurrences of this issue are seen on high-speed, long-lived TCP streams such as NFS, Samba/CIFS, iSCSI, rsync over SSH/SCP, and so on. The theory can also apply to SCTP when the Teaming l4 hash parameter is used.

Diagnostic Steps

Inspect syslog for nfs: server X not responding, still trying and nfs: server X OK messages when there are no other network issues.

Inspect a packet capture for many occurrences of TCP retransmission, TCP Out-of-Order, RPC retransmission, or other similar messages.

Inspect bonding mode in /proc/net/bonding/bondX or teaming mode in teamdctl teamX state.
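The steps above might be carried out with commands along these lines (bond0, team0, and capture.pcap are example names):

```shell
# Check which bonding mode is in use.
grep "Bonding Mode" /proc/net/bonding/bond0

# For teaming, show the active runner and port states.
teamdctl team0 state

# Look for NFS "server not responding" kernel messages.
journalctl -k | grep "not responding"

# Count TCP retransmissions in a capture with tshark, if available.
tshark -r capture.pcap -Y "tcp.analysis.retransmission" | wc -l
```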
