How to configure TCP BBR congestion control algorithm?

Environment

  • Red Hat Enterprise Linux 8
  • TCP (Transmission Control Protocol)

Issue

  • How to configure TCP BBR congestion control algorithm?
  • How do I use Google's "Bottleneck Bandwidth and Round-trip propagation time" on RHEL 8?

Resolution

System-Wide Default

The TCP Congestion Control Algorithm can be set system-wide using kernel tunables.

To change at runtime, use the command:

# sysctl -w net.ipv4.tcp_congestion_control=bbr

To make the change persistent across reboots, append it to /etc/sysctl.conf:

# echo "net.ipv4.tcp_congestion_control = bbr" >> /etc/sysctl.conf

To verify the current setting, filter the kernel tunables for the congestion control entries:

# sysctl -a | egrep congestion
net.ipv4.tcp_allowed_congestion_control = reno cubic bbr
net.ipv4.tcp_available_congestion_control = reno cubic bbr
net.ipv4.tcp_congestion_control = bbr

A change in the congestion control algorithm only affects new TCP connections.

Existing connections must be re-established for the change to take effect.

Application-Specific Code

To make a congestion control algorithm available for applications to select on demand, the system administrator must ensure the module is loaded:

# modprobe tcp_bbr

Module loading can be made persistent across boots using systemd-modules-load:

# echo "tcp_bbr" >> /etc/modules-load.d/tcp_bbr.conf
# systemctl enable systemd-modules-load

Confirm BBR is loaded and in the allowed CCAs:

# lsmod | grep bbr
tcp_bbr                20480  0

# sysctl net.ipv4.tcp_allowed_congestion_control
net.ipv4.tcp_allowed_congestion_control = reno cubic bbr

An application with a TCP socket can select from the available CCAs using the TCP_CONGESTION socket option.

An example in C might be (TCP_CONGESTION is declared in <netinet/tcp.h>):

    /* Select BBR for this socket; requires tcp_bbr to be loaded and
       listed in net.ipv4.tcp_allowed_congestion_control. */
    const char *cca = "bbr";
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_CONGESTION,
                   cca, strlen(cca)) < 0)
        perror("setsockopt(TCP_CONGESTION)");

Other Considerations

The need to change CCA in the first place

A congestion control algorithm exists to control "congestion": traditional algorithms treat packet loss as the signal of congestion, whereas BBR treats rising latency as the signal.

BBR is designed to perform better over global links with tens or hundreds of milliseconds of latency and possibly too-large router buffers (these too-large buffers are known in the industry as "bufferbloat").

If the network is low-latency and does not suffer any packet loss, then there is likely little or no benefit to changing the CCA as there is no actual "congestion" to control around.

Co-habiting with other TCP CCAs

It is strongly recommended to research and test TCP BBR in production-like workloads, and observe the result on the traffic using BBR and traffic not using BBR.

TCP BBR is likely unsuitable for use on networks with shared TCP traffic. BBR has been shown to significantly reduce the throughput of non-BBR streams with which it shares network bandwidth, and a BBR stream can unfairly overwhelm a stream using CUBIC or another TCP CCA.

Queueing Discipline and Pacing

Some documentation specifies that the fq traffic queueing discipline (qdisc) must be used with TCP BBR, as BBR relies on traffic pacing.

However, Linux has had generic timer-based pacing of all TCP since the patch "tcp: internal implementation for pacing", which was included in upstream Linux v4.13.

RHEL 8's v4.18-based kernel does not require the fq qdisc for pacing, and other qdiscs may result in higher throughput depending on situation.

However, TCP timer-based pacing creates one kernel timer per socket, resulting in more processor timer interrupts; many sockets mean many interrupts.

Benchmarking of BBR should also take CPU and interrupt usage into consideration, not just network throughput.

BBR is not a "go-fast" button

There was significant technical press around TCP BBR at its introduction in 2016, with follow-up usage and research since.

This has led to the unfortunate perception that BBR is better in every situation, with one source even claiming BBR to be "magic".

The reality is significantly more complex. Determining whether TCP BBR benefits a given situation requires deep understanding and testing of many aspects of real workloads.

Root Cause

The default congestion control algorithm in RHEL 8 is cubic, with many others also available for use.

A congestion control algorithm affects only the rate of transmission for a TCP connection. The CCA has no effect on how fast the system receives traffic.

Diagnostic Steps

The ss command with the -i option will list the congestion control algorithm in use by the current TCP connections. In the example below it is bbr:

# ss -tin sport = :22
State    Recv-Q    Send-Q    Local Address:Port    Peer Address:Port
ESTAB    0         0           192.168.0.2:22       192.168.0.1:xxxxx
    bbr ...(other output omitted for clarity)...

The tc qdisc show command can be used to see the current queueing discipline (qdisc) attached to an interface. In the example below, interface net2 has the mq (multi-queue) root qdisc in place, with a child qdisc of fq_codel:

# tc qdisc show
qdisc mq 0: dev net2 root 
qdisc fq_codel 0: dev net2 parent :1

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.