Hypervisor active-backup bonding failover results in loss of guest network connectivity

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 5
  • KVM or Xen virtualization
  • Physical interfaces bonded with mode=1 or active-backup on hypervisor
  • Bonding interface in bridge, providing network connectivity to guests

Issue

  • Hypervisor active-backup bonding failover results in loss of guest network connectivity
  • When the cable is disconnected between the switch and the active bonding device of dom0, the domU stops responding to ICMP packets although dom0 continues to respond. Why?
  • Temporary inbound guest communication failure upon bond link failover at the host
  • KVM bond failover results in VM unavailable
  • When a network bond (mode 1) does a failover, as long as no traffic comes from the VM after the failover, the VM is not reachable from outside. To me it seems like a gratuitous ARP is missing after failover.
  • Virtual guests disappear or become unreachable from the LAN after a link down event on the virtual machine host.

Resolution

This is expected behaviour.

Workaround - Different Bonding Mode on hypervisor and switch

Using a bonding mode which requires switch configuration, such as mode=2 (balance-xor) or mode=4 (802.3ad, also known as LACP), will work around this issue.

This is because these modes require the switch to be configured with a "port group" (an EtherChannel or similar). The switch sends traffic for the guest into the port group, and the port group's load balancing handles any link failures.

If multiple separate switches are used, they must support a "virtual port group" spanning all the switches, such as a Virtual Port Channel.
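As a sketch of the host side of this workaround, an 802.3ad bond can be configured through the usual ifcfg files. The device and bridge names below (bond0, br0) are examples and must match the local configuration, and the switch ports must be placed in a matching LACP port group:

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0  (example names)
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100"
BRIDGE=br0    ## the bond is a port of the bridge used by the guests
```

Without the corresponding port group on the switch, mode=4 will not pass traffic correctly, so both sides must be changed together.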

Workaround - Gratuitous ARP within guest

Periodically transmit gratuitous ARP packets from the virtual guests which are bridged to the bond. This updates any switch or external network host that needs to reach the VM. The following example can be started from the guest's rc.local or similar (run it in the background, since the loop never exits):

#!/bin/bash
MYIP="192.168.0.100"    ## change MYIP to the guest's local IP
MYDEV="eth0"            ## change MYDEV to the guest's network interface
while true; do
    arping -q -U -c1 -I "$MYDEV" "$MYIP"    ## send one gratuitous ARP
    sleep 5
done

Root Cause

When a link failover occurs in an active-backup bond, the bonded host sends gratuitous ARP (GARP) traffic over the newly active bond slave for every IP address configured on top of the bond interface (including IPs on VLANs configured over it). This GARP is what actually implements the failover: it updates the port forwarding information (MAC address table) at the switch for the IPs and MAC address(es) owned by the bond. The switch sees the traffic arriving on the new switchport and learns the new switchport location for the bond's MAC address.

When this bond interface is plugged into a virtual bridge to provide network connectivity for virtual guests, the bonding code has no way of knowing all the IP and MAC addresses configured in the guests behind that bridge. Likewise, the virtual guests have no way of knowing that they are connected through a physically bonded interface, or that a failover has occurred. As a result, the port forwarding information for those guests is not updated at the switch, which still maps the guest MAC addresses to the switchport whose bond link has failed.

The switch only learns the new location of the guest MAC addresses once outbound packets are created at the guests, travelling through the virtual bridge and out of the bonding interface over the new active link. This can take some time: either the guests must generate traffic themselves, or a peer on the network must send an ARP broadcast for a guest, causing the guest to generate a reply (outbound traffic) which updates the forwarding database on the switch.
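The host-side GARP described above can be observed directly during a failover test. A diagnostic sketch, assuming the backup slave is named em2 (adjust to the local interface names) and run as root on the hypervisor:

```shell
# Watch ARP on the slave that will become active, then pull the cable
# on the current active slave. Gratuitous ARP should appear only for
# the hypervisor's own IP addresses; nothing is sent on behalf of the
# guest MAC addresses behind the bridge.
tcpdump -e -n -i em2 arp
```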

Diagnostic Steps

  • Start a ping from an external host to a guest running on a host where the bond is the virtual bridge's outgoing interface.
  • Interrupt the active bonding link to cause a failover.
  • Verify the pings fail even after link failover at the host, until some traffic is generated from the guest.
  • Perform the same link failover test, this time pinging an external host from the guest, and notice it is not affected by link failover.
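The failover in the steps above can be confirmed on the hypervisor by checking which slave the bond reports as active before and after the link is interrupted. A minimal sketch, assuming the bond is named bond0 (pass a different status file path as the first argument otherwise):

```shell
#!/bin/bash
# Report the bond's currently active slave from the kernel's bonding
# status file (bond0 is an assumed name; override via first argument).
BOND_STATUS="${1:-/proc/net/bonding/bond0}"
awk -F': ' '/Currently Active Slave/ {print $2}' "$BOND_STATUS"
```

Run it once before and once after interrupting the link: the reported slave should change at failover even while pings to the guest are still failing, confirming the problem is stale switch forwarding rather than the bond itself.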

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.