OVS-DPDK link status polling causes "Unreasonably long poll interval" warnings and packet drops

Solution Unverified - Updated

Environment

Red Hat OpenStack Platform 17.1 (OVS 3.3), DPDK physical ports with bonding, any NIC driver supporting LSC interrupts (mlx5, i40e, ice, ixgbe, bnxt, etc.).

Issue

In Red Hat OpenStack Platform 17.1 environments using OVS-DPDK, OVS logs show frequent warnings:

timeval|WARN|Unreasonably long 198ms poll interval (5ms user, 5ms system)

This is accompanied by packet drops and intermittent connectivity issues, particularly on high-core-count compute nodes using Mellanox (mlx5) or Intel (i40e/ice) NICs with bonded interfaces.

Resolution

Enable LSC interrupt mode on DPDK interfaces. The dpdk-lsc-interrupt option is a per-interface setting — it must be set on each individual DPDK interface, not on the bond itself.

  • Manual setting changes on nodes (impermanent):

    Set dpdk-lsc-interrupt with the following command, once for each DPDK member interface:
    Note: This setting is reverted when the interface goes down and comes back up, or when the node reboots.

      ovs-vsctl set interface <dpdk-interface-name> options:dpdk-lsc-interrupt=true
    

    To verify the setting is active:

      ovs-vsctl get interface dpdk0 options:dpdk-lsc-interrupt
    

    To keep the setting across port down/up cycles and node reboots, add set interface <dpdk-interface-name> options:dpdk-lsc-interrupt=true to OVS_EXTRA in the ifcfg file.
    If the ifcfg file already has an OVS_EXTRA setting, append the new setting using the -- separator.
    Note: This setting can be reverted when a Director deployment updates the network settings.

    For a standalone DPDK port, add OVS_EXTRA to the port configuration:

      # vi /etc/sysconfig/network-scripts/ifcfg-dpdk0
         :
      OVS_EXTRA="...(existing settings)... -- set interface dpdk0 options:dpdk-lsc-interrupt=true"
    

    For bonded DPDK ports, the OVS_EXTRA must be set on the bond and must reference each member interface by name explicitly.

      # vi /etc/sysconfig/network-scripts/ifcfg-dpdkbond0
         :
      OVS_EXTRA="...(existing settings)... -- set interface dpdk0 options:dpdk-lsc-interrupt=true -- set interface dpdk1 options:dpdk-lsc-interrupt=true"
    
  • Setting via Director deployment (permanent):

    The setting can be added to ovs_extra in the custom network interface template files.

    For a standalone DPDK port, add ovs_extra to the port configuration:

      - type: ovs_dpdk_port
        name: dpdk0
        ovs_extra:
          - set Interface $DEVICE options:dpdk-lsc-interrupt=true
    

    For bonded DPDK ports, the ovs_extra must be set on the bond and must reference each member interface by name explicitly. The $DEVICE variable cannot be used here as it resolves to the bond name, not the individual member interfaces:

      - type: ovs_dpdk_bond
        name: dpdkbond0
        ovs_extra:
          - set Interface dpdk0 options:dpdk-lsc-interrupt=true
          - set Interface dpdk1 options:dpdk-lsc-interrupt=true
        members:
          - type: ovs_dpdk_port
            name: dpdk0
            members:
              - type: interface
                name: nic2
          - type: ovs_dpdk_port
            name: dpdk1
            members:
              - type: interface
                name: nic3
    

    To apply the changes on existing nodes, refer to "Updating network configuration after a deployment With Red Hat OpenStack Platform Director".

    To verify the setting is active:

      ovs-vsctl get interface dpdk0 options:dpdk-lsc-interrupt
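
The manual steps above can be sketched as a small shell helper that sets the option on every DPDK interface and reads it back. This is a minimal sketch, not a supported tool: the function names and the OVS_VSCTL override variable are hypothetical and exist only for illustration, and it assumes every physical DPDK port is an Interface record of type dpdk. Like the manual command, it does not persist across reboots or Director updates.

```shell
#!/bin/sh
# Sketch: enable LSC interrupt mode on every DPDK interface known to OVS.
# OVS_VSCTL is a hypothetical override hook (not part of OVS); it defaults
# to the real ovs-vsctl binary.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# Print the name of each Interface record whose type is "dpdk", one per line.
# --bare output separates records with blank lines, so those are filtered out.
list_dpdk_interfaces() {
    "$OVS_VSCTL" --bare --columns=name find interface type=dpdk | grep -v '^$'
}

# Set the option on each member interface (never on the bond), then read the
# value back so the result can be checked per interface.
enable_lsc_interrupt() {
    for iface in $(list_dpdk_interfaces); do
        "$OVS_VSCTL" set interface "$iface" options:dpdk-lsc-interrupt=true
        printf '%s: %s\n' "$iface" \
            "$("$OVS_VSCTL" get interface "$iface" options:dpdk-lsc-interrupt)"
    done
}
```

After sourcing the file as root on the compute node, running enable_lsc_interrupt prints one line per DPDK interface with the value read back from the database.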
    

Root Cause

OVS 3.3 (shipped with RHOSP 17.1) defaults dpdk-lsc-interrupt to false, meaning link status changes are detected by polling. In polling mode, every link status query calls the driver's link_update() operation, which reads NIC registers directly. On mlx5, this goes through ethtool kernel calls that can block on the RTNL lock for an indeterminate amount of time. Because bond_run() queries link status while holding the bond write lock, datapath PMD threads calling bond_check_admissibility() stall waiting for the read lock, which causes the long poll interval warnings and packet drops.

This affects all DPDK drivers, not only mlx5. Most drivers support LSC interrupts (Intel ixgbe, ice, i40e, iavf, igb, Broadcom bnxt, Marvell/Cavium cnxk, virtio, etc.) and benefit from interrupt mode: zero-cost link status queries and sub-millisecond bonding failover detection. The only trade-off is consuming one interrupt vector per port. Drivers that lack LSC support are unaffected since OVS automatically falls back to polling.

The mlx5 case is the most impactful due to the driver's bifurcated architecture, but enabling LSC interrupt mode benefits all drivers that support it. Apart from the interrupt vector noted above, LSC interrupt mode has no known negative side effects.

Diagnostic Steps

Check ovs-vswitchd.log for lines like the following:

timeval|WARN|Unreasonably long 198ms poll interval (5ms user, 5ms system)
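
To gauge how often the warning occurs and how long the worst stall was, the log can be summarised with standard text tools. The following is a minimal sketch: the function name long_poll_summary is hypothetical, and the log path in the usage comment is the usual default location, which is an assumption and may differ on a given node.

```shell
#!/bin/sh
# Summarise "Unreasonably long ... poll interval" warnings in an
# ovs-vswitchd log file.
# Usage (path is an assumed default):
#   long_poll_summary /var/log/openvswitch/ovs-vswitchd.log
long_poll_summary() {
    logfile="$1"
    # Extract the millisecond value from each matching warning line.
    intervals=$(sed -n \
        's/.*Unreasonably long \([0-9][0-9]*\)ms poll interval.*/\1/p' \
        "$logfile")
    # Count the extracted values and find the largest one.
    count=$(printf '%s\n' "$intervals" | grep -c '[0-9]')
    max=$(printf '%s\n' "$intervals" | sort -n | tail -n 1)
    echo "warnings=$count max_ms=${max:-0}"
}
```

Running the function before and after enabling LSC interrupt mode, against log segments of similar duration, gives a rough indication of whether the stalls have stopped.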

Also check for RX missed packets on the physical ports:

$ ovs-vsctl list interface | grep -e '^statistics.*rx_missed_errors=[1-9]' -e '^name' | grep -B1 'rx_missed_errors=[1-9]'
name                : dpdk3
statistics          : {ovs_rx_qos_drops=0, ovs_tx_failure_drops=0, ovs_tx_invalid_hwol_drops=0, ovs_tx_mtu_exceeded_drops=0, ovs_tx_qos_drops=0, rx_broadcast_packets=7957071, rx_bytes=64314042275774, rx_dropped=8674305, rx_errors=0, rx_mbuf_allocation_errors=0, rx_missed_errors=8674305, rx_multicast_packets=5529717, rx_packets=69693289184, rx_phy_crc_errors=0, rx_phy_in_range_len_errors=0, rx_phy_symbol_errors=0, rx_q0_bytes=20683199971667, rx_q0_errors=0, rx_q0_packets=22000448457, rx_q1_bytes=14496966375476, rx_q1_errors=0, rx_q1_packets=16017393152, rx_q2_bytes=14649113370331, rx_q2_errors=0, rx_q2_packets=15845519680, rx_q3_bytes=14484762511406, rx_q3_errors=0, rx_q3_packets=15829927851, rx_wqe_errors=0, tx_broadcast_packets=88370, tx_bytes=108518015442385, tx_dropped=0, tx_errors=0, tx_multicast_packets=527761, tx_packets=93383866827, tx_phy_errors=0, tx_pp_clock_queue_errors=0, tx_pp_missed_interrupt_errors=0, tx_pp_rearm_queue_errors=0, tx_pp_timestamp_future_errors=0, tx_pp_timestamp_order_errors=0, tx_pp_timestamp_past_errors=0, tx_q0_bytes=65193372, tx_q0_packets=525753, tx_q10_bytes=0, tx_q10_packets=0, tx_q11_bytes=0, tx_q11_packets=0, tx_q12_bytes=0, tx_q12_packets=0, tx_q13_bytes=0, tx_q13_packets=0, tx_q14_bytes=0, tx_q14_packets=0, tx_q15_bytes=0, tx_q15_packets=0, tx_q1_bytes=0, tx_q1_packets=0, tx_q2_bytes=7706254713, tx_q2_packets=11952855, tx_q3_bytes=0, tx_q3_packets=0, tx_q4_bytes=108505929319869, tx_q4_packets=93355770650, tx_q5_bytes=0, tx_q5_packets=0, tx_q6_bytes=0, tx_q6_packets=0, tx_q7_bytes=0, tx_q7_packets=0, tx_q8_bytes=4284048222, tx_q8_packets=15511904, tx_q9_bytes=30611576, tx_q9_packets=105646}
--
name                : dpdk2
statistics          : {ovs_rx_qos_drops=0, ovs_tx_failure_drops=0, ovs_tx_invalid_hwol_drops=0, ovs_tx_mtu_exceeded_drops=0, ovs_tx_qos_drops=0, rx_broadcast_packets=7177652, rx_bytes=71131846140441, rx_dropped=5198496, rx_errors=0, rx_mbuf_allocation_errors=0, rx_missed_errors=5198496, rx_multicast_packets=1783367, rx_packets=76376642722, rx_phy_crc_errors=0, rx_phy_in_range_len_errors=0, rx_phy_symbol_errors=0, rx_q0_bytes=27024225596073, rx_q0_errors=0, rx_q0_packets=28306504928, rx_q1_bytes=14866612128240, rx_q1_errors=0, rx_q1_packets=16303293673, rx_q2_bytes=14783981980462, rx_q2_errors=0, rx_q2_packets=15985320232, rx_q3_bytes=14457026428546, rx_q3_errors=0, rx_q3_packets=15781523868, rx_wqe_errors=0, tx_broadcast_packets=763236, tx_bytes=35216498405297, tx_dropped=0, tx_errors=0, tx_multicast_packets=675261, tx_packets=63472836202, tx_phy_errors=0, tx_pp_clock_queue_errors=0, tx_pp_missed_interrupt_errors=0, tx_pp_rearm_queue_errors=0, tx_pp_timestamp_future_errors=0, tx_pp_timestamp_order_errors=0, tx_pp_timestamp_past_errors=0, tx_q0_bytes=65188908, tx_q0_packets=525717, tx_q10_bytes=0, tx_q10_packets=0, tx_q11_bytes=0, tx_q11_packets=0, tx_q12_bytes=0, tx_q12_packets=0, tx_q13_bytes=0, tx_q13_packets=0, tx_q14_bytes=0, tx_q14_packets=0, tx_q15_bytes=0, tx_q15_packets=0, tx_q1_bytes=0, tx_q1_packets=0, tx_q2_bytes=18852197580236, tx_q2_packets=18450635221, tx_q3_bytes=0, tx_q3_packets=0, tx_q4_bytes=210, tx_q4_packets=3, tx_q5_bytes=0, tx_q5_packets=0, tx_q6_bytes=0, tx_q6_packets=0, tx_q7_bytes=0, tx_q7_packets=0, tx_q8_bytes=12255179947529, tx_q8_packets=35918522873, tx_q9_bytes=4109055674785, tx_q9_packets=9103152283}
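
Because rx_missed_errors is a cumulative counter, a single reading does not show whether drops are still occurring. A minimal sketch for sampling one interface twice and printing the difference follows; the function name rx_missed_delta and the OVS_VSCTL override variable are hypothetical, and the 10-second default window is an arbitrary choice.

```shell
#!/bin/sh
# OVS_VSCTL is a hypothetical override hook (not part of OVS); it defaults
# to the real ovs-vsctl binary.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# Print how much rx_missed_errors grew on one interface over a sampling window.
# Usage: rx_missed_delta <interface> [seconds]
rx_missed_delta() {
    iface="$1"
    window="${2:-10}"
    # Read a single key from the statistics map instead of the full record.
    before=$("$OVS_VSCTL" get interface "$iface" statistics:rx_missed_errors)
    sleep "$window"
    after=$("$OVS_VSCTL" get interface "$iface" statistics:rx_missed_errors)
    echo "$iface rx_missed_errors +$((after - before)) in ${window}s"
}
```

For example, rx_missed_delta dpdk3 30 reports how many packets were missed on dpdk3 during a 30-second window; a delta that stays at zero after the change suggests the drops have stopped.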
