[OVN] Multi-homed node - No Route From POD network to second NIC network
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.8.z
- 4.9.z
- OVN-Kubernetes
- RHOCP nodes with multiple NIC configuration (Multi-homed node)
Issue
- After upgrading from 4.7.z to 4.8.z, the RHOCP cluster no longer provides the expected routing path from the POD internal network to external networks through secondary NICs on the RHOCP nodes.
Resolution
To prevent this issue from occurring during an upgrade from RHOCP 4.7.z to 4.8.z, and later from 4.8.z to 4.9.z, create a ConfigMap that forces the OVN gateway mode to local:
- Create the gateway-mode-config ConfigMap in the namespace openshift-network-operator:

  $ cat sharedGW.yml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: gateway-mode-config
    namespace: openshift-network-operator
  data:
    mode: "local"
  immutable: true

  $ oc apply -f sharedGW.yml
  configmap/gateway-mode-config created

- Upgrade to RHOCP 4.8.z.

- Check the gateway mode configuration used by the ovnkube-master PODs:

  $ oc logs ovnkube-master-<ID> -c ovnkube-master | grep -i mode
  + gateway_mode_flags='--gateway-mode local --gateway-interface br-ex'
  + exec /usr/bin/ovnkube --init-master ip-10-0-138-37.ec2.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4 --metrics-bind-address 127.0.0.1:29102 --metrics-enable-pprof --gateway-mode local --gateway-interface br-ex
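When checking many masters, the mode can be extracted from the log line programmatically. The following is an illustrative sketch only, assuming the `gateway_mode_flags` log format shown above (the sample line is hard-coded here; in practice it would come from `oc logs`):

```python
import re

# Sample log line in the format emitted by the ovnkube-master container
log_line = "+ gateway_mode_flags='--gateway-mode local --gateway-interface br-ex'"

# Extract the value that follows the --gateway-mode flag
match = re.search(r"--gateway-mode\s+(\w+)", log_line)
if match:
    print(f"OVN gateway mode: {match.group(1)}")  # "local" after applying the ConfigMap
```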
Root Cause
Previously, with RHOCP 4.7.z, the OVN gateway mode was configured as local by default, which means that all traffic went via the host for routing. Thanks to this behavior, customers could route traffic from PODs on the internal network to external services through a non-default gateway.
Starting in RHOCP 4.8.z, the OVN gateway mode is configured as shared by default, and this behavior no longer works: traffic egresses the node without going to the host for routing and only uses the routes known to OVN.
Starting in RHOCP 4.10.z, this has been addressed by a new API field called "routingViaHost", which, when enabled, forces all egress traffic to go through the host network namespace first.
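The difference between the two modes comes down to which routing table egress traffic consults. The following is a minimal longest-prefix-match sketch (the route tables are hypothetical, simplified illustrations, not OVN's actual implementation): in local mode the host routing table carries a connected route for the second NIC's subnet, while in shared mode only OVN's logical routes apply, so traffic for 192.168.200.0/24 falls through to the default route.

```python
import ipaddress

def lookup(routes, dst):
    """Return the next hop for dst via longest-prefix match."""
    dst = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(cidr), nh) for cidr, nh in routes
               if dst in ipaddress.ip_network(cidr)]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical host routing table (local gateway mode: egress goes via the host)
host_routes = [
    ("0.0.0.0/0", "192.168.100.1"),    # default via the primary NIC gateway
    ("192.168.200.0/24", "dev eth1"),  # connected route on the second NIC
]

# Hypothetical OVN-only view (shared gateway mode: host table is bypassed)
ovn_routes = [
    ("0.0.0.0/0", "192.168.100.1"),    # only the default route is known
]

print(lookup(host_routes, "192.168.200.12"))  # local mode: second NIC is used
print(lookup(ovn_routes, "192.168.200.12"))   # shared mode: default gateway only
```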
Diagnostic Steps
- Using the following Three-Node OpenShift Compact Cluster configuration, a second NIC was added to each node in the subnet 192.168.200.0/24:
- master1:
- primary ip (MachineNetworkCIDR): 192.168.100.10/24
- secondary ip: 192.168.200.10/24
- master2:
- primary ip(MachineNetworkCIDR): 192.168.100.11/24
- secondary ip: 192.168.200.11/24
- master3:
- primary ip(MachineNetworkCIDR): 192.168.100.12/24
- secondary ip: 192.168.200.12/24
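The addressing above can be sanity-checked with a short Python sketch (illustrative only; the subnets are taken from the layout above). The secondary addresses are outside the MachineNetwork, so OVN has no logical route for them and reaching them depends on the host routing table:

```python
import ipaddress

machine_net = ipaddress.ip_network("192.168.100.0/24")    # MachineNetworkCIDR
secondary_net = ipaddress.ip_network("192.168.200.0/24")  # second-NIC subnet

secondary_ips = ["192.168.200.10", "192.168.200.11", "192.168.200.12"]

for ip in secondary_ips:
    addr = ipaddress.ip_address(ip)
    # Each secondary IP is in the second-NIC subnet but not in the MachineNetwork
    print(ip, "in MachineNetwork:", addr in machine_net,
          "- in secondary subnet:", addr in secondary_net)
```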
- POD IP 10.128.0.45 on master1:

  $ oc get pods -o wide
  NAME                                   READY   STATUS    RESTARTS   AGE   IP            NODE                           NOMINATED NODE   READINESS GATES
  multitool-openshift-58d96959c4-6txzk   1/1     Running   0          16m   10.128.0.45   ip-10-0-128-253.ec2.internal   <none>           <none>

- The cluster has been recently upgraded to RHOCP 4.8.39:

  $ oc get nodes -o wide
  NAME                           STATUS   ROLES           AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                    KERNEL-VERSION                 CONTAINER-RUNTIME
  ip-10-0-128-253.ec2.internal   Ready    master,worker   7h    v1.21.8+ed4d8fd   192.168.100.10   <none>        Red Hat Enterprise Linux CoreOS 48.84.202204202010-0 (Ootpa)   4.18.0-305.45.1.el8_4.x86_64   cri-o://1.21.6-3.rhaos4.8.git19780ee.2.el8
  ip-10-0-142-9.ec2.internal     Ready    master,worker   7h    v1.21.8+ed4d8fd   192.168.100.11   <none>        Red Hat Enterprise Linux CoreOS 48.84.202204202010-0 (Ootpa)   4.18.0-305.45.1.el8_4.x86_64   cri-o://1.21.6-3.rhaos4.8.git19780ee.2.el8
  ip-10-0-143-199.ec2.internal   Ready    master,worker   7h    v1.21.8+ed4d8fd   192.168.100.12   <none>        Red Hat Enterprise Linux CoreOS 48.84.202204202010-0 (Ootpa)   4.18.0-305.45.1.el8_4.x86_64   cri-o://1.21.6-3.rhaos4.8.git19780ee.2.el8

  $ oc get clusterversion
  NAME      VERSION   AVAILABLE   PROGRESSING   SINCE    STATUS
  version   4.8.39    True        False         3h35m    Cluster version is 4.8.39
Steps to Reproduce:
- Test 1 - Not reachable - POD IP 10.130.0.57 on master1 can't ping master3's second NIC 192.168.200.12:

  $ ping -c 3 192.168.200.12
  PING 192.168.200.12 (192.168.200.12) 56(84) bytes of data.

  --- 192.168.200.12 ping statistics ---
  3 packets transmitted, 0 received, 100% packet loss, time 2035ms

  $ tracepath -n 192.168.200.12
   1?: [LOCALHOST]                      pmtu 8901
   1:  10.128.0.1                       1.383ms asymm  2
   1:  10.128.0.1                       1.222ms asymm  2
   2:  100.64.0.3                       1.458ms asymm  3
   3:  no reply
  (..)
  28:  no reply
  29:  no reply
  30:  no reply
       Too many hops: pmtu 8901
       Resume: pmtu 8901
  $

- Test 2 - OK - POD IP 10.128.0.45 on master1 can ping master1's second NIC 192.168.200.10:

  $ ping -c 3 192.168.200.10
  PING 192.168.200.10 (192.168.200.10) 56(84) bytes of data.
  64 bytes from 192.168.200.10: icmp_seq=1 ttl=64 time=0.108 ms
  64 bytes from 192.168.200.10: icmp_seq=2 ttl=64 time=0.077 ms
  64 bytes from 192.168.200.10: icmp_seq=3 ttl=64 time=0.093 ms

  --- 192.168.200.10 ping statistics ---
  3 packets transmitted, 3 received, 0% packet loss, time 2042ms
  rtt min/avg/max/mdev = 0.077/0.092/0.108/0.012 ms

  $ tracepath -n 192.168.200.10
   1?: [LOCALHOST]                      pmtu 8901
   1:  10.128.0.1                       1.636ms asymm  2
   1:  10.128.0.1                       1.059ms asymm  2
   2:  192.168.200.10                   1.125ms reached
       Resume: pmtu 8901 hops 2 back 1
  $
Actual results: POD IP 10.130.0.57 on master1 can't ping to master3's second NIC 192.168.200.12.
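The second hop (100.64.0.3) seen in the failing tracepath is itself a hint: assuming the OVN-Kubernetes default join subnet 100.64.0.0/16 (internal gateway-router link addresses), the packet was routed by the OVN gateway router rather than the host network stack, which is consistent with shared gateway mode. A quick illustrative check:

```python
import ipaddress

# Assumed OVN-Kubernetes default join subnet (internal gateway-router links)
join_subnet = ipaddress.ip_network("100.64.0.0/16")

# Second hop observed in the failing tracepath of Test 1
hop = ipaddress.ip_address("100.64.0.3")

# True means the hop belongs to OVN's internal join subnet, i.e. the packet
# was handled by the OVN gateway router instead of the host routing table.
print(hop in join_subnet)  # True
```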
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.