[OVN-InterConnection/OVN] Node to pod communication not working in OpenShift 4.14

Solution Verified - Updated

Environment

  • OpenShift Container Platform
    • 4.14
  • RHEL 8.6/9.2
  • RHEL CoreOS

Issue

  • After the SDN to OVN-Kubernetes migration, node-to-pod communication fails.
  • Some of the OCP 4 RHEL or RHEL CoreOS nodes are unable to push or pull images to/from the OCP internal registry, as connections time out.
  • Node-to-pod communication is not working between RHEL or RHEL CoreOS worker nodes.
  • Pod-to-pod communication is not working between RHEL or RHEL CoreOS worker nodes.

Resolution

  • Nodes created from a cloned disk are not supported.
  • Replace the cloned-disk nodes with new nodes.
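
Before replacing nodes, every node sharing a chassis ID can be listed. A minimal sketch of the pipeline, with sample placeholder IDs standing in for the real annotation dump (in a live cluster the input would come from the `oc get nodes` jsonpath query shown in the Root Cause section):

```shell
# Print any chassis ID that appears on more than one node.
# In a live cluster, replace the printf sample with:
#   oc get nodes -o=jsonpath='{range .items[*]}{.metadata.annotations.k8s\.ovn\.org/node-chassis-id}{"\n"}{end}'
# The IDs below are illustrative placeholders, not real values.
printf '%s\n' \
  '808a0446-aaaa' \
  'c1d2e3f4-bbbb' \
  '808a0446-aaaa' \
  | sort | uniq -d
# -> 808a0446-aaaa  (the duplicated chassis ID)
```

Any ID printed by `uniq -d` identifies a set of cloned-disk nodes that need replacement.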

Root Cause

  • Looking at when the cluster nodes were created, the oldest node with the duplicate chassis ID is node06. The other nodes with the identical chassis ID were all created on the same morning, which appears to be the source of this issue:
$ oc get nodes -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.k8s\.ovn\.org/node-chassis-id}{"\t"}{.metadata.creationTimestamp}{"\n"}{end}' | grep 808a04467-2dgg-4gf7-9ddb-7bc69sddsds

NAME       Node Chassis ID                         Creation Time Stamp
node06     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2023-02-07T06:58:32Z  
node09     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:57:48Z  
node10     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:51:46Z  
node11     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:15:37Z  
node12     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:33:18Z  
node13     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:45:52Z  
node14     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-01-26T10:08:07Z  
  • Incidentally, those are also the only nodes that were created on that day:
$ oc get nodes -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.k8s\.ovn\.org/node-chassis-id}{"\t"}{.metadata.creationTimestamp}{"\n"}{end}' | grep 2024-02-06

NAME       Node Chassis ID                         Creation Time Stamp
node09     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:57:48Z
node10     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:51:46Z
node11     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:15:37Z
node12     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:33:18Z
node13     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-02-06T10:45:52Z
node14     808a04467-2dgg-4gf7-9ddb-7bc69sddsds    2024-01-26T10:08:07Z
  • While the cluster was running OpenShift SDN, the duplicate system-id went unnoticed, because SDN has no Southbound database and does not use the system-id as a chassis_id. Once the cluster was migrated from SDN to OVN-Kubernetes, this issue suddenly surfaced.
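
The grouping done by hand above can be sketched as a small text-processing pipeline: given lines of name, chassis ID, and creation timestamp (the same jsonpath output used in this article), print only the duplicated groups, oldest node first, so the original (non-cloned) node is easy to spot. Node names and IDs below are illustrative samples:

```shell
# Input format: name<TAB>chassis-id<TAB>creationTimestamp, as produced by
# the jsonpath query in this article. Sample data is illustrative.
# sort orders by chassis ID, then timestamp (oldest first per group);
# awk buffers each group and prints only groups with more than one node.
printf 'node06\tdup-id\t2023-02-07T06:58:32Z\nnode07\tunique-id\t2023-03-01T00:00:00Z\nnode09\tdup-id\t2024-02-06T10:57:48Z\n' \
  | sort -k2,2 -k3,3 \
  | awk -F'\t' '{n[$2]++; l[$2] = l[$2] $0 "\n"} END {for (c in n) if (n[c] > 1) printf "%s", l[c]}'
# -> node06  dup-id  2023-02-07T06:58:32Z
#    node09  dup-id  2024-02-06T10:57:48Z
```

The first line of each printed group is the oldest node, i.e. the likely clone source; the later entries are the cloned-disk nodes.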

Diagnostic Steps

  • From one of the working nodes in the uploaded data, we can see that there are only 19 geneve port entries on OVN's br-int bridge:
#  egrep "\(ovn\-"  ovs-ofctl_-O_OpenFlow13_show_br-int

 2(ovn-24249b-0): addr:5a:43:ed:78:18:be
 3(ovn-6345a3-0): addr:de:b7:08:23:05:56
 4(ovn-3e1656-0): addr:f2:06:a4:65:af:6a
 5(ovn-808s04-1): addr:5a:44:b4:34:c4:0a
 6(ovn-5e1g6d-0): addr:ee:8d:ae:44:87:eb
 7(ovn-6e6v23-0): addr:76:8a:a0:67:e3:ed
 8(ovn-0d44e3-0): addr:86:47:cc:a2:99:f5
 9(ovn-994f68-0): addr:96:7b:4b:e5:a9:4f
 10(ovn-962a30-0): addr:e2:c9:56:76:b4:68
 11(ovn-bffg7a-0): addr:fa:bd:39:37:7e:3a
 12(ovn-f4sde7-0): addr:4a:9b:74:73:c4:d3
 13(ovn-785dfe5-0): addr:02:da:h5:b1:59:50
 14(ovn-b044ds1-0): addr:1a:e5:k7:e3:d2:b8
 15(ovn-1aa0fs-0): addr:66:d0:6f:0a:b7:72
 16(ovn-f0fhgj-0): addr:22:b9:8s:cc:74:1a
 17(ovn-ceajjj-0): addr:da:86:cc:f0:75:3c
 18(ovn-6680er-0): addr:de:14:aa:34:3d:7c
 19(ovn-03c867-0): addr:ee:a2:ca:0d:43:ca
 20(ovn-5f8489-0): addr:b2:0d:ba:59:15:47
  • The cluster has 22 nodes, so a healthy node should have 21 geneve ports (one tunnel per remote node). Two ports are missing here, meaning 2 nodes are affected.
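
The arithmetic behind that bullet can be sketched as a quick sanity check: each node holds one geneve tunnel port per remote chassis, so the expected count is the node total minus one, and the shortfall is the number of unreachable nodes:

```shell
# Expected geneve ports on a healthy node = total nodes - 1
# (a node has no tunnel to itself). Observed count comes from the
# egrep over the ovs-ofctl show output above (19 in this case).
total_nodes=22
observed_ports=19
expected_ports=$((total_nodes - 1))
missing=$((expected_ports - observed_ports))
echo "expected=${expected_ports} observed=${observed_ports} missing=${missing}"
# -> expected=21 observed=19 missing=2
```

A non-zero `missing` value on every node pointed at the same set of chassis is consistent with those chassis sharing a duplicate system-id.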

  • Below are the allow rules for the geneve ports above (19 rules). There are no rules for the remaining affected nodes, of which node06 is one:

#  grep TUN ovs-ofctl_-O_OpenFlow13_dump-flows_br-int | grep "table=0" | grep -v icmp
 cookie=0x0, duration=36269.294s, table=0, n_packets=23268, n_bytes=8001843, priority=100,in_port=2 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=16334, n_bytes=1358461, priority=100,in_port=9 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=86, n_bytes=12387, priority=100,in_port=8 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=162, n_bytes=21648, priority=100,in_port=15 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=55, n_bytes=9260, priority=100,in_port=4 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=52127, n_bytes=17948025, priority=100,in_port=6 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=103, n_bytes=14739, priority=100,in_port=3 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=119, n_bytes=14580, priority=100,in_port=7 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=17558, n_bytes=1440517, priority=100,in_port=18 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=108, n_bytes=14698, priority=100,in_port=16 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=120, n_bytes=18068, priority=100,in_port=20 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=91, n_bytes=12452, priority=100,in_port=10 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=140, n_bytes=20502, priority=100,in_port=13 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.294s, table=0, n_packets=112, n_bytes=14861, priority=100,in_port=12 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.293s, table=0, n_packets=237, n_bytes=27056, priority=100,in_port=5 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.293s, table=0, n_packets=128, n_bytes=18993, priority=100,in_port=17 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.293s, table=0, n_packets=915717, n_bytes=18917071849, priority=100,in_port=11 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.293s, table=0, n_packets=97, n_bytes=13465, priority=100,in_port=14 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)
 cookie=0x0, duration=36269.293s, table=0, n_packets=33798, n_bytes=7628812, priority=100,in_port=19 actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,40)


#  grep TUN ovs-ofctl_-O_OpenFlow13_dump-flows_br-int | grep "table=0" | grep -v icmp| wc -l
19

  • The same can be seen in the tcpdumps. For the explanation, a random source port (50023) was chosen; however, all packets from the affected nodes show the same behaviour.

  • In the geneve capture on the bad (source) node, we can see the SYN packet hit the geneve tunnel and then go out via the main interface [1]. It gets no response to its SYN and hence starts retransmitting:

#  tshark -nr non-working-node-genev_sys_6081.pcap -tad -Y "tcp.port == 5000 && tcp.port == 50023" -T fields  -e frame.number -e frame.time -e frame.time_delta_displayed  -e tcp.time_delta -e ip.src -e ip.dst -e ip.ttl -e ip.id -e tcp.port -e tcp.seq -e tcp.nxtseq -e tcp.ack -e _ws.col.Info 
4548    Apr  2, 2024 06:35:02.589328000 UTC    0.000000000    0.000000000    100.64.0.14    10.128.3.50    62    0x4fac    50023,5000    0    1    0    50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075107372 TSecr=0 WS=128
4556    Apr  2, 2024 06:35:03.648195000 UTC    1.058867000    1.058867000    100.64.0.14    10.128.3.50    62    0x4fad    50023,5000    0    1    0    [TCP Retransmission] 50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075108433 TSecr=0 WS=128
4562    Apr  2, 2024 06:35:05.696195000 UTC    2.048000000    2.048000000    100.64.0.14    10.128.3.50    62    0x4fae    50023,5000    0    1    0    [TCP Retransmission] 50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075110481 TSecr=0 WS=128
  • The same packet can be seen on the bad node's main interface:

[1]

#  tshark -nr non-working-node-main-interface.pcap -tad -Y "tcp.port == 5000 && tcp.port == 50023" -T fields  -e frame.number -e frame.time -e frame.time_delta_displayed  -e tcp.time_delta -e ip.src -e ip.dst -e ip.ttl -e ip.id -e tcp.port -e tcp.seq -e tcp.nxtseq -e tcp.ack -e _ws.col.Info 
89831    Apr  2, 2024 06:35:02.589358000 UTC    0.000000000    0.000000000    10.205.56.236,100.64.0.14    10.205.56.140,10.128.3.50    64,62    0xdcb6,0x4fac    50023,5000    0    1    0    50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075107372 TSecr=0 WS=128
89907    Apr  2, 2024 06:35:03.648220000 UTC    1.058862000    1.058862000    10.205.56.236,100.64.0.14    10.205.56.140,10.128.3.50    64,62    0xdfb4,0x4fad    50023,5000    0    1    0    [TCP Retransmission] 50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075108433 TSecr=0 WS=128
90377    Apr  2, 2024 06:35:05.696211000 UTC    2.047991000    2.047991000    10.205.56.236,100.64.0.14    10.205.56.140,10.128.3.50    64,62    0xe713,0x4fae    50023,5000    0    1    0    [TCP Retransmission] 50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075110481 TSecr=0 WS=128
  • The same packet trail can be seen in the geneve capture on the image-registry pod's node (a working node). The packets appear to be dropped there, as no packets reach the pod in the final trail:
#  tshark -nr 0470-genev_sys_6081-working-node.pcap -tad -Y "tcp.port == 5000 && tcp.port == 50023" -T fields  -e frame.number -e frame.time -e frame.time_delta_displayed  -e tcp.time_delta -e ip.src -e ip.dst -e ip.ttl -e ip.id -e tcp.port -e tcp.seq -e tcp.nxtseq -e tcp.ack -e _ws.col.Info 
380342    Apr  2, 2024 06:35:02.589406000 UTC    0.000000000    0.000000000    100.64.0.14    10.128.3.50    62    0x4fac    50023,5000    0    1    0    50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075107372 TSecr=0 WS=128
380444    Apr  2, 2024 06:35:03.648222000 UTC    1.058816000    1.058816000    100.64.0.14    10.128.3.50    62    0x4fad    50023,5000    0    1    0    [TCP Retransmission] 50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075108433 TSecr=0 WS=128
380619    Apr  2, 2024 06:35:05.696195000 UTC    2.047973000    2.047973000    100.64.0.14    10.128.3.50    62    0x4fae    50023,5000    0    1    0    [TCP Retransmission] 50023 → 5000 [SYN] Seq=0 Win=27200 Len=0 MSS=1360 SACK_PERM TSval=4075110481 TSecr=0 WS=128
  • Pod interface trail:
#  tshark -nr 0490-image-pod.pcap -tad -Y "tcp.port == 5000 && tcp.port == 50023" -T fields  -e frame.number -e frame.time -e frame.time_delta_displayed  -e tcp.time_delta -e ip.src -e ip.dst -e ip.ttl -e ip.id -e tcp.port -e tcp.seq -e tcp.nxtseq -e tcp.ack -e _ws.col.Info

>>> no output
  • The geneve ports for these 2 nodes are missing on the image-registry node as well as on the affected nodes:
#  egrep "\(ovn\-"   0500-indhygstsdsprm01-flow.tar.gz/indhygstsdsprm01-flow/show.br-int| grep -v mp0 |wc -l
19

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.