Access to ExternalIP services stopped working after upgrading OpenShift with OVN-Kubernetes
Environment
Red Hat OpenShift Container Platform 4.8.34+, 4.9.23+ and 4.10.3+
Issue
After upgrading Red Hat OpenShift Container Platform (RHOCP) with OVN-Kubernetes, ingress access to services via ExternalIP stopped working and results in "No Route to Host" errors.
Resolution
This applies to "native" access to ExternalIP services in RHOCP with OVN-Kubernetes. If access stops working with "No Route to Host" errors or connection timeouts after upgrading to 4.8.34 or later, 4.9.23 or later, or 4.10.3 or later, either migrate the ExternalIPs of these services so that they are managed by Ipfailover or MetalLB (in the case of 4.9 and 4.10), or create the necessary routes on the infrastructure so that traffic can reach the ExternalIPs defined on the services, as explained in the RHOCP documentation.
If an upgrade to one of these releases is planned, migrate these services before the cluster upgrade to avoid disrupting your users' services.
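Before migrating, it helps to know which services carry ExternalIPs. The following is a minimal sketch rather than an official procedure. The function name `services_with_external_ips` is hypothetical, and it assumes the default column layout of `oc get services --all-namespaces --no-headers` (NAMESPACE, NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE). It only looks at ClusterIP and NodePort services, on the assumption that for those types the EXTERNAL-IP column reflects the ExternalIPs set on the service.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of
#   oc get services --all-namespaces --no-headers
# on stdin and prints "<namespace>/<service> <external-ips>" for every
# ClusterIP or NodePort service whose EXTERNAL-IP column is not "<none>".
# Assumes the default column order described in the text above.
services_with_external_ips() {
  awk '($3 == "ClusterIP" || $3 == "NodePort") && $5 != "<none>" {
         print $1 "/" $2, $5
       }'
}

# Intended use against a live cluster (not executed here):
#   oc get services --all-namespaces --no-headers | services_with_external_ips
```

Each line of output names one service whose ExternalIPs would need to be covered by Ipfailover, MetalLB or infrastructure routes.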
For example, using a scenario where multiple services in one project have ExternalIPs configured, a group of ipfailover replicas can be configured to expose those IPs:
$ cat ipfailover-deploy-example.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ipfailover-group1
  labels:
    ipfailover: group1
spec:
  strategy:
    type: Recreate
  replicas: 2
  selector:
    matchLabels:
      ipfailover: group1
  template:
    metadata:
      labels:
        ipfailover: group1
    spec:
      serviceAccountName: ipfailover
      privileged: true
      hostNetwork: true
      containers:
      - name: openshift-ipfailover
        image: quay.io/openshift/origin-keepalived-ipfailover:<ocp-release>
        ports:
        - containerPort: 63000 # When using multiple ipfailover deployments, set different container and host ports to avoid conflicts.
          hostPort: 63000
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true
        volumeMounts:
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
        - name: host-slash
          mountPath: /host
          readOnly: true
          mountPropagation: HostToContainer
        - name: etc-sysconfig
          mountPath: /etc/sysconfig
          readOnly: true
        - name: config-volume
          mountPath: /etc/keepalive
        env:
        - name: OPENSHIFT_HA_CONFIG_NAME
          value: "ipfailover-group1"
        - name: OPENSHIFT_HA_VIRTUAL_IPS
          value: "<ExternalIP_1>,<ExternalIP_2>,<ExternalIP_3>" # The ExternalIPs set on the services.
        - name: OPENSHIFT_HA_VIP_GROUPS
          value: "1"
        - name: OPENSHIFT_HA_NETWORK_INTERFACE
          value: "br-ex" # On OVN this is the external interface of the node.
        - name: OPENSHIFT_HA_MONITOR_PORT
          value: "0" # Must be 0 when multiple services are managed by the same ipfailover group. Otherwise the service port can be used.
        - name: OPENSHIFT_HA_VRRP_ID_OFFSET
          value: "1"
        - name: OPENSHIFT_HA_REPLICA_COUNT
          value: "2"
        - name: OPENSHIFT_HA_USE_UNICAST
          value: "false"
        - name: OPENSHIFT_HA_IPTABLES_CHAIN
          value: "INPUT"
        - name: OPENSHIFT_HA_CHECK_SCRIPT
          value: "/etc/keepalive/mycheckscript.sh"
        - name: OPENSHIFT_HA_PREEMPTION
          value: "preempt_delay 300"
        - name: OPENSHIFT_HA_CHECK_INTERVAL
          value: "2"
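The Deployment above references a service account named `ipfailover` and runs a privileged container, so the service account has to exist and be allowed to use the privileged SCC before the pods can start. The following is a minimal sketch of those steps, not a complete procedure. The `run` wrapper and the `prepare_ipfailover` function are hypothetical helpers, and the project name is passed in as an argument. By default the commands are only printed, so they can be reviewed before anything is changed on the cluster.

```shell
#!/bin/sh
# Hypothetical wrapper: prints the command unless DRY_RUN=0 is set,
# in which case the command is executed.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Hypothetical helper: prepares the project given as $1 for the example
# Deployment stored in ipfailover-deploy-example.yaml.
prepare_ipfailover() {
  project=$1
  # Service account referenced by serviceAccountName in the Deployment.
  run oc create serviceaccount ipfailover -n "$project"
  # The container sets securityContext.privileged, so the service account
  # needs access to the privileged SCC.
  run oc adm policy add-scc-to-user privileged -z ipfailover -n "$project"
  # Create the Deployment from the example file.
  run oc apply -n "$project" -f ipfailover-deploy-example.yaml
}
```

Calling `prepare_ipfailover my-project` prints the three commands. Calling it with `DRY_RUN=0` set in the environment runs them instead.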
Once the Ipfailover pods start, the ExternalIPs appear in the network configuration of the nodes and access to the services resumes. To confirm this, run one of the following commands on the nodes where the ipfailover pods have been scheduled:
# ip -d -c addr show
or
# nmcli -p dev show br-ex
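The same check can be scripted. The following is a minimal sketch, assuming the one-line-per-address output of `ip -o -4 addr show dev br-ex`, in which the third field is `inet` and the fourth is the address in CIDR notation. The function name `vip_on_interface` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip -o -4 addr show dev <interface>` output on
# stdin and succeeds (exit 0) if the address given as $1 is configured.
# The prefix length after "/" is ignored when comparing.
vip_on_interface() {
  awk -v vip="$1" '
    $3 == "inet" { split($4, addr, "/"); if (addr[1] == vip) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Intended use on a node hosting an ipfailover pod (not executed here):
#   ip -o -4 addr show dev br-ex | vip_on_interface <ExternalIP_1> && echo "VIP present"
```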
Root Cause
The OVN-Kubernetes code was changed so that OVN no longer answers ARP requests for LoadBalancer and ExternalIP services. This avoids conflicts with other components that perform the same task, such as the Ipfailover and MetalLB speaker pods. More information is available in the upstream change on GitHub.
An update to the official documentation has been requested in Bugzilla ticket 2076662, so that a warning about this change is shown before an upgrade is done.
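Because the symptom comes from unanswered ARP requests, a Linux client on the same layer 2 segment as the ExternalIP would be expected to have no resolved neighbor entry for it. The following is a minimal sketch of that check, assuming the client provides `ip neigh` and that a resolved entry contains an `lladdr` field. The function name `arp_resolved` is hypothetical, and a stale entry learned before the upgrade could still appear resolved for a while.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip neigh show` output on stdin and succeeds
# (exit 0) if the entry for the address given as $1 carries a link-layer
# address. Entries in FAILED or INCOMPLETE state have no lladdr field.
arp_resolved() {
  awk -v ip="$1" '
    $1 == ip { for (i = 2; i <= NF; i++) if ($i == "lladdr") resolved = 1 }
    END { exit (resolved ? 0 : 1) }
  '
}

# Intended use on a client in the same subnet (not executed here):
#   ping -c 1 172.23.188.12 >/dev/null 2>&1
#   ip neigh show | arp_resolved 172.23.188.12 || echo "no ARP reply for 172.23.188.12"
```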
Diagnostic Steps
If access to ExternalIPs fails after the upgrade, create a new project and deploy a simple example application from the RHOCP catalog, such as django+postgresql, which creates two services that can be tested for HTTP and TCP access. Once the template is deployed, patch the services to add an ExternalIP and test connectivity:
$ oc get pods,services
NAME                              READY   STATUS    RESTARTS   AGE
pod/django-psql-example-1-tm9wc   1/1     Running   0          3m55s
pod/postgresql-1-6ptd9            1/1     Running   0          6m44s

NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP     PORT(S)    AGE
service/django-psql-example   ClusterIP   172.46.166.200   172.23.188.12   8080/TCP   6m59s
service/postgresql            ClusterIP   172.46.176.75    172.23.188.18   5432/TCP   6m58s
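The ExternalIPs shown in the EXTERNAL-IP column above can be added with `oc patch`. The following is a minimal sketch, using the service names and addresses from this example. The helper `external_ip_patch` is hypothetical and only builds the JSON patch body that sets `spec.externalIPs`.

```shell
#!/bin/sh
# Hypothetical helper: prints a JSON merge patch that sets spec.externalIPs
# to the addresses passed as arguments.
external_ip_patch() {
  list=""
  for ip in "$@"; do
    # Append each address as a quoted JSON string, comma-separated.
    list="${list:+$list,}\"$ip\""
  done
  printf '{"spec":{"externalIPs":[%s]}}\n' "$list"
}

# Intended use against the test project (not executed here):
#   oc patch service django-psql-example --type=merge -p "$(external_ip_patch 172.23.188.12)"
#   oc patch service postgresql --type=merge -p "$(external_ip_patch 172.23.188.18)"
```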
$ curl -v -D - http://172.23.188.12:8080/
* Trying 172.23.188.12:8080...
* TCP_NODELAY set
* connect to 172.23.188.12 port 8080 failed: No route to host
* Failed to connect to 172.23.188.12 port 8080: No route to host
* Closing connection 0
curl: (7) Failed to connect to 172.23.188.12 port 8080: No route to host
$ ping 172.23.188.12
PING 172.23.188.12 (172.23.188.12) 56(84) bytes of data.
From 172.23.188.1 icmp_seq=1 Destination Host Unreachable
From 172.23.188.1 icmp_seq=2 Destination Host Unreachable
From 172.23.188.1 icmp_seq=3 Destination Host Unreachable
$ psql -h 172.23.188.18 -p 5432 -U django -d default
psql: error: could not connect to server: No route to host
Is the server running on host "172.23.188.18" and accepting
TCP/IP connections on port 5432?
$ ping 172.23.188.18
PING 172.23.188.18 (172.23.188.18) 56(84) bytes of data.
From 172.23.188.1 icmp_seq=1 Destination Host Unreachable
From 172.23.188.1 icmp_seq=2 Destination Host Unreachable
From 172.23.188.1 icmp_seq=3 Destination Host Unreachable
However, the VIPs are still configured in the OVN Northbound database, as expected:
$ oc project openshift-ovn-kubernetes
$ oc exec -c northd <some-ovnkube-master-pod> -- ovn-nbctl --no-leader-only lb-list
tcp 172.23.188.12:8080 10.220.4.30:8080
tcp 172.23.188.18:5432 10.223.0.32:5432
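This check can also be scripted. The following is a minimal sketch, assuming that each line of `ovn-nbctl lb-list` output ends with the VIP as `<ip>:<port>` followed by the backend list, as in the listing above. The function name `vip_in_lb_list` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ovn-nbctl lb-list` output on stdin and succeeds
# (exit 0) if the second-to-last field of any line equals the "<ip>:<port>"
# given as $1. Assumes the last field is the backend list.
vip_in_lb_list() {
  awk -v vip="$1" '
    NF >= 2 && $(NF - 1) == vip { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Intended use (not executed here):
#   oc exec -c northd <some-ovnkube-master-pod> -- ovn-nbctl --no-leader-only lb-list \
#     | vip_in_lb_list 172.23.188.12:8080 && echo "VIP configured in OVN"
```

A VIP that is present here but unreachable from outside the cluster matches the behavior described under Root Cause.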