Rebuild all OpenShift-OVN-Kubernetes databases
Index
- Introduction
- Environment
- Rebuild OVN on RHOCP v4.6
- Rebuild OVN on RHOCP v4.7
- Rebuild OVN on RHOCP v4.8 - v4.13
- Rebuild OVN on OCP 4.14 and later
Introduction
In this article, we go through the procedures for completely rebuilding OVN-Kubernetes in case issues like the following happen:
- OVN Master split-brains
- Failure to spawn pods due to OVN issues
- A myriad of OVN issues, such as complete inconsistencies in the NorthBound and SouthBound databases.
WARNING: This procedure will cause cluster-wide network interruption and has some risk. Please perform it only if instructed to do so or in case of OVN-Kubernetes issues.
NOTE: The methods differ slightly between RHOCP v4.6 and later releases. One of the main reasons is that, starting with RHOCP v4.7, OVS no longer runs in ovs-node pods but as a systemd process. Be sure to follow the method for your release and do not omit any step.
RECOMMENDATION: Perform these procedures as a cluster-admin user authenticated via client certificates, like system:admin (not to be confused with the kubeadmin user, which authenticates via oauth). The procedure can cause OpenShift containers (in particular, pods in the openshift-apiserver, openshift-oauth-apiserver, openshift-ingress and openshift-authentication namespaces) to fail health checks because of the state of ovn-kubernetes during this process. A user that authenticates via oauth may therefore be unable to connect to the cluster reliably enough to complete the procedure. Conversely, users authenticated via client certificates, like system:admin, do not go through oauth and avoid this issue. For more information on the system:admin user, please see About the OpenShift 4 kubeconfig file for system:admin.
NOTE: See also the related KCS regarding WebHooks preventing pod scheduling, an issue frequently observed during OVN database rebuilds. It may become relevant if pods fail to reschedule while you work through the steps below.
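Because the method depends on the release, it can help to derive the applicable section from the cluster version before starting. The following is a minimal sketch, not part of the original procedure: the function name `ovn_rebuild_section` is hypothetical, and it only inspects the major and minor numbers of the version string passed to it.

```shell
# Hypothetical helper: map an OpenShift version string (e.g. "4.12.3")
# to the title of the section of this article that applies to it.
ovn_rebuild_section() {
  _ver="$1"
  _major="${_ver%%.*}"      # text before the first dot
  _rest="${_ver#*.}"        # text after the first dot
  _minor="${_rest%%.*}"     # minor number
  # Reject anything whose minor part is empty or not purely numeric.
  case "$_minor" in
    ''|*[!0-9]*) echo "unsupported"; return 1 ;;
  esac
  if [ "$_major" != "4" ] || [ "$_minor" -lt 6 ]; then
    echo "unsupported"
    return 1
  fi
  if [ "$_minor" -eq 6 ]; then
    echo "Rebuild OVN on RHOCP v4.6"
  elif [ "$_minor" -eq 7 ]; then
    echo "Rebuild OVN on RHOCP v4.7"
  elif [ "$_minor" -le 13 ]; then
    echo "Rebuild OVN on RHOCP v4.8 - v4.13"
  else
    echo "Rebuild OVN on OCP 4.14 and later"
  fi
}
```

One possible way to use it, assuming the cluster API is still reachable, is `ovn_rebuild_section "$(oc get clusterversion version -o jsonpath='{.status.desired.version}')"`.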
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4
- OVN-Kubernetes
Rebuild OVN on RHOCP v4.6
- Confirm that a new revision of the kube-apiserver is not in the process of being rolled out:
    $ oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'
- Remove the northbound and southbound databases and delete the ovnkube-master pods one at a time:
    $ for OVNKUBEMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting databases for pod $OVNKUBEMASTER" ; \
      oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNKUBEMASTER rm -f /etc/openvswitch/ovnnb_db.db /etc/openvswitch/ovnsb_db.db; \
      done
    $ for OVNKUBEMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting pod $OVNKUBEMASTER" ; \
      oc -n openshift-ovn-kubernetes delete pod --wait=false $OVNKUBEMASTER ; sleep 6; \
      done
- Validate the health by confirming that there is a northbound and a southbound leader. Recovery can take a few minutes. If there are multiple leaders for either NB or SB, that may be a split-brain, and the database removal step above must be performed again:
    $ for OVNMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "········································" ; \
      echo "· OVNKube Master: $OVNMASTER ·" ; \
      echo "········································" ; \
      echo 'North' `oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNMASTER ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | grep ^Role` ; \
      echo 'South' `oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNMASTER ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound | grep ^Role`; \
      echo "····················"; \
      done
  The output of the command above should look like this:
    ········································
    · OVNKube Master: ovnkube-master-xxx
    ········································
    North Role: leader
    South Role: leader
    ····················
    ········································
    · OVNKube Master: ovnkube-master-yyy
    ········································
    North Role: follower
    South Role: follower
    ····················
    ········································
    · OVNKube Master: ovnkube-master-zzz
    ········································
    North Role: follower
    South Role: follower
    ····················
- Delete the ovs-node and ovnkube-node pods on masters and validate health:
    $ for OVSMASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting pod on node $OVSMASTER" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pods -l app=ovs-node --field-selector spec.nodeName=$OVSMASTER -o name) ; sleep 3; \
      done
    $ for OVNNODEMASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting pod on node $OVNNODEMASTER" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector spec.nodeName=$OVNNODEMASTER -o name) ; sleep 3; \
      done
    $ oc -n openshift-ovn-kubernetes get pods -o wide
- Delete the ovs-node and ovnkube-node pods on non-master nodes and validate health:
    $ for OVSWORKER in $(oc get nodes -l '!node-role.kubernetes.io/master' -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting pod on node $OVSWORKER" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pods -l app=ovs-node --field-selector spec.nodeName=$OVSWORKER -o name) ; sleep 3; \
      done
    $ for OVNKUBENODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting pod on node $OVNKUBENODE" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector spec.nodeName=$OVNKUBENODE -o name) ; sleep 3; \
      done
    $ oc -n openshift-ovn-kubernetes get pods -o wide
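The procedure starts by printing the reason and message of the NodeInstallerProgressing condition. As a rough sketch, that output can be turned into a go/no-go check. The function name `kube_apiserver_settled` and the `EXPECTED_REASON` variable are hypothetical, and the default reason string `AllNodesAtLatestRevision` is an assumption about what a settled cluster reports; confirm it against your own cluster before relying on it.

```shell
# Hypothetical helper: read the two-line output (reason, then message) of the
# "oc get kubeapiserver" command on stdin and succeed only when the reason
# matches the value expected for a finished rollout.
# ASSUMPTION: a settled cluster reports "AllNodesAtLatestRevision".
kube_apiserver_settled() {
  _reason="$(head -n 1)"    # first line of stdin is the condition reason
  [ "$_reason" = "${EXPECTED_REASON:-AllNodesAtLatestRevision}" ]
}
```

The output of the `oc get kubeapiserver` command used in this article can be piped into `kube_apiserver_settled`, for example followed by `&& echo "no rollout in progress"`.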
Rebuild OVN on RHOCP v4.7
- Confirm that a new revision of the kube-apiserver is not in the process of being rolled out:
    $ oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'
- (Optional) If using oc debug, pre-pulling the debug image will make this whole process faster:
    $ for NODE in $(oc get nodes -o name --no-headers); \
      do echo "Debug pod on node $NODE" ; \
      oc debug $NODE -- chroot /host /bin/bash -c 'echo hello' ; sleep 2; \
      done
- Remove the northbound and southbound databases and delete the ovnkube-master pods one at a time:
    $ for OVNKUBEMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting databases for pod $OVNKUBEMASTER" ; \
      oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNKUBEMASTER rm -f /etc/openvswitch/ovnnb_db.db /etc/openvswitch/ovnsb_db.db; \
      done
    $ for OVNKUBEMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting pod $OVNKUBEMASTER" ; \
      oc -n openshift-ovn-kubernetes delete pod --wait=false $OVNKUBEMASTER ; sleep 6; \
      done
- Validate the health by confirming that there is a northbound and a southbound leader. Recovery can take a few minutes. If there are multiple leaders for either NB or SB, that may be a split-brain, and the database removal step above must be performed again:
    $ for OVNMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "········································" ; \
      echo "· OVNKube Master: $OVNMASTER ·" ; \
      echo "········································" ; \
      echo 'North' `oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNMASTER ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | grep ^Role` ; \
      echo 'South' `oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNMASTER ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound | grep ^Role`; \
      echo "····················"; \
      done
  The output of the command above should look like this:
    ········································
    · OVNKube Master: ovnkube-master-xxx
    ········································
    North Role: leader
    South Role: leader
    ····················
    ········································
    · OVNKube Master: ovnkube-master-yyy
    ········································
    North Role: follower
    South Role: follower
    ····················
    ········································
    · OVNKube Master: ovnkube-master-zzz
    ········································
    North Role: follower
    South Role: follower
    ····················
- Restart the OVS services on masters. This can be done via oc debug node/${NODE} or via SSH:
  - If done via oc debug node/${NODE}, run this:
      $ for MASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o name --no-headers); \
        do echo "Restarting OVS services on $MASTER" ; \
        oc debug $MASTER -- chroot /host /bin/bash -c 'systemctl restart ovs-vswitchd ovsdb-server' ; sleep 3; \
        done
  - If done via SSH, run this instead:
      $ for MASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
        do echo "Restarting OVS services on $MASTER" ; \
        ssh core@$MASTER 'sudo systemctl restart ovs-vswitchd ovsdb-server' ; sleep 3; \
        done
- Delete ovnkube-node on masters and validate health:
    $ for OVNNODEMASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting OVNKube-Node on master $OVNNODEMASTER" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector spec.nodeName=$OVNNODEMASTER -o name) ; sleep 4; \
      done
    $ oc -n openshift-ovn-kubernetes get pods -o wide
- Restart the OVS services on non-master nodes. This can be done via oc debug node/${NODE} or via SSH:
  - If done via oc debug node/${NODE}, run this:
      $ for OVNKUBENODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o name --no-headers); \
        do echo "Restarting OVS services on node $OVNKUBENODE" ; \
        oc debug $OVNKUBENODE -- chroot /host /bin/bash -c 'systemctl restart ovs-vswitchd ovsdb-server' ; sleep 2; \
        done
  - If done via SSH, run this instead:
      $ for OVNKUBENODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o custom-columns=NAME:.metadata.name --no-headers); \
        do echo "Restarting OVS services on node $OVNKUBENODE" ; \
        ssh core@$OVNKUBENODE 'sudo systemctl restart ovs-vswitchd ovsdb-server' ; sleep 2; \
        done
- Delete ovnkube-node on non-master nodes and validate health:
    $ for OVNKUBENODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting OVNKube-Node on node $OVNKUBENODE" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector spec.nodeName=$OVNKUBENODE -o name) ; sleep 4; \
      done
    $ oc -n openshift-ovn-kubernetes get pods -o wide
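The "validate health" steps above rely on reading the pod listing by eye. A minimal sketch of a filter that lists only the pods that are not fully ready might look like the following. The function name `pods_not_ready` is hypothetical, and the sketch assumes the usual column order of `oc get pods --no-headers` (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read "oc get pods --no-headers" output on stdin and
# print the name of every pod that is not Running or whose READY column
# (ready/total) shows fewer ready containers than total containers.
pods_not_ready() {
  awk '{
    split($2, r, "/")                     # READY column, e.g. "2/3"
    if ($3 != "Running" || r[1] != r[2])  # STATUS column and ready count
      print $1                            # NAME column
  }'
}
```

For example, `oc -n openshift-ovn-kubernetes get pods --no-headers | pods_not_ready` prints nothing once every pod in the namespace is Running with all of its containers ready.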
Rebuild OVN on RHOCP v4.8 - v4.13
- Confirm that a new revision of the kube-apiserver is not in the process of being rolled out:
    $ oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'
- (Optional) If using oc debug, pre-pulling the debug image will make this whole process faster:
    $ for NODE in $(oc get nodes -o name --no-headers); \
      do echo "Debug pod on node $NODE" ; \
      oc debug $NODE -- chroot /host /bin/bash -c 'echo hello' ; sleep 2; \
      done
- The databases are mounted in the northd container under /etc/openvswitch and in the nbdb and sbdb containers under /etc/ovn; both are hostPath mounts of /var/lib/ovn/etc on the masters. In these versions, delete the databases directly on the masters, via SSH or oc debug, to ensure that they are recreated the same way on all masters.
- Remove the northbound and southbound databases. This can be done either by using oc debug node/${NODE} or by using SSH:
  - If done using oc debug node/${NODE}, run these commands:
      $ for MASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o name --no-headers); \
        do echo "Deleting databases on master $MASTER" ; \
        oc debug $MASTER -- chroot /host /bin/bash -c 'rm -f /var/lib/ovn/etc/*.db' ; sleep 3; \
        done
  - If done via SSH, run these commands instead:
      $ for MASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
        do echo "Deleting databases on master $MASTER" ; \
        ssh core@$MASTER 'sudo rm -f /var/lib/ovn/etc/*.db' ; sleep 3; \
        done
- Then delete the pods to force the restart:
    $ oc -n openshift-ovn-kubernetes delete pod -l=app=ovnkube-master
- Validate the health by confirming that there is a northbound and a southbound leader. Recovery can take a few minutes. If there are multiple leaders for either NB or SB, that may be a split-brain, and the database removal and pod deletion steps above must be performed again:
    $ for OVNMASTER in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "········································" ; \
      echo "· OVNKube Master: $OVNMASTER ·" ; \
      echo "········································" ; \
      echo 'North' `oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNMASTER ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | grep Role` ; \
      echo 'South' `oc -n openshift-ovn-kubernetes rsh -Tc northd $OVNMASTER ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound | grep Role`; \
      echo "····················"; \
      done
  The output of the command above should look like this:
    ········································
    · OVNKube Master: ovnkube-master-xxx
    ········································
    North Role: leader
    South Role: leader
    ····················
    ········································
    · OVNKube Master: ovnkube-master-yyy
    ········································
    North Role: follower
    South Role: follower
    ····················
    ········································
    · OVNKube Master: ovnkube-master-zzz
    ········································
    North Role: follower
    South Role: follower
    ····················
- Optional: use network-tools to see the leaders. See https://github.com/openshift/network-tools/blob/master/docs/user.md#examples
    $ oc adm must-gather --image=quay.io/openshift/origin-network-tools:latest -- network-tools ovn-get leaders
    $ network-tools ovn-get leaders
- Restart the OVS services on masters. This can be done either via oc debug node/${NODE} or via SSH:
  - If done via oc debug node/${NODE}, run this step:
      $ for MASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o name --no-headers); \
        do echo "Restarting OVS services on node $MASTER" ; \
        oc debug $MASTER -- chroot /host /bin/bash -c 'systemctl restart ovs-vswitchd ovsdb-server' ; sleep 2; \
        done
  - If done via SSH, run this step instead:
      $ for MASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
        do echo "Restarting OVS services on node $MASTER" ; \
        ssh core@$MASTER 'sudo systemctl restart ovs-vswitchd ovsdb-server' ; sleep 2; \
        done
- Delete ovnkube-node on masters and validate health:
    $ for OVNNODEMASTER in $(oc get nodes -l node-role.kubernetes.io/master= -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting OVN-Kube Node on master $OVNNODEMASTER" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector spec.nodeName=$OVNNODEMASTER -o name) ; sleep 4; \
      done
    $ oc -n openshift-ovn-kubernetes get pods -o wide
- Restart the OVS services on non-master nodes. This can be done either via oc debug node/${NODE} or via SSH:
  - If done via oc debug node/${NODE}, run this step:
      $ for NODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o name --no-headers); \
        do echo "Restarting OVS services on node $NODE" ; \
        oc debug $NODE -- chroot /host /bin/bash -c 'systemctl restart ovs-vswitchd ovsdb-server' ; sleep 2; \
        done
  - If done via SSH, run this step instead:
      $ for NODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o custom-columns=NAME:.metadata.name --no-headers); \
        do echo "Restarting OVS services on node $NODE" ; \
        ssh core@$NODE 'sudo systemctl restart ovs-vswitchd ovsdb-server' ; sleep 2; \
        done
- Delete ovnkube-node on non-master nodes and validate health:
    $ for OVNKUBENODE in $(oc get nodes -l '!node-role.kubernetes.io/master' -o custom-columns=NAME:.metadata.name --no-headers); \
      do echo "Deleting OVN-Kube Node on node $OVNKUBENODE" ; \
      oc -n openshift-ovn-kubernetes delete --wait=false $(oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector spec.nodeName=$OVNKUBENODE -o name) ; sleep 3; \
      done
    $ oc -n openshift-ovn-kubernetes get pods -o wide
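The leader validation used in this procedure prints one "North Role:" and one "South Role:" line per ovnkube-master pod. As a sketch, the split-brain check can be automated by counting the leader lines in that output. The function name `check_ovn_leaders` is hypothetical, and the sketch assumes the output format shown in this article.

```shell
# Hypothetical helper: read the output of the leader validation loop on stdin,
# report how many northbound and southbound leaders were found, and succeed
# only when there is exactly one of each.
check_ovn_leaders() {
  _out="$(cat)"
  # grep -c prints the number of matching lines (0 when there are none).
  _nb="$(printf '%s\n' "$_out" | grep -c '^North Role: leader')"
  _sb="$(printf '%s\n' "$_out" | grep -c '^South Role: leader')"
  echo "northbound leaders: $_nb, southbound leaders: $_sb"
  [ "$_nb" -eq 1 ] && [ "$_sb" -eq 1 ]
}
```

Piping the validation loop into `check_ovn_leaders` returns a non-zero status when a database has no leader yet or has more than one, which is the case where the database removal needs to be repeated.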
Rebuild OVN on OCP 4.14 and later
The procedure below explains how to rebuild the OVN databases on a single node. If you need to rebuild the OVN deployment on more than one node, repeat the procedure for each node.
Steps are:
- Confirm that a new revision of the kube-apiserver is not in the process of being rolled out:
    $ oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'
- The databases are stored in the /var/lib/ovn-ic/etc path on the host, which is intentionally different from the one used in previous versions. This path is mounted as a hostPath mount inside the ovnkube-controller, nbdb, ovn-northd, sbdb and ovn-controller containers.
- Remove the northbound and southbound databases from the node. This can be done either by using oc debug node/${NODE} or by using SSH:
  - If done using oc debug node/${NODE}, run this command:
      $ oc debug node/${NODE} -- chroot /host /bin/bash -c 'rm -f /var/lib/ovn-ic/etc/ovn*.db'
  - If done via SSH, run this command instead:
      $ ssh core@${NODE} 'sudo rm -f /var/lib/ovn-ic/etc/ovn*.db'
- Restart the OVS services on the node. This can be done either via oc debug node/${NODE} or via SSH:
  - If done using oc debug node/${NODE}, run this command:
      $ oc debug node/${NODE} -- chroot /host /bin/bash -c 'systemctl restart ovs-vswitchd ovsdb-server'
  - If done via SSH, run this command instead:
      $ ssh core@${NODE} 'sudo systemctl restart ovs-vswitchd ovsdb-server'
  - Optional: In telco or high-performance clusters, if a valid performanceProfile CR exists, cpuset-configure.service must also be restarted; otherwise the OVS dynamic CPU pinning feature stops working after ovs-vswitchd is restarted:
      $ ssh core@${NODE} 'sudo systemctl restart cpuset-configure.service'
- Delete the ovnkube-node pod on the node (the pod that runs the ovnkube-controller container) to force the restart:
    $ oc -n openshift-ovn-kubernetes delete pod -l app=ovnkube-node --field-selector=spec.nodeName=${NODE}
- Watch the ovnkube-node pod on the node to check its health:
    $ oc -n openshift-ovn-kubernetes get pod -l app=ovnkube-node --field-selector=spec.nodeName=${NODE} -w
  Wait until all the containers are ready, which can take several minutes.
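Since this procedure is repeated per node, it may be useful to print the commands for a list of nodes and review them before running anything. The sketch below only echoes the `oc debug` variants of the commands shown in this section; it does not execute them. The function name `print_rebuild_commands` and the node names in the usage example are hypothetical.

```shell
# Hypothetical helper: print (do not run) the per-node commands from this
# section for every node name passed as an argument, so the plan can be
# reviewed first. The final watch step is left to be done by hand.
print_rebuild_commands() {
  for _node in "$@"; do
    echo "# node: $_node"
    echo "oc debug node/$_node -- chroot /host /bin/bash -c 'rm -f /var/lib/ovn-ic/etc/ovn*.db'"
    echo "oc debug node/$_node -- chroot /host /bin/bash -c 'systemctl restart ovs-vswitchd ovsdb-server'"
    echo "oc -n openshift-ovn-kubernetes delete pod -l app=ovnkube-node --field-selector=spec.nodeName=$_node"
  done
}
```

For example, `print_rebuild_commands worker-0 worker-1` prints four lines per node: a comment naming the node, then the database removal, OVS restart and pod deletion commands. Wait for the ovnkube-node pod on one node to become ready before moving on to the next.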