Cluster Updates Without Error but Machine Config Pools Degraded with `Marking Degraded due to: unexpected on-disk state` on OCP 4.6 and newer
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.6+
Issue
- After performing an update to a newer version of OpenShift Container Platform, not all nodes are upgraded. For example:
$ oc get node
NAME                       STATUS                     ROLES    AGE   VERSION
master-0.ocp.example.net   Ready                      master   34d   v1.17.1+9d33dd3
master-1.ocp.example.net   Ready                      master   34d   v1.17.1+9d33dd3
master-2.ocp.example.net   Ready                      master   34d   v1.17.1+9d33dd3
worker-0.ocp.example.net   Ready                      worker   34d   v1.17.1+9d33dd3
worker-1.ocp.example.net   Ready                      worker   34d   v1.17.1+9d33dd3
worker-2.ocp.example.net   Ready,SchedulingDisabled   worker   34d   v1.17.1+912792b   <----------
- After performing an update to a newer version of OpenShift Container Platform, the MachineConfigOperator is reporting degraded pools:
$ oc describe co/machine-config
...
'Failed to resync $VERSION because: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool $POOL is not ready, retrying. Status: (pool degraded: true total: x, ready y, updated: y, unavailable: 1)]'
- A machine config pool is degraded, and in the MachineConfigOperator clusteroperator extensions, we see an error similar to:
worker: 'pool is degraded because nodes fail with "1 nodes are reporting degraded status on sync": "Node worker0 is reporting: \"unexpected on-disk state validating against rendered-worker-abc: expected target osImageURL \\\"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:xxx\\\", have \\\"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:yyy\\\" (\\\"zzz\\\")\""'
- The machine-config-daemon pod logs show:
Marking Degraded due to: unexpected on-disk state validating against rendered-master-7eee653a0a756d9bb2eb74f2ea00b91e: content mismatch for file "/usr/local/bin/configure-ovs.sh"
Resolution
Important: This solution is for OCP 4.6 and newer releases, and has now been updated for 4.12+. Previous revisions of this solution referenced using the deprecated pivot command. For OCP 4.5 and older, please check KCS 4466631 instead.
Before applying the workaround, please collect the logs from the affected node to assist in finding the root cause:
$ oc debug node/[node_name]
[...]
sh-4.4# chroot /host bash
[core@node_name ~]# journalctl -b -1 -u ostree-finalize-staged.service
[core@node_name ~]# journalctl -b -1 -u rpm-ostreed.service
Collecting a must-gather would also be helpful. In newer versions of the must-gather tool, the above services are automatically collected.
You can also collect a sosreport from the failing node/nodes, which would contain the above service logs.
These logs should tell you why the attempted OS upgrade did not succeed. Since the errors vary and change from version to version, please check the Root Cause section below for details.
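The per-node collection above can also be scripted non-interactively; a sketch, assuming cluster access via the oc CLI and using an example node name as a placeholder:

```shell
# Collect both journals from the affected node without an interactive shell.
# NODE is an example placeholder; substitute the real node name.
NODE="worker-2.ocp.example.net"
if command -v oc >/dev/null 2>&1; then
  for UNIT in ostree-finalize-staged.service rpm-ostreed.service; do
    oc debug "node/$NODE" -- chroot /host journalctl -b -1 -u "$UNIT" > "${NODE}_${UNIT}.log"
  done
else
  echo "oc CLI not found; run this from a workstation with access to the cluster"
fi
```

The resulting log files can then be attached to a support case alongside the must-gather.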
Workaround 1
Note: for ARO or OCP clusters installed on Azure, please refer to KCS 6522771 before applying the workaround.
If we determine the issue to be transient, we can retry the OS image update by performing the following steps on the node:
- Access the failing node:
$ oc debug node/[node_name]
sh-4.4# chroot /host
- Delete the on-disk currentConfig file:
sh-4.4# rm /etc/machine-config-daemon/currentconfig
- Tell the MCD to forcefully retry the update, ignoring the current validation error:
sh-4.4# touch /run/machine-config-daemon-force
The MCD should now retry the update, and the node should reboot. If it does not, check the machine-config-daemon logs to see what went wrong. Before touching the force file, you can also follow the MCD logs in another console, which may aid debugging if something goes wrong again:
$ oc logs -n openshift-machine-config-operator -c machine-config-daemon machine-config-daemon-xxxxx -f
Workaround 2
If Workaround 1 did not work, it is possible to oc rsh into the machine-config-daemon Pod on the problematic node and run the following commands to make the liveness probe succeed, so that the Pod does not get killed and the forced update can complete. Then run Workaround 1 again.
cat <<EOF > http.sh
printf "HTTP/1.1 200 OK\r\n"
printf "Content-Length: 0\r\n"
printf "\r\n"
EOF
chmod +x http.sh
socat TCP-LISTEN:8798,fork,reuseaddr EXEC:'./http.sh'
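Before pointing socat at the stub, you can sanity-check that it emits a minimal, valid HTTP response. A self-contained sketch (written to /tmp so it does not disturb the http.sh created above):

```shell
# Recreate the stub responder in a scratch location and confirm its output
# starts with a valid HTTP/1.1 status line (CRLF line endings included).
cat <<'EOF' > /tmp/http.sh
printf "HTTP/1.1 200 OK\r\n"
printf "Content-Length: 0\r\n"
printf "\r\n"
EOF
bash /tmp/http.sh | head -n 1 | tr -d '\r'
# prints: HTTP/1.1 200 OK
```

With the socat listener running, the kubelet's HTTP probe against the daemon's port receives this empty 200 response and keeps the Pod alive.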
Workaround 3
It is also possible to recreate the files manually; see this KCS for details.
Root Cause
The Machine Config Operator component in charge of managing each individual node is the Machine Config Daemon (MCD), which runs as a daemonset in the openshift-machine-config-operator namespace.
If the system state differs in any way from what the MCD expects, it sets the MachineConfigPool to Degraded and also reflects that in the machineconfiguration.openshift.io/state node annotation. This error then bubbles up to the Machine-Config-Operator ClusterOperator status. The MCD then stops taking any action until the current issue is fixed, in order to prevent any further breakage. The reason for the degradation is usually explained in the machine-config-daemon container logs.
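The per-node state annotation mentioned above can be read directly; a sketch, assuming cluster access and an example node name:

```shell
# Print the MCD state annotation for one node. Degraded nodes report
# "Degraded" here; healthy ones report "Done". NODE is an example placeholder.
NODE="worker-2.ocp.example.net"
if command -v oc >/dev/null 2>&1; then
  oc get node "$NODE" \
    -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/state}'
else
  echo "oc CLI not found; run this from a workstation with access to the cluster"
fi
```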
One of the configuration differences that can cause this validation failure is the node running an incorrect ostree image. The machine-config-daemon detects this and reports it in the way described in the Diagnostic Steps. This is unusual, and generally only happens when a bug or an abnormal situation is hit, or when manual changes are made on the nodes.
The most common case is that a cluster encounters this during an upgrade. When an individual Machine Config Daemon attempts to upgrade the OS, it does not do so directly; instead, it stages the incoming OS update via rpm-ostree. After the Machine Config Daemon initiates a reboot, rpm-ostree attempts to perform the actual OS update. The update is transactional, meaning that if it fails, the node simply stays on the old OS version. At this point, however, the Machine Config Daemon has already shut down, so it will not know of the failure until the node has rebooted and the Machine Config Daemon runs again, attempts to validate the image, and fails. The Machine Config Daemon does not actually know why the failure occurred, which is why the rpm-ostreed and ostree-finalize-staged logs are needed.
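The booted/staged deployment split described above can be inspected directly on the node (via oc debug node/<name> followed by chroot /host). A sketch; the command is only meaningful on an ostree-based host such as RHCOS:

```shell
# rpm-ostree keeps the booted and staged deployments side by side; after a
# failed finalize, the node is still booted into the previous deployment.
if command -v rpm-ostree >/dev/null 2>&1; then
  rpm-ostree status     # compare the booted deployment against any staged one
else
  echo "rpm-ostree not available here; run this on the RHCOS node itself"
fi
```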
The Resolution section of this solution explains how to force the node to upgrade again, which can help unblock the upgrade if the original underlying issue was transient. However, if the issue persists, the root cause needs to be found via the journal logs, and remediation attempted based on it.
Diagnostic Steps
- Check to see if any MachineConfigPools are degraded:
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-c58240ee462345a9375360cd7a78443d True False False 3 3 3 0 34d
worker rendered-worker-49131c85db4e2ee452b4eaafcc566ca9 False True True 3 2 2 1 34d
- If there are, the logs for the machine-config-daemon pods need to be checked for instances of incorrect osImageURL states:
$ oc project openshift-machine-config-operator
$ MCD_PODS=$(oc get pod -o=jsonpath='{.items[*].metadata.name}' -l k8s-app=machine-config-daemon)
$ for POD in $MCD_PODS;do echo ----;echo Checking for osImageURL mismatch; oc get pod $POD -o wide; oc logs $POD -c machine-config-daemon | grep "expected target osImageURL"; done;
Output should look like the following:
----
Checking for osImageURL mismatch
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
machine-config-daemon-5p2zv 2/2 Running 1 5d20h 10.74.178.208 worker-0.ocp.example.net <none> <none>
----
Checking for osImageURL mismatch
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
machine-config-daemon-h7wdf 2/2 Running 1 5d20h 10.74.178.190 worker-2.ocp.example.net <none> <none>
E0706 11:25:56.146529 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
E0706 11:25:58.203841 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
E0706 11:26:06.244588 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
E0706 11:26:22.294623 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
E0706 11:26:54.331769 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
E0706 11:27:54.374431 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
E0706 11:28:54.416390 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
----
Checking for osImageURL mismatch
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
machine-config-daemon-p7b6s 2/2 Running 1 5d20h 10.74.178.201 master-1.ocp.example.net <none> <none>
----
Checking for osImageURL mismatch
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
machine-config-daemon-qrmvx 2/2 Running 1 5d20h 10.74.178.192 master-0.ocp.example.net <none> <none>
----
Checking for osImageURL mismatch
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
machine-config-daemon-rhjn5 2/2 Running 1 5d20h 10.74.178.214 worker-1.ocp.example.net <none> <none>
----
Checking for osImageURL mismatch
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
machine-config-daemon-spvkf 2/2 Running 1 5d20h 10.74.178.145 master-2.ocp.example.net <none> <none>
In the example, the daemon complains that the osImageURL run by the system should be quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8, but it is not.
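The expected digest can also be pulled out of those log lines mechanically; a sketch, using one of the sample log lines above embedded as a literal for illustration:

```shell
# Extract the expected target osImageURL digest from an MCD log line of the
# form shown above (sample line embedded; in practice, pipe `oc logs` output).
LINE='E0706 11:25:56.146529 2408 daemon.go:1186] expected target osImageURL quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8'
echo "$LINE" | sed -n 's/.*osImageURL [^@]*@sha256:\([0-9a-f]*\).*/\1/p'
# prints: 328a1e57fe5281f4faa300167cdf63cfca1f28a9582aea8d6804e45f4c0522a8
```

The extracted digest can then be compared against the machineconfiguration.openshift.io/desiredConfig rendered MachineConfig's osImageURL to confirm the mismatch.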
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.