Degraded machine-config Cluster Operator due to MachineConfigPool being paused in OpenShift 4
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4
- Machine Config Operator (MCO)
  - MachineConfigPool
Issue
- MachineConfigPools are paused, preventing the Machine Config Operator from pushing out updates in OpenShift 4.
- The machine-config ClusterOperator has messages like: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch
Resolution
Check the value of the paused field in each MachineConfigPool as shown in the Diagnostic Steps, and set it to false for every pool that is paused:
$ oc patch mcp [mcp_name] --type=merge -p '{"spec": {"paused": false}}'
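When more than one pool is paused, it can help to generate one patch command per pool and review them before running anything, since unpausing a pool lets the Machine Config Operator roll out pending changes to its nodes. The following is a minimal sketch, not part of the original solution: the function name unpause_cmds is hypothetical, and it only prints the commands rather than executing them.

```shell
# Hypothetical helper: reads MachineConfigPool names on stdin (one per line)
# and prints the corresponding unpause command for each, without running it.
unpause_cmds() {
  while read -r pool; do
    # Skip blank lines so no malformed command is printed.
    if [ -n "$pool" ]; then
      printf "oc patch mcp %s --type=merge -p '%s'\n" "$pool" '{"spec": {"paused": false}}'
    fi
  done
}

# Example: print the commands for the master and worker pools.
# printf 'master\nworker\n' | unpause_cmds
```

Printing instead of executing keeps the decision to unpause each pool with the administrator.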
There is an RFE (RFE-1993) to alert before starting an upgrade if the master MCP is paused.
Root Cause
The newly rendered configs cannot be applied to the MachineConfigPools because the pools are set to paused: true. This causes the controller version check to fail, because the machine config pools are still using the old rendered configs, which will never be generated by the newly installed Machine Config Operator.
This value is generally changed by an administrator, possibly to prevent updates from occurring without their knowledge. The cluster would not set this value to true on its own. Refer to Disable autoreboot after a change with the machine-config-operator in OCP 4 for additional information.
Diagnostic Steps
Check the upgrade status:
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.3.9 True True 13h Working towards 4.3.26: 84% complete
Check if the machine-config ClusterOperator is available or degraded, and check the status:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
[...]
machine-config 4.3.9 False True True 5h
[...]
$ oc get co machine-config -o yaml
[...]
message: 'Unable to apply 4.3.26: timed out waiting for the condition during syncRequiredMachineConfigPools:
pool master has not progressed to latest configuration: controller version mismatch
[...]
lastSyncError: 'pool master has not progressed to latest configuration: controller
version mismatch
[...]
Check whether the desired and current rendered MachineConfig are the same for each MachineConfigPool, or whether any pool has a desired configuration that differs from its current one:
$ oc get mcp -o custom-columns=NAME:metadata.name,DESIRED:spec.configuration.name,CONFIG-STATUS:status.configuration.name
[...]
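On clusters with many pools, the comparison can be done with a small filter instead of by eye. The following is a minimal sketch that assumes the three-column NAME / DESIRED / CONFIG-STATUS output of the command above; the function name mcp_mismatch is hypothetical.

```shell
# Hypothetical helper: reads the NAME / DESIRED / CONFIG-STATUS table on stdin
# and prints the name of every pool whose desired rendered config differs
# from the config reported in its status. The header row is skipped.
mcp_mismatch() {
  awk 'NR > 1 && $2 != $3 { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
# oc get mcp -o custom-columns=NAME:metadata.name,DESIRED:spec.configuration.name,CONFIG-STATUS:status.configuration.name | mcp_mismatch
```

An empty result means every pool reports the same desired and current rendered config.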
Check the MachineConfigPool referenced in the machine-config ClusterOperator messages to verify whether it is set to paused:
$ oc get mcp [mcp_name] --template='{{.spec.paused}}'
$ oc get mcp [mcp_name] -o yaml
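To check every pool in one pass rather than one at a time, the paused field can be listed for all pools and filtered. The following is a minimal sketch; the function name paused_pools is hypothetical, and it assumes two-column NAME / PAUSED input without a header row.

```shell
# Hypothetical helper: reads "NAME PAUSED" rows on stdin and prints the name
# of every pool whose paused value is exactly "true". Pools reporting false,
# or <none> when the field is unset, are ignored.
paused_pools() {
  awk '$2 == "true" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
# oc get mcp --no-headers -o custom-columns=NAME:metadata.name,PAUSED:spec.paused | paused_pools
```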