machine-config-nodes-crd-cleanup pod in pending state during upgrade from 4.18 to 4.19 in RHOCP4

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • Upgrades from 4.18 to 4.19.12 or 4.19.13

Issue

  • RHOCP upgrade to 4.19 is stuck because the machine-config-nodes-crd-cleanup pod in the openshift-machine-config-operator namespace remains in Pending state:

    $ oc get clusterversion -o yaml 
    [...]
        - lastTransitionTime: "2025-09-29T09:30:03Z"
          message: 'Could not update customresourcedefinition "machineconfignodes.machineconfiguration.openshift.io"
            (785 of 924): the object is invalid, possibly due to local cluster configuration'
          reason: UpdatePayloadResourceInvalid
          status: "True"
    [...]
    

Resolution

The issue has been identified as a bug by the Red Hat engineering team. The fix can be tracked through the following bugs and errata:

Target Minor Release | Bug           | Fixed Version | Errata
4.20                 | OCPBUGS-62073 | 4.20.0        | RHSA-2025:9562
4.19                 | OCPBUGS-62114 | 4.19.14       | RHBA-2025:16693

In addition to the above, once OCPBUGS-62321 is fixed, control plane nodes will be automatically labeled with the required label when updating to an OpenShift 4.18 version that includes that fix.

Workaround

If the upgrade to one of the affected versions is already stuck, add the node-role.kubernetes.io/control-plane label to all control plane nodes as shown below so that the upgrade can proceed:

$ oc label node -l node-role.kubernetes.io/master node-role.kubernetes.io/control-plane=
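After applying the label, you can verify that it has landed on all control plane nodes; the following check is a sketch using the same label selector as the pod's nodeSelector:

```shell
# List the nodes that now carry the control-plane label; every control
# plane node should appear here after the workaround has been applied
oc get nodes -l node-role.kubernetes.io/control-plane
```

Once all control plane nodes carry the label, the machine-config-nodes-crd-cleanup pod should be scheduled and the upgrade should resume.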

Root Cause

This is observed on clusters that were originally installed prior to RHOCP 4.12, where the control plane nodes carry only the node-role.kubernetes.io/master label and not the node-role.kubernetes.io/control-plane label. As a result, the machine-config-nodes-crd-cleanup pod stays in Pending state because its nodeSelector requires the node-role.kubernetes.io/control-plane label. There is additional information about this in Inconsistency of node-role between newly created vs. long running OpenShift 4 clusters.

Diagnostic Steps

  • Verify the state of the pod machine-config-nodes-crd-cleanup in openshift-machine-config-operator namespace:

        $ oc get pods -n openshift-machine-config-operator -o wide | grep -i machine-config-nodes-crd-cleanup
    
        machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx                   0/1     Pending   0          3d    <none>         <none>                                       <none>           <none>
    
        $ oc get events -n openshift-machine-config-operator | grep -i machine-config-nodes-crd-cleanup
    
         3h          Warning   FailedScheduling            pod/machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx                   0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
        3h          Warning   FailedScheduling            pod/machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx                   0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
        3h          Warning   FailedScheduling            pod/machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx                   0/6 nodes are available: 2 node(s) had untolerated taint {node.kubernetes.io/not-ready: }, 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
        4m          Warning   FailedScheduling            pod/machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx                   0/6 nodes are available: 6 node(s) didn't match Pod's node affinity/selector. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
    
  • Check the nodeSelector for the pod in Pending state:

    $ oc get pod machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx -n openshift-machine-config-operator -o yaml
    [...]
          imagePullSecrets:
          - name: machine-config-operator-dockercfg-xxxxx
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""       ## <- This label is missing on the Master / Control Plane Nodes
          preemptionPolicy: PreemptLowerPriority
    [...]
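The same information can be extracted without scrolling through the full manifest; a jsonpath query such as the following (the pod name is illustrative) prints only the nodeSelector:

```shell
# Show only the nodeSelector of the pending cleanup pod
oc get pod machine-config-nodes-crd-cleanup-xxxxxxxx-xxxxx \
  -n openshift-machine-config-operator \
  -o jsonpath='{.spec.nodeSelector}{"\n"}'
```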
    
  • Check the labels on the Master / Control Plane Nodes:

    $ oc get nodes -l node-role.kubernetes.io/master -o yaml | grep -i control-plane
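If the grep above returns nothing, the control-plane label is missing. Comparing the node counts under both role selectors makes the mismatch explicit; on an affected cluster the second count is lower (typically zero):

```shell
# Nodes carrying the legacy master role label
oc get nodes -l node-role.kubernetes.io/master --no-headers | wc -l
# Nodes carrying the control-plane role label; a smaller count here
# means the workaround label still needs to be applied
oc get nodes -l node-role.kubernetes.io/control-plane --no-headers | wc -l
```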
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.