Troubleshooting OpenShift Container Platform 4.x: machine-config operator

1. Check the machine-config clusteroperator resource for errors that point to a problem with specific nodes or machineconfigpools.

$ oc describe clusteroperator machine-config
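The describe output can be long, so a condensed view of the operator conditions may help. The following is a minimal sketch, not an official procedure: the function names `list_co_conditions` and `co_problems` are hypothetical, and it assumes the standard ClusterOperator condition types (`Available`, `Degraded`) with tab-free messages.

```shell
# Print "<type><TAB><status><TAB><message>" for each condition of the
# machine-config clusteroperator. Requires a logged-in oc session.
list_co_conditions() {
  oc get clusteroperator machine-config \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}

# Keep only the conditions that signal trouble:
# Degraded=True or Available=False.
co_problems() {
  awk -F'\t' '
    ($1 == "Degraded"  && $2 == "True")  ||
    ($1 == "Available" && $2 == "False") { print $1 "=" $2 ": " $3 }'
}
```

Against a live cluster the two would be combined as `list_co_conditions | co_problems`; no output suggests the operator itself is not reporting a problem.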

2. Check the machine-config state of each node

$ for node in $(oc get nodes -o name | awk -F'/' '{ print $2 }');do echo "-------------------- $node ------------------"; oc describe node $node | grep machineconfiguration.openshift.io/state; done
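As an alternative to the loop above, the same annotation can be read for all nodes in one call and then filtered. This is a sketch under assumptions: the helper names `list_node_states` and `not_done_nodes` are hypothetical, and it assumes that a healthy node reports the state `Done`.

```shell
# Print "<node><TAB><state>" for every node, taken from the
# machineconfiguration.openshift.io/state annotation.
# Requires a logged-in oc session.
list_node_states() {
  oc get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.machineconfiguration\.openshift\.io/state}{"\n"}{end}'
}

# Keep only the nodes whose state is anything other than "Done"
# (assumption: "Done" is the healthy value).
not_done_nodes() {
  awk -F'\t' '$2 != "Done" { print $1 ": " $2 }'
}
```

Running `list_node_states | not_done_nodes` would then list only the nodes that deserve a closer look.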

3. Check machineconfigs

$ oc describe machineconfig

4. Check for any degraded machineconfigpools

$ oc get machineconfigpool
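To pick out the degraded pools without reading the table by eye, the `Degraded` condition of each pool can be queried directly. A minimal sketch follows; the function names `list_pool_degraded` and `degraded_pools` are hypothetical, and it assumes each pool exposes a condition of type `Degraded` in `.status.conditions`.

```shell
# Print "<pool><TAB><status of the Degraded condition>" for every pool.
# Requires a logged-in oc session.
list_pool_degraded() {
  oc get machineconfigpool \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Degraded")].status}{"\n"}{end}'
}

# Print only the names of pools whose Degraded condition is "True".
degraded_pools() {
  awk -F'\t' '$2 == "True" { print $1 }'
}
```

`list_pool_degraded | degraded_pools` would print one pool name per line, which can be passed to `oc describe machineconfigpool`.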

5. If there are degraded machineconfigpools, describe the failing pool:

$ oc describe machineconfigpool [the-failing-pool_name]

6. Collect the machine-config-daemon logs. Each node should run one machine-config-daemon pod, so expect one log file per node.

$ oc project openshift-machine-config-operator
$ for POD in $(oc get po -l k8s-app=machine-config-daemon -o name | awk -F '/' '{print $2 }'); do oc logs $POD -c machine-config-daemon > $POD.log; done
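The collection can also be done without switching the current project, by passing the namespace explicitly. This is a sketch only: the function name `collect_mcd_logs`, the `NS` variable, and the optional output-directory argument are illustrative choices, not part of the original procedure.

```shell
NS=openshift-machine-config-operator

# Write one <pod>.log file per machine-config-daemon pod into the
# directory given as $1 (defaults to the current directory).
collect_mcd_logs() {
  outdir=${1:-.}
  mkdir -p "$outdir"
  oc get pods -n "$NS" -l k8s-app=machine-config-daemon -o name |
  while IFS= read -r pod; do
    name=${pod#*/}   # strip the "pod/" prefix from "pod/<name>"
    # stdin is redirected so the command cannot consume the pod list
    oc logs -n "$NS" "$name" -c machine-config-daemon \
      > "$outdir/$name.log" < /dev/null
  done
}
```

For example, `collect_mcd_logs ./mcd-logs` would gather all daemon logs under `./mcd-logs`.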

7. Collect the machine-config-operator pod logs

$ oc get pods -n openshift-machine-config-operator | grep machine-config-operator
$ oc logs -n openshift-machine-config-operator [machine-config-operator_podname] > machine-config-operator-pod.log

8. Collect events from the openshift-machine-config-operator namespace

$ oc get events -n openshift-machine-config-operator
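When the namespace is busy, it may help to narrow the list to warnings and order it by time. The sketch below wraps that in a hypothetical function named `warning_events`; it assumes the events carry a populated `lastTimestamp` field, which may not hold for every event.

```shell
# List only Warning events in the operator namespace, oldest first.
# Requires a logged-in oc session.
warning_events() {
  oc get events -n openshift-machine-config-operator \
    --field-selector type=Warning \
    --sort-by=.lastTimestamp
}
```

Redirecting the output, for example `warning_events > mco-warning-events.txt`, keeps it alongside the collected logs.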