How to find the scheduler decisions in OpenShift 4?
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4
- Scheduler
Issue
- How to find the log messages generated by the default-scheduler component in order to understand the scheduling decisions made by the OpenShift platform?
Resolution
The kube-scheduler Cluster Operator deploys the pods that run the scheduling algorithm, which determines the node where each pod in the cluster will run. In common architectures, one scheduler pod runs on each control plane node. The scheduler pods usually provide log messages that help with scheduling troubleshooting. Follow the steps below to find this information.
Check the scheduler decisions
-
Increase the log verbosity of kube-scheduler to Debug (sometimes it may be necessary to set it to Trace or TraceAll) and wait for the new revision of the pods:
    $ REVISION=$(oc get kubescheduler -o jsonpath='{range .items[*]}{.status.latestAvailableRevision}')
    $ oc patch kubeschedulers.operator/cluster --type=json -p '[{"op": "replace", "path": "/spec/logLevel", "value": "Debug" }]'
    $ while [[ $(oc get kubescheduler \
        -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.message}{"\n"}') \
        != "3 nodes are at revision $(($REVISION + 1))" ]]; \
        do echo "Waiting new revision";sleep 20;done
-
After running the command below, the name of the leader pod is stored in the variable POD_SCHEDULER_LEADER:
    $ for POD_KUBE_SCHEDULER in $(oc get pod -n openshift-kube-scheduler -l app=openshift-kube-scheduler \
        -o custom-columns=NAME:.metadata.name --no-headers); do if [[ $(oc -n openshift-kube-scheduler logs \
        $POD_KUBE_SCHEDULER | grep -i 'successfully acquired lease openshift-kube-scheduler/kube-scheduler') == \
        *successfully* ]]; then echo ""; POD_SCHEDULER_LEADER=$POD_KUBE_SCHEDULER;fi;done
-
The scheduler decision messages are generated by the current leader scheduler pod. Use the command below to find these messages:
    $ oc logs $POD_SCHEDULER_LEADER -n openshift-kube-scheduler -c kube-scheduler
-
The following are examples of the messages found in the logs of the scheduler pod. These messages are useful for identifying the default-scheduler decisions in more detail:
    I0321 17:50:54.039177 1 scheduling_queue.go:957] "About to try and schedule pod" pod="<project-name>/<pod-name>"
    I0321 17:50:54.042278 1 default_binder.go:52] "Attempting to bind pod to node" pod="<project-name>/<pod-name>" node="<node-name>"
    I0321 17:50:54.051880 1 schedule_one.go:266] "Successfully bound pod to node" pod="<project-name>/<pod-name>"
-
After the verification, set the scheduler log verbosity back to Normal, which is the default:
    $ oc patch kubeschedulers.operator/cluster --type=json -p '[{"op": "replace", "path": "/spec/logLevel", "value": "Normal" }]'
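The wait loop in the first step compares the status message against a fixed count of three nodes and never gives up. The following is a minimal sketch of a variant that accepts any node count and stops after a timeout. The helper names rollout_complete and wait_for_revision, the 900-second default, and the assumption that an unfinished rollout lists several revisions in one message are illustrative and not part of the original procedure.

```shell
# Sketch only: rollout_complete and wait_for_revision are hypothetical helpers.

# Succeed when MESSAGE reports all nodes at TARGET revision,
# e.g. "3 nodes are at revision 8". The node count is not hard-coded.
# A message that still lists more than one revision does not match.
rollout_complete() {
  local message="$1" target="$2"
  printf '%s\n' "$message" | grep -Eq "^[0-9]+ nodes? are at revision ${target}\$"
}

# Poll the NodeInstallerProgressing condition every 20 seconds until the
# target revision is reached or TIMEOUT seconds (assumed default: 900) pass.
wait_for_revision() {
  local target="$1" timeout="${2:-900}" waited=0 message
  while [ "$waited" -lt "$timeout" ]; do
    message=$(oc get kubescheduler cluster \
      -o jsonpath='{.status.conditions[?(@.type=="NodeInstallerProgressing")].message}')
    if rollout_complete "$message" "$target"; then
      return 0
    fi
    echo "Waiting for revision ${target} (current status: ${message})"
    sleep 20
    waited=$((waited + 20))
  done
  echo "Timed out waiting for revision ${target}" >&2
  return 1
}
```

With REVISION captured as in the first step, the call would be wait_for_revision "$((REVISION + 1))".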
Note: If there are pods in Pending status with no information about scheduling issues and no related messages in the scheduler pod logs, refer to "OpenShift pod stays in Pending status".
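To check whether the leader scheduler logged anything for a particular pod, its log output can be filtered. The sketch below assumes the message layout shown in the example log lines; the helper names decisions_for_pod and summarize_bindings are hypothetical, and both helpers only read log text from standard input.

```shell
# Sketch only: both helpers are hypothetical and read log text from stdin.

# Keep only the log lines that mention one pod, given as <project-name>/<pod-name>.
# The closing quote is part of the pattern, so "web-1" does not match "web-10".
decisions_for_pod() {
  grep -F "pod=\"$1\""
}

# Print "<project-name>/<pod-name> -> <node-name>" for every bind attempt.
# Lines that do not follow the assumed layout are ignored.
summarize_bindings() {
  sed -n 's/.*"Attempting to bind pod to node" pod="\([^"]*\)" node="\([^"]*\)".*/\1 -> \2/p'
}
```

For example, piping the output of the oc logs command for $POD_SCHEDULER_LEADER into decisions_for_pod "<project-name>/<pod-name>" shows every decision recorded for that pod. No output suggests the scheduler logged nothing for it at the current verbosity.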
Root Cause
When nodes or workloads in the cluster are scaled or removed, new pods are usually scheduled to the nodes. A better understanding of the default-scheduler decisions is useful for more detailed troubleshooting of the scheduling process.
Diagnostic Steps
-
Check the scheduler configuration:
    $ oc get scheduler cluster -o yaml
    [...]
-
Check the log level configured for the kubescheduler:
    $ oc get kubeschedulers -o yaml | grep -i "logLevel"
    logLevel: Normal
    operatorLogLevel: Normal
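The log level can also be read directly from the field that the patch commands modify (/spec/logLevel), which is convenient in scripts. The following is a minimal sketch; the helper names current_log_level and log_level_is are hypothetical, and treating an empty field as Normal is an assumption based on Normal being the default.

```shell
# Sketch only: current_log_level and log_level_is are hypothetical helpers.

# Print the kube-scheduler log level from spec.logLevel.
# An empty value is assumed to mean the default, Normal.
current_log_level() {
  local level
  level=$(oc get kubescheduler cluster -o jsonpath='{.spec.logLevel}')
  printf '%s\n' "${level:-Normal}"
}

# Succeed only when the log level equals the expected value (default: Normal).
log_level_is() {
  local expected="${1:-Normal}"
  [ "$(current_log_level)" = "$expected" ]
}
```

For example, log_level_is Normal || echo "kube-scheduler is still logging at a raised verbosity" gives a quick reminder to revert the setting after troubleshooting.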
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.