OpenShift pod stays in Pending status


Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 3.11
    • 4
  • Kube-Scheduler

Issue

  • A specific pod stays in Pending status with no messages indicating what is wrong, either in the Events or in the scheduler logs.
  • The Deployment shows no failures, but the application pod does not start and there are no logs.
  • The worker node has other application pods running.

Resolution

Check whether the schedulerName is misconfigured or points to a non-existing scheduler, as explained in the "Diagnostic Steps" section. Fix it in the Deployment or DeploymentConfig of the application pod by removing the extra text appended to the schedulerName value.

Example of a wrong schedulerName configuration:

[...]
    spec:
      containers:
      - envFrom:
        [...]
        image: docker-registry.default.svc:5000/[image_name]@sha256:[sha_256]
        [...]
      restartPolicy: Always
      schedulerName: default-scheduler securityContext {}            <<<<<<<<<< Here
      securityContext: {}
      terminationGracePeriodSeconds: 30
      [...]
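For comparison, in the corrected configuration the `schedulerName` field contains only the name of an existing scheduler (image and elided fields as in the example above):

```yaml
[...]
    spec:
      containers:
      - envFrom:
        [...]
        image: docker-registry.default.svc:5000/[image_name]@sha256:[sha_256]
        [...]
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      [...]
```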

The change should trigger a new rollout of the application.

To avoid this issue in new deployments, whether caused by typos or by a scheduler being deleted, the use of valid scheduler names can be enforced by creating a ValidatingAdmissionPolicy.
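A minimal sketch of such a ValidatingAdmissionPolicy follows. The policy and binding names and the list of allowed schedulers are assumptions; adjust them to the schedulers actually deployed in the cluster. ValidatingAdmissionPolicy is GA (`admissionregistration.k8s.io/v1`) as of Kubernetes 1.30:

```yaml
# Sketch only: names and the allowed scheduler list below are
# placeholders, not values taken from a real cluster.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: restrict-scheduler-names
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  # CEL expression: the pod's schedulerName must be one of the
  # schedulers known to exist in the cluster.
  - expression: "object.spec.schedulerName in ['default-scheduler', 'my-secondary-scheduler']"
    message: "spec.schedulerName must reference a scheduler deployed in the cluster"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: restrict-scheduler-names-binding
spec:
  policyName: restrict-scheduler-names
  validationActions: ["Deny"]
```

With this in place, a Deployment whose pod template carries a mistyped schedulerName is rejected at admission time with an explicit error, instead of producing a silently Pending pod.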

Root Cause

A wrong schedulerName configuration in the Deployment or DeploymentConfig can cause the pod to not be scheduled on any node, without showing any scheduling error.

This issue is caused by the way the scheduler works:

  1. The schedulers in the cluster keep watches open against the kube-apiserver to be informed of any pod that is created, updated, or deleted. The watchers receive events like the following:

    {"type":"ADDED","object":{"kind":"Pod","apiVersion":"v1","metadata":{"name":"example-failing-fakescheduler","namespace":"test-fakescheduler","uid":"64944986-686c-4d10-abec-aed87fbe1acb","resourceVersion":"261774","creationTimestamp":"2025-09-15T14:18:38Z".........."schedulerName":"fakescheduler"

  2. When a scheduler receives one of these events through its watcher, it checks whether the `schedulerName` in the pod matches its own name. This check is done [in the kube-scheduler event handler](https://github.com/kubernetes/kubernetes/blob/b40d570248bcdf9ee390df51ecf60de8098314a0/pkg/scheduler/eventhandlers.go#L368-L371).

  3. Once the `schedulerName` has been compared, the scheduler [checks whether it is responsible for that pod and whether the pod is unassigned](https://github.com/kubernetes/kubernetes/blob/b40d570248bcdf9ee390df51ecf60de8098314a0/pkg/scheduler/eventhandlers.go#L440-L458).

This leads to a situation where, if there is a typo in the `schedulerName` (or a scheduler is removed from the cluster while pods that still reference it need to be re-scheduled later), no scheduler considers itself responsible for the pod, and the pod is left in `Pending` status leaving no trace at all.

If the missing scheduler is created afterwards, the pod is scheduled, since the new scheduler's watcher receives an initial list of the existing pods.

Diagnostic Steps

  • Check that there is no information about scheduling issues in the pod status, only the "`Pending`" `phase`:

    $ oc get pod [failing_pod_name] -n [namespace_name] -o json | jq -r '.status'
    {
      "phase": "Pending",
      "qosClass": "[qosClass_assigned]"
    }
  • Check that there is no information about the failing pod in the kube-scheduler logs (refer to How to find the scheduler decisions in OpenShift for additional information about the kube-scheduler logs):

    $ oc get pods -l "app=openshift-kube-scheduler" -n openshift-kube-scheduler
    [...]
    $ oc logs -n openshift-kube-scheduler -c kube-scheduler [kube-scheduler-pod_name] | grep [failing_pod_name]
    #### nothing shown
    
  • Check the schedulerName configured for the Deployment or DeploymentConfig and for the Pod:

        $ oc get deployment [deployment_name] -n [namespace_name] -o yaml | grep schedulerName
              schedulerName: [wrong_scheduler_name]

        $ oc get deploymentconfig [deploymentconfig_name] -n [namespace_name] -o yaml | grep schedulerName
              schedulerName: [wrong_scheduler_name]

        $ oc get pod [failing_pod_name] -n [namespace_name] -o yaml | grep schedulerName
            schedulerName: [wrong_scheduler_name]
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.