rook-ceph-mon Pods Stuck "Pending" during ODF Upgrade - Missing topologyKey: "" for the rook-ceph-mon podAntiAffinity in the ocs-cephcluster CR - OpenShift Data Foundation

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP) v4.x
  • Red Hat OpenShift Data Foundation (RHODF) v4.x

Issue

It has been discovered that some cephcluster CRs in ODF may be impacted during an upgrade if the topologyKey value is missing from the rook-ceph-mon pods' podAntiAffinity:

$ oc get cephcluster -n openshift-storage ocs-storagecluster-cephcluster -o yaml
          topologyKey: ""  <---missing
$ ceph versions
{
    "mon": {
        "ceph version 18.2.1-361.el9cp (439dcd6094d413840eb2ec590fe2194ec616687f) reef (stable)": 3 <-- Won't upgrade
    },
    "mgr": {
        "ceph version 19.2.1-292.el9cp (ba02d589f9356be88303b8e8ec2790f12300f3b5) squid (stable)": 2
    },
    "osd": {
        "ceph version 19.2.1-292.el9cp (ba02d589f9356be88303b8e8ec2790f12300f3b5) squid (stable)": 15
    },
    "mds": {
        "ceph version 18.2.1-361.el9cp (439dcd6094d413840eb2ec590fe2194ec616687f) reef (stable)": 2
    },
    "overall": {
        "ceph version 18.2.1-361.el9cp (439dcd6094d413840eb2ec590fe2194ec616687f) reef (stable)": 5,
        "ceph version 19.2.1-292.el9cp (ba02d589f9356be88303b8e8ec2790f12300f3b5) squid (stable)": 17
    }
}
$ ceph status

    health: HEALTH_WARN
            1/3 mons down, quorum d,e
 
  services:
    mon: 3 daemons, quorum d,e (age 3h), out of quorum: b

$ oc get pods -n openshift-storage
rook-ceph-mon-b-7f669f9cbf-n8j55                                  0/2     Pending     0          1h

Resolution

  1. Back up the storagecluster CR:
$ oc get storagecluster -o yaml -n openshift-storage > storagecluster.yaml_backup
  2. Scale down the ocs-operator deployment:
$ oc -n openshift-storage scale deployment ocs-operator --replicas=0
  3. Edit the storagecluster CR status to remove the failureDomain, nodeTopologies, failureDomainValues, and failureDomainKey values (where applicable). This allows the OCS operator to re-determine and reconcile these values once it is scaled back up.

Note: The last two patch commands will only succeed if the corresponding status fields are present. Continue to the next step after running all four commands.

$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/failureDomain"}]'

$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/nodeTopologies"}]' 

$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/failureDomainKey"}]'

$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/failureDomainValues"}]' 
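Equivalently, the four patch commands above can be run as a short loop that tolerates fields that are already absent. This is a sketch only; it assumes the default StorageCluster name ocs-storagecluster and namespace openshift-storage:

```shell
# Remove each failure-domain field from the StorageCluster status.
# A patch for a field that does not exist fails harmlessly, hence "|| true".
for field in failureDomain nodeTopologies failureDomainKey failureDomainValues; do
  oc patch storagecluster ocs-storagecluster -n openshift-storage \
    --type=json --subresource=status \
    --patch "[{\"op\": \"remove\", \"path\": \"/status/${field}\"}]" || true
done
```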
  4. Scale the ocs-operator back up:
$ oc -n openshift-storage scale deployment ocs-operator --replicas=1
  5. Allow ~5 minutes for the operator to reconcile, then re-check the storagecluster and cephcluster CRs:
$ oc get storagecluster -n openshift-storage -o yaml

$ oc get cephcluster -n openshift-storage ocs-storagecluster-cephcluster -o yaml | grep -A8 podAntiAffinity

      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - rook-ceph-mon
          topologyKey: topology.kubernetes.io/rack <------ EXAMPLE ONLY, YOURS MAY DIFFER

Root Cause

In ODF versions bundled with Rook v1.15 or later, a change was introduced in the ClusterController logic that strictly enforces nodeAffinity and podAntiAffinity, whereas previous versions were more permissive of placement violations.

Diagnostic Steps

$ oc get cephcluster -n openshift-storage ocs-storagecluster-cephcluster -o yaml | grep -A8 podAntiAffinity

      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - rook-ceph-mon
          topologyKey: "" 
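When working from a saved copy of the CephCluster CR (for example, from a must-gather), a plain grep is enough to flag an affected cluster. The sketch below writes a sample snippet to a temporary file purely for illustration; in practice, point grep at your exported YAML (e.g. the output of `oc get cephcluster ... -o yaml`):

```shell
# Illustrative only: /tmp/cephcluster-sample.yaml stands in for your exported CR.
cat > /tmp/cephcluster-sample.yaml <<'EOF'
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - rook-ceph-mon
          topologyKey: ""
EOF

# An affected CR contains an empty topologyKey in the mon podAntiAffinity term.
if grep -q 'topologyKey: ""' /tmp/cephcluster-sample.yaml; then
  echo "affected: empty topologyKey found"
else
  echo "ok: topologyKey is set"
fi
```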

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.