rook-ceph-osd-X Pod Stuck in CrashLoopBackOff/init after Node Reboot/OCP Upgrade monclient(hunting) - OpenShift Data Foundation
Environment
Red Hat OpenShift Container Platform (RHOCP) v4.x
Red Hat OpenShift Data Foundation (RHODF) v4.x
Red Hat OpenShift Container Storage (RHOCS) v4.x
Issue
In situations where the Local Storage Operator (LSO) cannot use /dev/disk/by-id/ symlinks, users will sometimes incorrectly configure their localvolume/localvolumediscovery objects to use the non-persistent /dev/sdN or /dev/vdN device files. When this occurs, the path in the PV will reflect /dev/sdX or /dev/vdX. This causes issues when the device file names change (e.g. /dev/sda changes to /dev/sdb) during a node reboot or an OCP upgrade, after which the PV can no longer communicate with the proper symlink located in the /mnt/local-storage/<storageclass>/ directory on the node.
If you are on vSphere and disk.EnableUUID is not set to TRUE, running $ ls -l /dev/disk/by-id will NOT display any UUIDs. When this is the case, please refer to the KCS article How to check and set the disk.EnableUUID parameter from VM in vSphere for OpenShift Container Platform.
2024-06-21T17:10:24.362235635Z debug 2024-06-21T17:10:24.361+0000 7fc951f22700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Related Articles:
Disk device (/dev/sda - /dev/sdb - /dev/sdc - /dev/sdd) changes after each node rebooting and impact pods using Local Storage Operator
Resolution
Preface:
It's important to understand that this is a workflow. Depending on the exact circumstances, there could be one OSD down, or many. If many OSDs are down, safeguards such as a pod disruption budget or flags set by the rook-ceph-operator (nobackfill, norecover, etc.) may be in place to protect data integrity.
In this workflow, each node will eventually have to be shut down one at a time (while monitoring Ceph) to edit the VM. It is therefore important to first get the OSDs up and running, allow the rebalance to complete and Ceph to return HEALTH_OK with all PGs active+clean, and only then fix the UUIDs/symlinks on each node.
Warning: Work one node at a time. DO NOT proceed to the next node until ALL OSDs on the previous node are up and running. Losing too many OSDs could jeopardize data integrity (data loss).
- With a notepad tool, run the following commands and capture the output; begin sorting OSD IDs, PVs, PVCs, hostnames, and the paths of the devices.
$ oc get pod -n openshift-storage -o 'custom-columns=NAME:.metadata.labels.ceph-osd-id,PVCNAME:.spec.volumes[*].persistentVolumeClaim' | grep -v none
Example:
NAME PVCNAME
0 map[claimName:ocs-deviceset-0-data-089lnk]
1 map[claimName:ocs-deviceset-0-data-1sr5dx]
2 map[claimName:ocs-deviceset-1-data-0nk5m5]
$ oc get pv -o 'custom-columns=HOST-NAME:.metadata.labels.kubernetes\.io/hostname,PVCNAME:.spec.claimRef.name,SYMLINK:.spec.local.path,DEV-NAME:.metadata.annotations.storage\.openshift\.com/device-name' | grep -v none
Example:
HOST-NAME PVCNAME SYMLINK DEV-NAME
node.1.com ocs-deviceset-0-data-1sr5dx /mnt/local-storage/localblock/sdb sdb
node.2.com ocs-deviceset-0-data-089lnk /mnt/local-storage/localblock/sdb sdb
node.3.com ocs-deviceset-1-data-0nk5m5 /mnt/local-storage/localblock/sdb sdb
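The two outputs above can be cross-referenced by joining on the PVC name to get a single OSD-to-host-to-symlink table. A minimal sketch, using sample files populated with the example data above (in practice, you would save the live `oc get` output to these files instead):

```shell
#!/usr/bin/env bash
# Sketch: join OSD->PVC output with PV host/path output on the PVC name.
# Sample data is copied from the example outputs in this article.
cat > osd_pvc.txt <<'EOF'
0 ocs-deviceset-0-data-089lnk
1 ocs-deviceset-0-data-1sr5dx
2 ocs-deviceset-1-data-0nk5m5
EOF

cat > pv_info.txt <<'EOF'
node.1.com ocs-deviceset-0-data-1sr5dx /mnt/local-storage/localblock/sdb sdb
node.2.com ocs-deviceset-0-data-089lnk /mnt/local-storage/localblock/sdb sdb
node.3.com ocs-deviceset-1-data-0nk5m5 /mnt/local-storage/localblock/sdb sdb
EOF

# First pass records the OSD id per PVC; second pass prints osd, host, symlink.
awk 'NR==FNR { osd[$2]=$1; next } { printf "osd.%s %s %s\n", osd[$2], $1, $3 }' \
  osd_pvc.txt pv_info.txt
# Prints, e.g.:
#   osd.1 node.1.com /mnt/local-storage/localblock/sdb
#   osd.0 node.2.com /mnt/local-storage/localblock/sdb
#   osd.2 node.3.com /mnt/local-storage/localblock/sdb
```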
- Set up two CLI Terminal windows for commands and monitoring.
NOTE: Depending on how many OSDs are provisioned on the node, this may be a trial-and-error task: a symlink command is issued to correct the symlink, followed by a pod deletion to test whether the correct symlink is in place. It is also helpful to have three CLI terminal windows open (one debugged into the current node, one to run oc commands, and one to monitor Ceph). To run Ceph commands, see the Configuring the Rook-Ceph Toolbox in OpenShift Data Foundation 4.x solution article.
Commands to monitor Ceph once rsh'd into the rook-ceph-tools pod:
$ ceph status      #<------------- monitor PGs and OSDs
$ ceph osd tree    #<------------- better look at the OSDs/hosts
Note: In a scenario where some OSDs are up and some are down (shown in the outputs of the commands above), we can determine that the OSDs that are up/in have the CORRECT path in their PV; those devices did not shift names.
Example of an OSD that is up/running:
$ oc get pods -n openshift-storage | grep rook-ceph-osd | grep Running
NAME READY STATUS RESTARTS AGE
rook-ceph-osd-0-567869c6c4-5b56t 2/2 Running 0 76m
$ oc describe pod -n openshift-storage rook-ceph-osd-0-567869c6c4-5b56t | grep -i claim
ClaimName: ocs-deviceset-0-data-089lnk
$ oc get pv | grep ocs-deviceset-0-data-089lnk
local-pv-9a68ad40 1TiB RWO Delete Bound openshift-storage/ocs-deviceset-0-data-089lnk localblock
$ oc describe pv local-pv-9a68ad40 | grep -i path
Path: /mnt/local-storage/localblock/sdc <--- WE CAN RULE THIS OUT. THE OSD IS UP/RUNNING. THE SYMLINK TO /dev/sdc IS CORRECT (DON'T TOUCH).
- Debug into the first node and correct the symlink(s) on the first node.
$ oc debug node/node.1.com
$ chroot /host
- Depending on how many devices are on the host, this may be easier or more difficult; a good command to run to at least start distinguishing the devices is:
$ lsblk
Note: OSD devices will likely be the devices with no underlying/nested partitions and will usually (not always) be larger devices such as 500GiB, 1TiB, 2TiB, etc.
Example:
$ lsblk
sda 259:5 0 1T 0 disk <------ device shifted from sdb to sda
- Once it's been determined where the device went/may have gone, navigate to the LSO symlink folder.
$ cd /mnt/local-storage/<storageclass-name>/
$ ls -ltr
lrwxrwxrwx. 1 root root 68 Apr 26 18:32 sdb -> /dev/sdb <------------- symlink
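A quick way to see which symlinks in the LSO directory have gone stale is to test whether each link still resolves to an existing target. A minimal sketch, demonstrated against a throwaway directory with hypothetical names (on the node you would set dir=/mnt/local-storage/<storageclass-name> and skip the setup lines):

```shell
#!/usr/bin/env bash
# Sketch: flag dangling symlinks in an LSO storageclass directory.
# Demo setup: one healthy link and one whose target vanished (device shifted).
dir="$(mktemp -d)"
touch "$dir/devA"                 # stands in for a real /dev node
ln -s "$dir/devA" "$dir/sdc"      # healthy symlink
ln -s "$dir/devGONE" "$dir/sdb"   # target gone: this is the shifted device

for link in "$dir"/*; do
  [ -L "$link" ] || continue      # only inspect symlinks
  if [ -e "$link" ]; then
    echo "OK       $(basename "$link") -> $(readlink "$link")"
  else
    echo "DANGLING $(basename "$link") -> $(readlink "$link")"
  fi
done
```

Any link reported as DANGLING points at a device file that no longer exists and is a candidate for the symlink correction below.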
- Now that we have confirmed the device shifted to /dev/sda in our example above, attempt to correct the symlink.
$ cd /mnt/local-storage/<storageclass-name>/
$ ln -sf /dev/sda sdb
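Before deleting the pod, it is worth confirming the link now points where intended. A minimal sketch of the same `ln -sf` pattern in a throwaway directory (on the node, the target would be the real /dev/sda and the link the real symlink name):

```shell
#!/usr/bin/env bash
# Sketch: re-point an existing symlink in place and verify the new target.
dir="$(mktemp -d)"; cd "$dir"
ln -s /dev/old-name sdb    # stale link, as left behind after the device shift
ln -sf /dev/sda sdb        # -f replaces the existing link atomically in place
readlink sdb               # prints: /dev/sda
```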
- Delete the pod.
$ oc delete pod -n openshift-storage rook-ceph-osd-X-<pod-name>
NOTE: If the pod comes up Running, the symlink correction was successful. However, as discussed previously, if there are many OSDs on the host this may become a trial-and-error process. If so, repeat steps 4-7 until all OSDs on the node are up.
- Once all OSDs are up/running on the first node, proceed to the next node and repeat steps 3-7 for the remaining nodes, only proceeding to the next node when all OSDs on the node currently being worked are up/Running.
- After all OSDs on all nodes are running, monitor the rebalance in Ceph and do not bring down any node to enable UUIDs until all PGs are active+clean.
$ oc rsh -n openshift-storage rook-ceph-tools-<pod-name>
$ ceph status
health: HEALTH_OK <------------------------------------- Needs to be before bringing down the first node
services:
mon: 3 daemons, quorum a,b,c (age 84m)
mgr: b(active, since 84m), standbys: a
mds: 1/1 daemons up, 1 hot standby
osd: 6 osds: 6 up (since 84m), 6 in (since 84m)
data:
volumes: 1/1 healthy
pools: 4 pools, 209 pgs
objects: 99 objects, 154 MiB
usage: 421 MiB used, 600 GiB / 600 GiB avail
pgs: 209 active+clean <---------------------------- Needs to be before bringing down the first node
io:
client: 1.2 KiB/s rd, 2.7 KiB/s wr, 2 op/s rd, 0 op/s wr
- Once rebalancing is finished (Step 9), begin enabling disk.EnableUUID=TRUE and uuid.action=keep on each host, one at a time, and only proceed to the next node once Ceph again reflects the example in Step 9. To gracefully bring down a storage node, follow the steps to scale down any mon/OSD on the node being worked on. Additionally, delete any noobaa pod on that node once it has been cordoned. See the How to safely reboot an OCS/ODF 4 node solution for more info. Lastly, DO NOT forget to scale the mons/OSDs back up once the node has been uncordoned and is in Ready status.
- The following documentation will assist in this process (since this is infrastructure-specific, relevant VMware docs are provided; however, it is up to the user to pursue the most up-to-date documentation for their virtual environment):
Use of VMware VMotion with OpenShift Container Storage / OpenShift Data Foundation
- Once UUIDs have been enabled ON THE FIRST NODE (devices may shift again), you can now run $ ls -l /dev/disk/by-id on the node and begin seeing UUIDs. Both scsi and wwn UUIDs should be present. scsi IDs were used frequently in prior versions of LSO, but LSO now defaults to wwn UUIDs; although either will work, use the wwn UUIDs when fixing symlinks.
- To begin correcting the symlinks to the UUIDs: this may again be a trial-and-error process; however, a helpful Ceph command that may reveal which UUID is associated with which OSD is the following:
$ oc rsh -n openshift-storage rook-ceph-tools-<pod-name>
Example for osd.0. Change ID number based on desired OSD:
$ ceph osd metadata 0
<omitted-for-space>
"device_ids": "wwn-0x6000c29a28e6d9e51bf43c386e40d6b9", <---- does this have a UUID? If not continue with trial/error (steps 3-7)
- Correct the symlink.
Example:
$ oc debug node/node.1.com
$ chroot /host
$ lsblk
$ ls -l /dev/disk/by-id
$ cd /mnt/local-storage/<storageclass-name>/
$ ln -sf /dev/disk/by-id/wwn-0x6000c29a28e6d9e51bf43c386e40d6b9 sdb
- Delete the pod.
$ oc delete pod -n openshift-storage rook-ceph-osd-X-<pod-name>
NOTE: If the pod comes up Running, the final symlink correction was successful. Once all OSD symlinks on the node are corrected, the node will finally be able to survive reboots. Repeat on the remaining nodes while monitoring Ceph.
Root Cause
Without disk.EnableUUID set in a virtualized environment, LSO falls back to non-persistent device names, for example /dev/sdb, /dev/sdc, etc., which can shift upon node reboot.
Diagnostic Steps
$ oc get pods -n openshift-storage | grep rook-ceph-osd
NAME READY STATUS RESTARTS AGE
rook-ceph-osd-0-bd9dd975f-bt2k6 2/2 Running 6 99d
rook-ceph-osd-1-64d4b849f4-llmhz 1/2 CrashLoopBackOff 13 2h
rook-ceph-osd-2-77c6965975-sdx5b 2/2 Running 0 46m
rook-ceph-osd-3-67665f9d4-h9d75 1/2 CrashLoopBackOff 15 2h
rook-ceph-osd-4-f985555f9-nf2pg 2/2 Running 4 99d
rook-ceph-osd-5-5748b4cdc-tqvmh 2/2 Running 4 99d
$ oc logs -n openshift-storage -f rook-ceph-osd-1-64d4b849f4-llmhz
2024-06-21T17:10:24.362235635Z debug 2024-06-21T17:10:24.361+0000 7fc951f22700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
$ oc describe pod -n openshift-storage rook-ceph-osd-1-64d4b849f4-llmhz | grep -i claim
ClaimName: ocs-deviceset-0-data-089lnk
$ oc get pv | grep ocs-deviceset-0-data-089lnk
local-pv-9a68ad40 1TiB RWO Delete Bound openshift-storage/ocs-deviceset-0-data-089lnk localblock
$ oc describe pv local-pv-9a68ad40 | grep -i path
Path: /mnt/local-storage/localblock/sdc <------------------ Not UUID
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.