ODF: Changes in crush rules due to device class changes result in PGs being in an unknown, misplaced, and/or remapped state
Environment
Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Issue
- Pools are utilizing a crush rule that references a device class that is not currently utilized in the cluster, resulting in PGs being in an unknown, misplaced, and/or remapped state
- More than 100 percent of objects are misplaced in Ceph after upgrading the OpenShift Data Foundation operator to 4.16.x
- Rook created crush rules to utilize a device class that is not used in the cluster; these crush rules are now applied to pools, causing issues
- The defaultCephDeviceClass in the StorageCluster CR is incorrect
- Transitioned from unsupported HDDs to SSDs in ODF, resulting in data unavailability
Example:
Ceph status shows a large number of objects misplaced
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph status -c /var/lib/rook/openshift-storage/openshift-storage.config
cluster:
id: [REDACTED]
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,d (age 7d)
mgr: b(active, since 7d), standbys: a
mds: 1/1 daemons up, 1 hot standby
osd: 6 osds: 6 up (since 7d), 6 in (since 4M); 169 remapped pgs
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 0/1 healthy, 1 recovering
pools: 12 pools, 281 pgs
objects: 184.82k objects, 265 GiB
usage: 1.2 TiB used, 17 TiB / 18 TiB avail
pgs: 33.808% pgs unknown
886324/554472 objects misplaced (159.850%)
138 active+clean+remapped
95 unknown
48 active+undersized+remapped
All OSDs are utilizing the device class ssd
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd tree -c /var/lib/rook/openshift-storage/openshift-storage.config
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 5.85956 root default
-5 1.95319 host [REDACTED]
1 ssd 0.97659 osd.1 up 1.00000 1.00000
5 ssd 0.97659 osd.5 up 1.00000 1.00000
-7 1.95319 host [REDACTED]
2 ssd 0.97659 osd.2 up 1.00000 1.00000
4 ssd 0.97659 osd.4 up 1.00000 1.00000
-3 1.95319 host [REDACTED]
0 ssd 0.97659 osd.0 up 1.00000 1.00000
3 ssd 0.97659 osd.3 up 1.00000 1.00000
The pools are utilizing a crush rule that references hdd while there are only ssds in the cluster
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd pool ls detail -c /var/lib/rook/openshift-storage/openshift-storage.config
pool 1 'ocs-storagecluster-cephblockpool' replicated size 3 min_size 2 crush_rule 25 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9437 lfor 0/0/30 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.49 application rbd
The crush rule being utilized references hdd while there are only ssds in the cluster
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd crush rule dump -c /var/lib/rook/openshift-storage/openshift-storage.config
{
"rule_id": 25,
"rule_name": "ocs-storagecluster-cephblockpool_host_hdd",
"type": 1,
"steps": [
{
"op": "take",
"item": -2,
"item_name": "default~hdd"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
Resolution
ⓘ For the Ceph commands, use KCS Configuring the Rook-Ceph Toolbox in OpenShift Data Foundation 4.x or KCS Accessing the Red Hat Ceph Storage CLI in OpenShift Data Foundation 4.x to access the Ceph CLI.
ⓘ The following steps involve deletion of crush rules and crush classes. It is recommended to open a support case with Red Hat prior to applying this solution to ensure minimal disruptions to your environment.
- Access Ceph Tools and make note of HDD OSDs:
$ ceph osd df | grep hdd
- Switch to the openshift-storage project and scale down the rook-ceph-operator and ocs-operator:
$ oc project openshift-storage; oc scale deployment rook-ceph-operator ocs-operator --replicas=0
- Set the flags nobackfill, norecover, and norebalance
$ ceph osd set nobackfill
$ ceph osd set norecover
$ ceph osd set norebalance
NOTE: From the step below until the final step, the data will be unavailable for read and write
- Change all pools to utilize a generic crush rule which does not reference a device class, hdd or ssd.
In this case we are utilizing the crush rule replicated_rule as it doesn't have a device class specified.
NOTE: Do not use the for loop if you have custom pools with custom crush rules
$ ceph osd crush rule ls
$ for i in $(ceph osd pool ls); do ceph osd pool set $i crush_rule replicated_rule; done
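Following the NOTE above, a minimal sketch of a guarded variant that only reassigns the default ocs-storagecluster pools and leaves custom pools untouched. The pool list (including the hypothetical my-custom-pool) stands in for the output of ceph osd pool ls; on a live cluster, substitute the real command and drop the echo prefix.

```shell
# Sample pool list standing in for `ceph osd pool ls`;
# "my-custom-pool" is a hypothetical custom pool.
pools='ocs-storagecluster-cephblockpool
ocs-storagecluster-cephfilesystem-metadata
my-custom-pool'
plan=""
for p in $pools; do
  case "$p" in
    # Default ODF pools: safe to move to the generic rule
    ocs-storagecluster-*) plan="$plan
ceph osd pool set $p crush_rule replicated_rule" ;;
    # Anything else is a custom pool: leave its crush rule alone
    *) plan="$plan
skipping custom pool: $p" ;;
  esac
done
echo "$plan"
```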
- Modify any OSDs reporting as HDD to SSD:
$ for i in $(ceph osd ls); do echo osd.$i; ceph osd crush rm-device-class osd.$i; ceph osd crush set-device-class ssd osd.$i; done
- Attempt to delete the hdd crush class. NOTE: This fails because crush rules still reference hdd; make note of the rules that reference it
$ ceph osd crush class ls
$ ceph osd crush class rm hdd
- Delete the old crush rules that reference hdd one by one (utilize the rules noted in the previous step)
$ ceph osd crush rule rm <crush_rules_hdd>
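The per-rule deletion above can be sketched as a filter over the rule list. The here-string stands in for the output of ceph osd crush rule ls; the cephfilesystem rule name is an assumption for illustration (only the blockpool rule appears in the output earlier in this article). The echo prints the commands rather than running them.

```shell
# Sample rule list standing in for `ceph osd crush rule ls`;
# the cephfilesystem rule name is hypothetical.
hdd_rules=$(grep 'hdd' <<'EOF'
replicated_rule
ocs-storagecluster-cephblockpool_host_hdd
ocs-storagecluster-cephfilesystem-data0_host_hdd
EOF
)
# Print the deletion command for each hdd-referencing rule
for r in $hdd_rules; do
  echo "ceph osd crush rule rm $r"
done
```

Note that replicated_rule is excluded by the grep, since it does not reference a device class and must remain in place for the pools.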
- Remove the old crush class
$ ceph osd crush class rm hdd
- Remove any reference of HDD in the StorageCluster CR
$ oc patch storageclusters.ocs.openshift.io ocs-storagecluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/defaultCephDeviceClass"}]'
$ oc patch cephclusters.ceph.rook.io ocs-storagecluster-cephcluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/storage/deviceClasses"}]'
Check that the reference was removed:
$ oc get storageclusters.ocs.openshift.io ocs-storagecluster -n openshift-storage -o yaml |grep -i deviceClass
$ oc get cephclusters.ceph.rook.io ocs-storagecluster-cephcluster -n openshift-storage -o yaml |grep -i deviceClass
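As a minimal sketch of what a successful check looks like: once the status fields are patched out, the grep above should return no matches. The sample YAML below is a hypothetical stand-in for the live CR output.

```shell
# Hypothetical StorageCluster status after the patch (no deviceClass fields)
sample_cr='apiVersion: ocs.openshift.io/v1
kind: StorageCluster
status:
  phase: Ready'
# Count deviceClass references; `|| true` keeps the script going when grep
# finds nothing (exit status 1)
matches=$(printf '%s\n' "$sample_cr" | grep -ci 'deviceclass' || true)
echo "deviceClass references found: $matches"
```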
- Scale up the rook-ceph-operator and ocs-operator
$ oc scale deployment rook-ceph-operator ocs-operator --replicas=1
- Wait five to ten minutes to allow the operators time to reconcile and create the new crush rules
$ sleep 300
- Verify new rules have been created in Ceph that reference the device class ssd and the pools are utilizing these new crush rules
$ ceph osd pool ls detail
$ ceph osd crush rule dump
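A minimal sketch of the verification: in the rule dump, an ssd-targeting rule takes the item default~ssd (analogous to the default~hdd shown in the Issue section). The snippet below is a hypothetical fragment of ceph osd crush rule dump output; on a live cluster, pipe the real command through the same grep.

```shell
# Hypothetical fragment of `ceph osd crush rule dump` after reconciliation
dump='{"rule_name": "ocs-storagecluster-cephblockpool_host_ssd", "item_name": "default~ssd"}'
# The new rules should take the ssd shadow of the default root
if printf '%s' "$dump" | grep -q 'default~ssd'; then
  result="rule targets ssd"
else
  result="rule does not target ssd"
fi
echo "$result"
```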
- Unset the flags nobackfill, norecover, and norebalance
$ ceph osd unset nobackfill
$ ceph osd unset norecover
$ ceph osd unset norebalance
- Verify the health of Ceph
$ ceph status
Root Cause
When upgrading the OpenShift Data Foundation operator to 4.16.x, new crush rules are created for the Ceph pools. These new crush rules are device class specific. The pools are now utilizing a crush rule with a device class that is not being utilized in the cluster, resulting in PGs being in an unknown, misplaced, and/or remapped state.
The default behavior of Rook when creating these crush rules is to choose the first item in the output of the command $ ceph osd crush class ls. This causes issues for customers who have transitioned from the unsupported HDDs to SSDs but may still have hdd in their list.
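This first-item behavior can be illustrated with the class list from the Diagnostic Steps below, where hdd sorts before ssd. This is a sketch of the described selection, not Rook's actual code.

```shell
# Class list as returned by `ceph osd crush class ls` in this scenario
classes='hdd
ssd'
# The described default: the first entry wins, even if no OSD uses it
default_class=$(printf '%s\n' "$classes" | head -n 1)
echo "Rook would pick device class: $default_class"
```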
This issue is tracked in the Jira issues listed under Artifacts below.
Artifacts
| Product/Version | Related BZ/Jira | Errata | Fixed Version |
|---|---|---|---|
| ODF/4.19 | DFBUGS-948 | N/A | 4.19.0 |
| ODF/4.18 | DFBUGS-1666 | N/A | 4.18.1 |
| ODF/4.17 | DFBUGS-1667 | RHSA-2025:17145 | 4.17.14 |
| ODF/4.16 | DFBUGS-1668 | RHBA-2025:17157 | 4.16.16 |
Diagnostic Steps
- The Ceph command ceph osd crush class ls has two items in the list, with ssd not being the first in the list
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd crush class ls -c /var/lib/rook/openshift-storage/openshift-storage.config
[
"hdd",
"ssd"
]
- The StorageCluster CR has hdd as the defaultCephDeviceClass
$ oc get storagecluster -o json | jq '.items[].status.defaultCephDeviceClass'
"hdd"
- The CephCluster CR has the device classes hdd and ssd in the deviceClasses section. Notably, hdd is listed first
$ oc get cephcluster -o json | jq '.items[].status.storage.deviceClasses'
[
  {
    "name": "hdd"
  },
  {
    "name": "ssd"
  }
]
- In case no new _ssd rules are created by the rook-ceph-operator, check the logs of the rook-ceph-operator pod:
$ oc logs rook-ceph-operator-xxxxx
2026-02-20T15:44:34.033966152Z 2026-02-20 15:44:34.033887 I | cephclient: creating a new crush rule for changed deviceClass ("default"-->"hdd") on crush rule "replicated_rule"
2026-02-20T15:44:34.033966152Z 2026-02-20 15:44:34.033920 I | cephclient: updating pool "ocs-storagecluster-cephblockpool" failure domain from "host" to "host" with new crush rule "ocs-storagecluster-cephblockpool_host_hdd"
2026-02-20T15:44:34.033966152Z 2026-02-20 15:44:34.033930 I | cephclient: crush rule "replicated_rule" will no longer be used by pool "ocs-storagecluster-cephblockpool"
2026-02-20T15:44:34.495930182Z 2026-02-20 15:44:34.495871 E | ceph-block-pool-controller: failed to reconcile CephBlockPool "openshift-storage/ocs-storagecluster-cephblockpool". failed to create pool "ocs-storagecluster-cephblockpool".: failed to configure pool "ocs-storagecluster-cephblockpool".: failed to configure pool "ocs-storagecluster-cephblockpool": failed to update crush rule for pool "ocs-storagecluster-cephblockpool": failed to create replicated crush rule "ocs-storagecluster-cephblockpool_host_hdd": failed to create crush rule ocs-storagecluster-cephblockpool_host_hdd. . Error EINVAL: device class hdd does not exist: exit status 22
In this example, no new _ssd rules are created and the operator reports the error above. In that case, repeat the steps to scale down the operators and complete the step to remove any reference of HDD in the StorageCluster CR.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.