Ceph - Adding OSDs with an initial CRUSH weight of 0 causes 'ceph df' to report an invalid MAX AVAIL on pools

Solution Verified - Updated

Environment

  • Red Hat Ceph Storage 1.3.x
  • Red Hat Enterprise Linux 7.x

Issue

  • When adding OSDs to a Ceph cluster using 'ceph-deploy' with the OSD CRUSH initial weight set to 0, 'ceph df' reports a 'MAX AVAIL' of 0 instead of the proper numerical value for all Ceph pools. This causes problems for OpenStack Cinder, which concludes there is no space available to provision new volumes.

  • Before adding an OSD:

GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    589T      345T         243T         41.32
POOLS:
    NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
    data          0        816M         0       102210G          376
    metadata      1        120M         0       102210G           94
    images        5      11990G      1.99        68140G      1536075
    volumes       6      63603G     10.54        68140G     16462022
    instances     8       5657G      0.94        68140G      1063602
    rbench        12       260M         0        68140G        22569
    scratch       13      40960         0        68140G           10
  • After adding an OSD:
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    590T      346T         243T         41.24
POOLS:
    NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
    data          0        816M         0             0          376
    metadata      1        120M         0             0           94
    images        5      11990G      1.98             0      1536075
    volumes       6      63603G     10.52             0     16462022
    instances     8       5657G      0.94             0      1063602
    rbench        12       260M         0             0        22569
    scratch       13      40960         0             0           10
  • MAX AVAIL shows 0 for all pools.

Resolution

  • Set the CRUSH weight of each new OSD to a small positive value such as 0.01, and 'ceph df' will report the proper numerical values for each pool as expected. Increase the weight slowly until the OSD reaches its full weight, to reduce the I/O impact of background rebalancing of data onto the new OSDs.
ceph@admin$ ceph osd crush reweight osd.<num> <weight>
  • New OSDs are added to a cluster with a weight of 0 to prevent the massive data rebalance that would occur if they were weighted in at their full CRUSH weight immediately. Starting from 0, the weight can be raised incrementally to control how much data Ceph rebalances to the new OSDs at a time, preventing client I/O issues.
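The gradual weight-in described above can be sketched as a small script. This is an illustrative example, not from the article: the OSD id (36), full weight (1.82), and step size (0.2) are example assumptions, and the script only prints the reweight commands so they can be reviewed before being run on the admin node.

```shell
#!/bin/sh
# Illustrative sketch: emit the 'ceph osd crush reweight' commands that
# bring a new OSD from weight 0 up to its full weight in small steps.
# All values below are example assumptions, not from the article.
OSD_ID=36           # id of the newly added OSD (example)
FULL_WEIGHT=1.82    # full CRUSH weight, roughly a 2 TB disk (example)
STEP=0.2            # weight increment per step (example)

# Print one reweight command per step, ending at the full weight.
for w in $(seq $STEP $STEP $FULL_WEIGHT) $FULL_WEIGHT; do
    echo "ceph osd crush reweight osd.$OSD_ID $w"
done
```

Each applied step triggers rebalancing onto the new OSD; waiting for recovery to settle (check 'ceph health') before issuing the next command keeps the client I/O impact bounded.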

Root Cause

Diagnostic Steps

  • Steps to Reproduce:

2 separate networks (cluster and public)
1 admin/deploy node
3 MON nodes
3 OSD nodes, each with 3 OSD disks and 3 separate journals
Running RHCS 1.3.1 and RHEL 7.2

  • Edit ceph.conf to include:
osd_crush_initial_weight = 0
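For reference, a minimal ceph.conf fragment with this option; placing it in the [global] section (shown here as an assumption of convention, [osd] also applies) makes it take effect for OSDs deployed afterwards:

[global]
# New OSDs join the CRUSH map with weight 0 and receive no data
# until they are explicitly reweighted.
osd_crush_initial_weight = 0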
  • Check 'ceph df' and verify that MAX AVAIL shows the expected values.

  • Add 1 additional OSD node with 3 OSDs and 3 separate journals; the OSDs take the initial CRUSH weight of 0 from the conf.

  • Check 'ceph df' and verify the issue: MAX AVAIL is 0 for all pools.

  • If the issue is seen, change the OSD CRUSH weight to 0.01 and recheck that MAX AVAIL shows proper numerical values:

ceph@admin$ ceph osd crush reweight osd.<num> 0.01

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.