Ceph - Adding OSDs with an initial CRUSH weight of 0 causes 'ceph df' to report an invalid MAX AVAIL on pools

Solution Verified - Updated

Environment

  • Red Hat Ceph Storage 1.3.x
  • Red Hat Enterprise Linux 7.x

Issue

  • When adding OSDs to a Ceph cluster using 'ceph-deploy' with the OSD CRUSH initial weight set to 0, 'ceph df' reports a 'MAX AVAIL' of 0 instead of the proper numerical value for all Ceph pools. This causes problems for OpenStack Cinder, which concludes there is no space available to provision new volumes.

  • Before adding an OSD:

GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    589T      345T         243T         41.32
POOLS:
    NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
    data          0        816M         0       102210G          376
    metadata      1        120M         0       102210G           94
    images        5      11990G      1.99        68140G      1536075
    volumes       6      63603G     10.54        68140G     16462022
    instances     8       5657G      0.94        68140G      1063602
    rbench        12       260M         0        68140G        22569
    scratch       13      40960         0        68140G           10
  • After adding an OSD:
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    590T      346T         243T         41.24
POOLS:
    NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
    data          0        816M         0             0          376
    metadata      1        120M         0             0           94
    images        5      11990G      1.98             0      1536075
    volumes       6      63603G     10.52             0     16462022
    instances     8       5657G      0.94             0      1063602
    rbench        12       260M         0             0        22569
    scratch       13      40960         0             0           10
  • MAX AVAIL shows 0 for all pools.

Resolution

  • Set the CRUSH weight of each new OSD to a small positive value such as 0.01, and 'ceph df' will report the proper numerical values for each pool as expected. Increase the weight slowly until the OSD reaches its full weight, to reduce the I/O impact of background rebalancing of data onto the new OSDs.
ceph@admin$ ceph osd crush reweight osd.<num> <weight>
  • New OSDs are added to a cluster with a weight of 0 to prevent the massive data rebalance that would occur if they were weighted in at their full CRUSH weight immediately. Starting from 0, the weight can be raised incrementally to control how much data Ceph rebalances to the new OSDs at a time, preventing client I/O issues.
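The gradual weight-in described above can be sketched as a small script. This is an illustrative example, not from the article: the OSD id (36), full weight (1.82), and step size (0.2) are example assumptions, and the script only prints the reweight commands so they can be reviewed before being run on the admin node.

```shell
#!/bin/sh
# Illustrative sketch: emit the 'ceph osd crush reweight' commands that
# bring a new OSD from weight 0 up to its full weight in small steps.
# All values below are example assumptions, not from the article.
OSD_ID=36           # id of the newly added OSD (example)
FULL_WEIGHT=1.82    # full CRUSH weight, roughly a 2 TB disk (example)
STEP=0.2            # weight increment per step (example)

# Print one reweight command per step, ending at the full weight.
for w in $(seq $STEP $STEP $FULL_WEIGHT) $FULL_WEIGHT; do
    echo "ceph osd crush reweight osd.$OSD_ID $w"
done
```

Each applied step triggers rebalancing onto the new OSD; waiting for recovery to settle (check 'ceph health') before issuing the next command keeps the client I/O impact bounded.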

Root Cause

Diagnostic Steps

  • Steps to Reproduce:

2 separate networks (cluster and public)
1 admin/deploy node
3 MON nodes
3 OSD nodes, each with 3 OSD disks and 3 separate journals
Running RHCS 1.3.1 and RHEL 7.2

  • Edit ceph.conf to include:
osd_crush_initial_weight = 0
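For reference, a minimal ceph.conf fragment with this option; placing it in the [global] section (shown here as an assumption of convention, [osd] also applies) makes it take effect for OSDs deployed afterwards:

[global]
# New OSDs join the CRUSH map with weight 0 and receive no data
# until they are explicitly reweighted.
osd_crush_initial_weight = 0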
  • Check 'ceph df' and verify that MAX AVAIL shows the expected values.

  • Add 1 additional OSD node with 3 OSDs and 3 separate journals; the OSDs take the initial CRUSH weight of 0 from the conf.

  • Check 'ceph df' and verify the issue: MAX AVAIL is 0 for all pools.

  • If the issue is seen, change the OSD CRUSH weight to 0.01 and recheck that MAX AVAIL shows proper numerical values:

ceph@admin$ ceph osd crush reweight osd.<num> 0.01

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.