Ceph - Adding OSDs with an initial CRUSH weight of 0 causes 'ceph df' to report an invalid MAX AVAIL for pools
Environment
- Red Hat Ceph Storage 1.3.x
- Red Hat Enterprise Linux 7.x
Issue
- When adding OSDs to a Ceph cluster using 'ceph-deploy' with the OSD CRUSH initial weight set to 0, the output of 'ceph df' reports a 'MAX AVAIL' of 0 instead of the proper numerical value for all Ceph pools. This causes problems for OpenStack Cinder, which concludes there is no available space to provision new volumes.
- Before adding an OSD:
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
589T 345T 243T 41.32
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
data 0 816M 0 102210G 376
metadata 1 120M 0 102210G 94
images 5 11990G 1.99 68140G 1536075
volumes 6 63603G 10.54 68140G 16462022
instances 8 5657G 0.94 68140G 1063602
rbench 12 260M 0 68140G 22569
scratch 13 40960 0 68140G 10
- After adding an OSD:
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
590T 346T 243T 41.24
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
data 0 816M 0 0 376
metadata 1 120M 0 0 94
images 5 11990G 1.98 0 1536075
volumes 6 63603G 10.52 0 16462022
instances 8 5657G 0.94 0 1063602
rbench 12 260M 0 0 22569
scratch 13 40960 0 0 10
- MAX AVAIL shows 0 for all pools.
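A quick way to confirm that newly added, zero-weight OSDs are the cause is to scan 'ceph osd tree' for OSDs whose CRUSH weight is 0. A minimal sketch, assuming the usual column layout of that output (ID, WEIGHT, NAME, ...); the function name is illustrative:

```shell
# List OSDs whose CRUSH weight is 0 in 'ceph osd tree' output.
# Usage against a live cluster:  ceph osd tree | list_zero_weight_osds
list_zero_weight_osds() {
    # For OSD rows, column 2 is the CRUSH weight and column 3 the name.
    awk '$3 ~ /^osd\./ && $2 == 0 { print $3 }'
}
```

Any OSD printed by this filter is holding the pool 'MAX AVAIL' calculation at 0 until it is reweighted.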
Resolution
- Set the CRUSH weight of all the new OSDs to a small positive value above 0, such as 0.01, and 'ceph df' will report the proper numerical values for each pool as expected. The weight should then be incremented slowly until the OSD reaches its full weight, to reduce the I/O impact of background rebalancing of data to the new OSDs.
ceph@admin$ ceph osd crush reweight osd.<num> <weight>
- New OSDs are added to a cluster with a weight of 0 to prevent the massive rebalance of data that would occur if they were brought in at their full CRUSH weight. Starting from 0, the weight can be incremented slowly to control the amount of data Ceph rebalances to the new OSDs and prevent client I/O issues.
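The gradual weight increase described above can be scripted. A minimal sketch, where the OSD ID and the 0.1 weight steps are illustrative assumptions; the function only prints each reweight command, so the schedule can be reviewed as a dry run before executing anything against the cluster:

```shell
# Print the sequence of reweight commands that steps one OSD from a
# small initial weight up to a full CRUSH weight of 1.0 in 0.1 steps.
reweight_schedule() {
    osd_id=$1
    for w in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0; do
        echo "ceph osd crush reweight osd.$osd_id $w"
        # In a live run, execute each command and wait for recovery to
        # finish (e.g. by polling 'ceph health') before the next step.
    done
}

# Dry run for a hypothetical new OSD with ID 12:
reweight_schedule 12
```

Smaller steps and longer waits between them trade a slower ramp-up for less competing recovery traffic against client I/O.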
Root Cause
- A Red Hat Bugzilla has been opened to address this bug, as it has been seen in both Red Hat and upstream releases of Ceph.
- The tracker for this bug can be found on the upstream Ceph Tracker.
Diagnostic Steps
- Steps to Reproduce:
2 separate networks: cluster and public
1 admin deploy node
3 MON nodes
3 OSD nodes - 3 OSD disks with 3 separate journals
Running RHCS 1.3.1 and RHEL 7.2
- Edit ceph.conf to include:
osd_crush_initial_weight = 0
- Check 'ceph df' - verify MAX AVAIL shows the expected value.
- Add 1 additional OSD node with 3 OSDs and 3 separate journals, taking the initial OSD CRUSH weight of 0 from the conf.
- Check 'ceph df' - verify the issue is seen, with MAX AVAIL at 0.
- If the issue is seen, change the OSD CRUSH weight to 0.01 and recheck that MAX AVAIL shows proper numerical values:
ceph@admin$ ceph osd crush reweight osd.<num> 0.01
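To check many pools at once, the POOLS section of 'ceph df' can be scanned for zero MAX AVAIL values. A sketch, assuming the column layout shown in the output above (NAME, ID, USED, %USED, MAX AVAIL, OBJECTS); the function name is illustrative:

```shell
# Print the name of every pool whose MAX AVAIL column is 0.
# Usage against a live cluster:  ceph df | pools_with_zero_max_avail
pools_with_zero_max_avail() {
    awk '
        /^POOLS:/          { in_pools = 1; next }  # pool table starts here
        in_pools && /NAME/ { next }                # skip the header row
        in_pools && NF >= 5 && $5 == 0 { print $1 }
    '
}
```

An empty result after reweighting the new OSDs confirms the cluster is back to reporting proper MAX AVAIL values.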
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.