Ceph deployed by OpenStack Director has "min_size = 1" configured in pools
Environment
- Red Hat OpenStack Platform 8
- Red Hat OpenStack Platform 9
- Red Hat OpenStack Platform 10
Issue
- A replicated Ceph OSD pool with min_size = 1 allows an object to continue serving I/O when it has only 1 replica, which could lead to data loss, data split-brain (incomplete PGs), or unfound objects.
- With the default templates in OSP Director, pools will have "min_size = 1" configured as the default value.
Resolution
- osd_pool_default_min_size sets the min_size value of a replicated pool when it is created without specifying min_size. min_size is the minimum number of replicas required for I/O on a replicated pool.
- With osd_pool_default_min_size = 1 configured in ceph.conf, the pools created for Cinder, Glance, and Nova will have min_size = 1 configured.
- Currently there is an open feature request, Bugzilla #1404459, on this issue.
- It is suggested to remove osd_pool_default_min_size from the Director template so that Ceph falls back to its own default, which is the recommended value.
-
- If the overcloud is not yet deployed, edit the file /usr/share/openstack-tripleo-heat-templates/puppet/hieradata/ceph.yaml and remove the following line completely. Ceph will then handle the configuration with its own default, which is the recommended value.
ceph::profile::params::osd_pool_default_min_size: 1
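The removal can be scripted with sed. The sketch below demonstrates it on a sample copy in /tmp; on the undercloud the target file is /usr/share/openstack-tripleo-heat-templates/puppet/hieradata/ceph.yaml (back it up first), and the sample contents here are illustrative assumptions.

```shell
# Illustrative sample of the hieradata file (contents are an assumption).
cat > /tmp/ceph.yaml.sample <<'EOF'
ceph::profile::params::osd_pool_default_min_size: 1
ceph::profile::params::osd_pool_default_size: 3
EOF

# Drop the min_size override so Ceph falls back to its own default;
# every other line is left untouched.
sed -i '/osd_pool_default_min_size/d' /tmp/ceph.yaml.sample
```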
- Then continue the overcloud deployment.
- If the overcloud has already been deployed and has "osd_pool_default_min_size = 1" in ceph.conf, remove this option from ceph.conf on all nodes.
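A minimal sketch of that edit, again shown on a sample copy rather than the live file; on each node the real target is /etc/ceph/ceph.conf, and the sample contents below are illustrative assumptions.

```shell
# Illustrative sample of a ceph.conf carrying the override (assumed contents).
cat > /tmp/ceph.conf.sample <<'EOF'
[global]
osd_pool_default_min_size = 1
osd_pool_default_size = 3
EOF

# Delete only the min_size override line from the [global] section.
sed -i '/^osd_pool_default_min_size/d' /tmp/ceph.conf.sample
```

Repeat the same edit on every node that carries the option.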
- List the pools:
ceph osd lspools
- Check the "min_size" of your existing pools:
ceph osd pool get {pool-name} min_size
- Check the "size" of your existing pools; size defines the total number of replicas of an object in a pool:
ceph osd pool get {pool-name} size
- It is recommended that min_size = size - (size / 2), where "size" is taken from "osd_pool_default_size" (default: 3). Because the integer division truncates, the result is always the upper bound of size / 2; for a pool with a size of 3, min_size is 2. Change min_size for all pools found with min_size set to 1 in the previous step:
ceph osd pool set {pool-name} min_size <size - (size / 2)>
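The formula can be sketched as a small shell helper, with a hypothetical loop applying it to every affected pool. The arithmetic mirrors the recommendation above; the pool-name parsing of the "ceph osd lspools" output (assumed format: "0 rbd,1 images,...") is an assumption, and the loop is guarded so it only runs where the ceph CLI is available (i.e. on a monitor node).

```shell
# min_size = size - (size / 2); truncating division makes this the
# ceiling ("upper bound") of size / 2.
calc_min_size() {
  local size=$1
  echo $(( size - size / 2 ))
}

calc_min_size 3   # -> 2
calc_min_size 4   # -> 2
calc_min_size 5   # -> 3

# Hypothetical clean-up loop, run from a Ceph monitor node: raise min_size
# on every pool still sitting at min_size = 1.
if command -v ceph >/dev/null 2>&1; then
  for pool in $(ceph osd lspools | tr ',' '\n' | awk '{print $2}'); do
    size=$(ceph osd pool get "$pool" size | awk '{print $2}')
    min=$(ceph osd pool get "$pool" min_size | awk '{print $2}')
    if [ "$min" -eq 1 ]; then
      ceph osd pool set "$pool" min_size "$(calc_min_size "$size")"
    fi
  done
fi
```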
- Run the commands below from any of the Ceph monitor nodes. The default for osd_pool_default_min_size is 0, and when the option is 0, min_size is calculated with the formula size - (size / 2). This ensures that any new pool is created with the correct min_size.
# ceph tell mon.\* injectargs "--osd_pool_default_min_size 0"
# ceph tell osd.\* injectargs "--osd_pool_default_min_size 0"
Root Cause
- The default line "ceph::profile::params::osd_pool_default_min_size: 1" in /usr/share/openstack-tripleo-heat-templates/puppet/hieradata/ceph.yaml sets "osd_pool_default_min_size = 1" in ceph.conf, which is not recommended from the Ceph perspective.
- In Ceph we recommend min_size = 2, which prevents any kind of data loss, incomplete PGs, or unfound objects if 2 or more failure domains (default: host) go down. With min_size = 2, Ceph pauses write I/O to a pool as soon as fewer than 2 replicas are available, which ensures that two consistent copies are always kept; with min_size = 1, writes are allowed with only 1 failure domain up.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.