How to prevent a Ceph cluster from automatically replicating data to other OSDs, while removing one of the OSDs manually.

Solution Verified - Updated

Environment

  • Red Hat Ceph Enterprise 1.3
  • Red Hat Ceph Enterprise 1.2.3

Issue

  • An OSD disk was found faulty and needs to be removed from the Ceph cluster, to be replaced by a new disk.

  • How can I prevent a Ceph cluster from automatically replicating data to other OSDs, while removing one of the OSDs manually?

  • While the problematic disk is being removed, the Ceph cluster will automatically re-replicate the affected objects onto other OSDs, according to the pool's replica count (set by default via 'osd_pool_default_size' in /etc/ceph/ceph.conf). This generates additional recovery traffic/IO that is better avoided: a replacement OSD is about to be added, and the replicas are better written to the new disk.

  • In such a case, how can the automatic replication be stopped?
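The replica count that drives this re-replication can be checked per pool before starting maintenance. A quick sketch, assuming a pool named 'rbd' and a monitor running on the local host (adjust both for your cluster):

```shell
# Show the replica count (size) for a given pool; 'rbd' is an example name.
ceph osd pool get rbd size

# Show the cluster-wide default applied to newly created pools,
# queried from the local monitor's admin socket.
ceph daemon mon.$(hostname -s) config get osd_pool_default_size
```

Note that changing 'osd_pool_default_size' only affects pools created afterwards; existing pools keep their own 'size' value.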

Resolution

  • When doing hardware maintenance on the OSD nodes, it can be useful to prevent Ceph from automatically recovering data with:
# ceph osd set noout
# ceph osd set norecover
# ceph osd set nobackfill
# ceph osd set norebalance  (Hammer and later)
  • To revert the setting after maintenance, run:
# ceph osd unset noout
# ceph osd unset norecover
# ceph osd unset nobackfill
# ceph osd unset norebalance  (Hammer and later)
  • Setting 'noout' on the Ceph cluster prevents OSDs from automatically being marked out, even after they have been down longer than the default 5-minute interval. This prevents the data movement normally triggered by the 'out' status. However, when the OSD is removed from CRUSH as part of the replacement process, data will move anyway unless the 'nobackfill' and 'norecover' flags are also set.
  • This is particularly useful when an OSD nearing death needs to be removed and replaced, but you do not want CRUSH to start rebalancing data the moment it is pulled out.
  • The suggested procedure is: set the 'noout', 'norecover', and 'nobackfill' flags, stop the OSD, replace the disk, create a new OSD on the replacement disk, and finally clear the flags set in the first step. CRUSH then rebalances data only after the new OSD has been added and the flags are removed.
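Putting the steps above together, a full replacement pass might look like the following sketch. The OSD id (12) and the sysvinit-style service command are assumptions; adjust them for your cluster and init system:

```shell
# 1. Prevent Ceph from marking OSDs out, recovering, backfilling or rebalancing.
ceph osd set noout
ceph osd set norecover
ceph osd set nobackfill
ceph osd set norebalance   # Hammer and later

# 2. Stop the failing OSD daemon (example id 12; sysvinit syntax shown).
service ceph stop osd.12

# 3. Remove the OSD from the cluster and from the CRUSH map.
ceph osd out 12
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12

# 4. Physically replace the disk, then create a new OSD on it
#    (e.g. with ceph-deploy or ceph-disk, per your deployment method).

# 5. Clear the flags; recovery and backfill now target the new OSD.
ceph osd unset noout
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset norebalance  # Hammer and later

# Verify the flags are gone and watch recovery progress.
ceph -s
```

The key point of the ordering is that the flags stay set across the CRUSH removal in step 3, so the objects are re-replicated only once, onto the new disk, rather than twice.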

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.