Ceph Scrubbing And Its Parameters

Ceph OSDs are responsible for storing, retrieving, protecting and checking the coherence of the data stored in the Reliable Autonomic Distributed Object Store (RADOS). To check this coherence, the different copies of each object must be periodically compared to verify that all copies are identical. The verification process is driven by the primary OSD assigned to the PG and is known as scrubbing.

Scrubbing types

  1. Light Scrubbing
    Light scrubbing checks, for each object within a Placement Group (PG), that every copy stored across the OSDs protecting the PG has the same size and the same digest.
  2. Deep Scrubbing
    Deep scrubbing physically reads each object within a Placement Group (PG) on all the OSDs protecting the PG, recalculates the digest and compares the freshly recalculated values between all OSDs protecting the PG. Note that the entire PG is physically read during deep scrubbing, ensuring both that data once written can still be read (protection against undetected write failures or corruption) and that the data re-read is identical on all OSDs protecting the PG.

Automated scrubbing
Each PG is automatically light scrubbed. The parameters influencing the light scrubbing process are:

  • osd_scrub_min_interval Minimum interval between two light scrubs when load is low. Defaults to 1 day.
  • osd_scrub_max_interval Maximum interval between two light scrubs irrespective of load. Defaults to a week.
  • osd_scrub_load_threshold Light scrubbing occurs if system load is below this value. Defaults to 50% (0.5).
  • osd_deep_scrub_interval Interval between two deep scrubs irrespective of load. Defaults to a week.
  • osd_max_scrubs Maximum number of concurrent PG scrub operations per OSD. Defaults to 1.
  • osd_scrub_sleep Delay between two light or deep scrub operations. Defaults to 0.0.
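As an illustration, these parameters can be set in the [osd] section of ceph.conf. The values below are examples chosen for this sketch, not recommendations:

```ini
[osd]
# Allow a light scrub as often as every 12 hours when the load is low
osd scrub min interval = 43200
# Force a light scrub at least once a week, whatever the load
osd scrub max interval = 604800
# Only start a light scrub when the load average is below 0.3
osd scrub load threshold = 0.3
# Deep scrub every two weeks instead of the default week
osd deep scrub interval = 1209600
# Sleep 0.1 second between two scrub operations to reduce client impact
osd scrub sleep = 0.1
```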

It is possible to disable automated light and deep scrubbing operations and prefer a manual scheduling mechanism in order to better control the exact time those operations take place.

  • ceph osd set noscrub Disable automated light scrubbing
  • ceph osd set nodeep-scrub Disable automated deep scrubbing

These special flags are shown by the ceph -s command in order to remind the cluster administrator of this particular situation.

    cluster 8c5d3515-4dab-436d-92b8-267bd4f1185c
     health HEALTH_WARN noscrub,nodeep-scrub flag(s) set
     monmap e1: 1 mons at {daisy=192.168.122.114:6789/0}, election epoch 1, quorum 0 daisy
     osdmap e22: 3 osds: 3 up, 3 in
            flags noscrub,nodeep-scrub
      pgmap v151: 192 pgs, 3 pools, 0 bytes data, 0 objects
            102 MB used, 27512 MB / 27614 MB avail
                 192 active+clean

Manual scrubbing
A manual scrub operation can be triggered at any time using the following commands.

  • ceph pg scrub {pgid} Trigger a one-shot light scrubbing operation on the specified PG.
  • ceph pg deep-scrub {pgid} Trigger a one-shot deep scrubbing operation on the specified PG.
  • ceph osd scrub {osdid} Trigger a one-shot light scrubbing operation on all the PGs managed by the specified OSD.
  • ceph osd deep-scrub {osdid} Trigger a one-shot deep scrubbing operation on all the PGs managed by the specified OSD.
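The per-PG commands above can be scripted. As a minimal sketch, the loop below only prints the deep-scrub commands for a hypothetical list of PG ids (in practice, collect them from ceph pg dump); the output can be reviewed, then piped to sh to actually run them:

```shell
#!/bin/sh
# Illustrative PG ids; on a real cluster, extract them from `ceph pg dump`.
pgids="0.0 0.1 0.2"

# Print one deep-scrub command per PG instead of running it directly,
# so the administrator can review (or pipe to sh) the resulting list.
for pgid in $pgids; do
    echo "ceph pg deep-scrub $pgid"
done
```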

Extra parameters for controlling scrubbing

  • osd_disk_threads Number of disk threads available at run time. Defaults to 1.
  • osd_disk_thread_ioprio_class CFQ class assigned to each disk thread. Defaults to None ("").
  • osd_disk_thread_ioprio_priority Priority within the CFQ class assigned to each disk thread. Defaults to None (-1).
  • osd_deep_scrub_stride Read size used during deep scrubbing operations. Defaults to 524288 (512KB).

Appropriate values for osd_disk_thread_ioprio_class are: "be" for Best Effort, "rt" for Real Time and "idle" for Idle.
Appropriate values for osd_disk_thread_ioprio_priority are: an integer between 0 (highest) and 7 (lowest).
These two parameters require the OSD devices to use the CFQ disk elevator.
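For example, to let deep scrubbing yield to client I/O on CFQ-backed OSD devices, the two parameters could be set as follows in ceph.conf (the values shown are one possible choice, not a recommendation):

```ini
[osd]
# Put the disk thread in the idle CFQ class: scrub I/O is only served
# when the device has no other pending requests
osd disk thread ioprio class = idle
# Lowest priority within the class
osd disk thread ioprio priority = 7
```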

Scheduled scrubbing operations via cron
If you have decided to schedule the scrubbing process yourself, do not hesitate to read the following article for details about a possible implementation.
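As a sketch of such a home-made schedule, assuming a 3-OSD cluster like the one shown above, a cron table on an admin node could spread the deep scrubs over quiet hours (the OSD ids, days and times below are illustrative):

```
# /etc/cron.d/ceph-deep-scrub (illustrative)
# Deep scrub one OSD per night at 03:00, spread over the week
0 3 * * 1 root ceph osd deep-scrub 0
0 3 * * 3 root ceph osd deep-scrub 1
0 3 * * 5 root ceph osd deep-scrub 2
```

Remember that this only makes sense with the noscrub and nodeep-scrub flags set, otherwise the automated scheduler keeps running in parallel.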

Checking a device disk elevator
The command cat /sys/block/{OSDDevice}/queue/scheduler displays on stdout the disk elevator used by a particular device.

$ sudo cat /sys/block/sdb/queue/scheduler
noop [deadline] cfq 

The value displayed between square brackets indicates the disk elevator configured for the device; in the above example, deadline.
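The bracketed entry can also be extracted programmatically. The snippet below parses a sample scheduler line so it can run anywhere; on a real system, replace the sample with the content of your OSD device's scheduler file. The sed expression keeps only the value between the square brackets:

```shell
#!/bin/sh
# Sample content of the scheduler file; on a real system use
# line=$(cat /sys/block/sdb/queue/scheduler) instead.
line="noop [deadline] cfq"

# Keep only the bracketed (active) elevator.
active=$(printf '%s\n' "$line" | sed -n 's/.*\[\(.*\)\].*/\1/p')
echo "$active"
```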
