How to map a pool to a dedicated set of OSDs?
Environment
- Red Hat Ceph Storage 1.2.3
- Red Hat Ceph Storage 1.3
- Red Hat Ceph Storage 2.0
- Scenarios where certain pools should be created on SSD OSDs while others on HDDs, for example, cache tiering.
Issue
- How to map a pool to a dedicated set of OSDs via the CRUSH map?
- How to create a cache tier with SSD OSDs?
- How to create a pool specifically on SSD OSDs for faster data reads/writes, while keeping other pools on HDDs?
Resolution
- Mapping a specific set of OSDs to a pool is not directly possible; editing the CRUSH map is required to achieve this.
- This is documented at http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#placing-different-pools-on-different-osds.
An overview of how to create a custom CRUSH map that uses specific OSDs for a pool:
- Get the CRUSH map of the cluster using
# ceph osd getcrushmap -o /tmp/crushmap.bin
- Decompile the CRUSH map, which is downloaded in a binary format by default
# crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
- Modify the decompiled CRUSH map as per the requirements, following http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#placing-different-pools-on-different-osds
- Compile the changed CRUSH map back to binary format
# crushtool -c /tmp/crushmap.txt -o /tmp/new_crushmap.bin
- Apply/Upload the changed CRUSH map to the Ceph cluster.
# ceph osd setcrushmap -i /tmp/new_crushmap.bin
- Create a pool
# ceph osd pool create <pool_name> <pg_num> <pgp_num>
- Assign the pool the new CRUSH ruleset created while editing the CRUSH map.
# ceph osd pool set <pool_name> crush_ruleset <crush_ruleset_number>
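After assigning the ruleset, it is worth verifying the result. The following sketch is not from the article: the pool name ssd-pool and object name testobj are placeholder assumptions, and the commands are guarded so the script is a harmless no-op on a host without a Ceph client.

```shell
# Verification sketch: "ssd-pool" and "testobj" are placeholder names.
# Guarded so this does nothing on hosts without a Ceph client installed.
if command -v ceph >/dev/null 2>&1; then
    # Confirm the pool now uses the intended CRUSH ruleset
    ceph osd pool get ssd-pool crush_ruleset
    # Show which OSDs a sample object would map to; they should all be SSD OSDs
    ceph osd map ssd-pool testobj
    result="verified against live cluster"
else
    result="no ceph client found; skipping live verification"
fi
echo "$result"
```

If `ceph osd map` reports OSDs that are not in the SSD bucket, re-check the rule's `step take` target and the pool's ruleset number.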
Details on how to customize the CRUSH map for specific rule sets.
- The CRUSH map contains a list of all the OSD devices, as well as a list of all the buckets in the Ceph cluster. A bucket is an aggregation of locations in the cluster hierarchy; it can represent a region, data center, room, rack, server, and so on.
- When an OSD is deployed and started, it is automatically added to the device list in the CRUSH map. Make sure a bucket is created for the devices you want to dedicate, and create as many buckets as needed to model your failure domains.
host ceph-osd-ssd-server-1 { <------ This bucket will contain all the SSD-based OSDs deployed on this server. Create as many buckets as you
id -1 <------ have servers containing SSDs
alg straw
hash 0
item osd.0 weight 1.00
}
- Make sure to create a bucket for the devices to be dedicated to a pool. The bucket type can vary (root level, rack level, host level).
root ssd { <-- This bucket will contain all the SSD server buckets
id -6
alg straw
hash 0
item ceph-osd-ssd-server-1 weight 2.00
... (add one "item" line per SSD host bucket created)
}
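The same hierarchy can also be built from the CLI instead of hand-editing the decompiled map. This is a sketch, not the article's procedure: the bucket and host names reuse the examples above, and the commands are guarded so they only run where a Ceph client exists.

```shell
# CLI alternative to hand-editing the map (sketch; names reuse the
# examples above). Guarded: a no-op without a Ceph client installed.
if command -v ceph >/dev/null 2>&1; then
    # Create the new root bucket that will hold the SSD host buckets
    ceph osd crush add-bucket ssd root
    # Move the SSD host bucket under the new root
    ceph osd crush move ceph-osd-ssd-server-1 root=ssd
    placed="ssd root created and host bucket moved"
else
    placed="no ceph client found; commands shown for illustration only"
fi
echo "$placed"
```

Note that CLI changes take effect immediately on the live map, so pick one approach (CLI or decompile/edit/recompile) rather than mixing both mid-change.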
- Create a rule that can be used for the pool placement strategy
rule choose-ssd {
ruleset {x}
type replicated
min_size 0
max_size 4
step take ssd <- Must point to the bucket created above
step chooseleaf firstn 0 type host
step emit
}
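Before uploading the recompiled map, the new rule can be dry-run with crushtool to confirm it only maps placement groups onto the intended bucket. In this sketch the ruleset number (1) and replica count (2) are assumptions; /tmp/new_crushmap.bin is the compiled map from the steps above.

```shell
# Dry-run the new rule before applying the map (sketch; ruleset number 1
# and replica count 2 are assumptions). Guarded so it is a no-op where
# crushtool or the compiled map is unavailable.
if command -v crushtool >/dev/null 2>&1 && [ -f /tmp/new_crushmap.bin ]; then
    # Print the OSDs each sample input would map to under rule 1
    crushtool -i /tmp/new_crushmap.bin --test --rule 1 --num-rep 2 --show-mappings
    status="dry-run complete"
else
    status="crushtool or compiled map not available; skipping dry-run"
fi
echo "$status"
```

Every OSD id printed in the mappings should belong to the SSD bucket taken by the rule's `step take`.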
- If using a customized CRUSH map, disable automatic CRUSH map updates when the OSD daemon starts:
[osd]
osd crush update on start = false
- Additional article: Ceph - Modify CRUSH map for different Performance Domains
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.