How can I test the impact CRUSH map tunable modifications will have on my PG distribution across OSDs in Red Hat Ceph Storage?
Environment
- Red Hat Ceph Storage
Issue
- How can I test the impact CRUSH map tunable modifications will have on my PG distribution across OSDs?
- How do I use the osdmaptool to test?
Resolution
1. Dump the existing osdmap from the cluster with:
# ceph osd getmap -o osdmap
got map from osdmap epoch 739055
2. Get the existing PG mapping from the osdmap:
# osdmaptool osdmap --test-map-pgs > map_output
- This command also accepts the option [--pool poolid]:
--test-map-pgs [--pool poolid]
- Prints the mappings from placement groups to OSDs.
--test-map-pgs-dump [--pool poolid]
- Prints the summary of all placement groups and the mapping from them to the mapped OSDs.
3. Export the CRUSH map from the osdmap so that it can be edited by hand:
# osdmaptool --export-crush crushmap osdmap
osdmaptool: osdmap file 'osdmap'
osdmaptool: exported crush map to crushmap
4. Decompile the CRUSH map for editing:
# crushtool -d crushmap -o crushmap_decompiled
5. Once decompiled, the CRUSH map can be opened in a text editor such as vim. Save the edited CRUSH map under a new name, such as crushmap_decompiled_edited.
# vim crushmap_decompiled
The CRUSH tunables will need to be modified. To set them to the optimal values for Red Hat Ceph Storage 1.3, add the following values at the top of the CRUSH map:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
Note: The optimal value of the tunable chooseleaf_vary_r is 1. However, a value of 4 or 5 is recommended on clusters with large amounts of data, because these values cause much less data movement.
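If the edit needs to be repeated for several candidate values, it can be scripted instead of made in vim. The following is a minimal sketch, not part of crushtool or osdmaptool: set_tunable is a hypothetical helper that assumes the decompiled map is plain text whose first line is the "# begin crush map" header and whose tunables are written as "tunable NAME VALUE" lines.

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE
# Hypothetical helper: prints FILE to stdout with "tunable NAME VALUE" set.
# Assumes NAME contains only letters, digits and underscores.
set_tunable() {
    file=$1
    name=$2
    value=$3
    if grep -q "^tunable $name " "$file"; then
        # The tunable is already present: rewrite its value in place.
        sed "s/^tunable $name .*/tunable $name $value/" "$file"
    else
        # Not present: add it directly below the first line, which is
        # assumed to be the "# begin crush map" header.
        awk -v line="tunable $name $value" '{ print } NR == 1 { print line }' "$file"
    fi
}

# Example: write a copy that uses the more conservative chooseleaf_vary_r
# value from the note above. Skipped when the decompiled map is not present.
if [ -f crushmap_decompiled ]; then
    set_tunable crushmap_decompiled chooseleaf_vary_r 4 > crushmap_decompiled_edited
fi
```

Writing to a new file keeps the original crushmap_decompiled untouched, which matches the advice to save the edited map under a new name.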
6. Once the edits have been made, recompile the CRUSH map with:
# crushtool -c crushmap_decompiled_edited -o crushmap_edited
7. Inject the newly edited CRUSH map into the existing osdmap with the osdmaptool:
# osdmaptool osdmap --import-crush crushmap_edited
osdmaptool: osdmap file 'osdmap'
osdmaptool: imported 17742 byte crush map from crushmap_edited
osdmaptool: writing epoch 739055 to osdmap
8. Now that the osdmap has the newly edited CRUSH map embedded in it, the PG placement can be tested with:
# osdmaptool osdmap --test-map-pgs > editedmap_output
- This command also accepts the option [--pool poolid]:
--test-map-pgs [--pool poolid]
- Prints the mappings from placement groups to OSDs.
--test-map-pgs-dump [--pool poolid]
- Prints the summary of all placement groups and the mapping from them to the mapped OSDs.
9. Use diff -y to observe the changes between the two test-map-pgs outputs:
# diff -y editedmap_output map_output
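On a large cluster the side-by-side diff can be long, so a numeric summary may be easier to read. The sketch below is an illustration under stated assumptions, not an osdmaptool feature. The helpers osd_pg_delta and changed_pgs are hypothetical. They assume that per-OSD rows in the output start with "osd.<id>" followed by the PG count, and that per-PG rows in --test-map-pgs-dump output start with a PG id such as 1.0 followed by the bracketed OSD set. Check this against the output of your osdmaptool version before relying on the numbers.

```shell
#!/bin/sh
# osd_pg_delta BEFORE AFTER
# Prints one row per OSD: name, PG count before, PG count after, difference.
# Assumes per-OSD rows look like "osd.<id> <count> ..." in both files.
osd_pg_delta() {
    awk '
        $1 ~ /^osd\.[0-9]+$/ {
            if (FNR == NR) {
                before[$1] = $2      # first file: original map
            } else {
                after[$1] = $2       # second file: edited map
            }
            seen[$1] = 1
        }
        END {
            for (osd in seen)
                printf "%s\t%d\t%d\t%+d\n", osd, before[osd], after[osd], after[osd] - before[osd]
        }
    ' "$1" "$2" | sort -t. -k2,2n
}

# changed_pgs BEFORE_DUMP AFTER_DUMP
# Counts PGs whose OSD set differs between two --test-map-pgs-dump outputs.
# Assumes per-PG rows look like "<pool>.<pg> [osd,osd,...] ...".
changed_pgs() {
    awk '
        $1 ~ /^[0-9]+\.[0-9a-f]+$/ && $2 ~ /^\[/ {
            if (FNR == NR) { before[$1] = $2; next }
            total++
            if (before[$1] != $2) changed++
        }
        END { printf "%d of %d PGs changed mapping\n", changed, total }
    ' "$1" "$2"
}

# Example: compare the two files produced in the steps above, if present.
if [ -f map_output ] && [ -f editedmap_output ]; then
    osd_pg_delta map_output editedmap_output
fi
```

changed_pgs needs per-PG rows, so it is only meaningful on files captured with --test-map-pgs-dump for both the original and the edited osdmap. The number of changed PGs gives a rough indication of how much data movement the new tunables would trigger.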
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.