What does the error message "feature set mismatch missing 1000000000000" in Ceph logs mean?
Environment
- Red Hat Ceph Storage 1.2
- Red Hat Ceph Storage 1.2.3
- Red Hat Ceph Storage 1.3
- Inktank Ceph Enterprise 1.2
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 7
Issue
- What does the following message in the Ceph client logs mean?
connect protocol feature mismatch, my 2fffffffffff < peer 12fffffffffff missing 1000000000000
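The "missing" value in this message is the set of feature bits that the peer advertises and the local side lacks. The sketch below reproduces that arithmetic for the two masks in the log line. The function names missing_features and lowest_bit are hypothetical helpers written for this illustration and are not part of any Ceph tool.

```shell
# Hypothetical helper: print, in hex, the bits set in the peer mask ($2)
# that are not set in the local mask ($1). Arguments are hex without "0x".
missing_features() (
    my=$(( 0x$1 ))
    peer=$(( 0x$2 ))
    printf '%x\n' "$(( peer & ~my ))"
)

# Hypothetical helper: print the index of the lowest set bit in a hex mask.
lowest_bit() (
    mask=$(( 0x$1 ))
    bit=0
    while [ "$mask" -gt 0 ] && [ $(( mask & 1 )) -eq 0 ]; do
        mask=$(( mask >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "$bit"
)

# Masks taken from the log line above.
missing_features 2fffffffffff 12fffffffffff   # prints 1000000000000
lowest_bit 1000000000000                      # prints 48
```

A single missing bit, as here, points at one specific feature rather than a generally outdated client.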
Resolution
- Messages of this sort are logged on a client after a configuration change (possibly a change to the Ceph tunables) or after a single node is upgraded to a higher version of Ceph.
- It is recommended best practice that all nodes and clients in the cluster run the same version of Ceph. Check that every cluster node (MONs/OSDs) and every client node is running the same version.
- The example below shows an admin node and a client node running two different versions of Ceph.
[root@ceph-box] # rpm -qa | grep ceph
ceph-deploy-1.5.25-1.el7cp.noarch
ceph-common-0.94.1-16.el7cp.x86_64
[root@openstack-admin]# rpm -qa | grep ceph
ceph-common-0.80.8-15.el7cp.x86_64
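On more than a couple of nodes, the same comparison could be scripted. The following is a minimal sketch, assuming passwordless SSH as root to each node and that ceph-common is the package of interest. The function names and the host names in the usage comment are placeholders, not something this article prescribes.

```shell
# Hypothetical helper: print the number of distinct non-empty lines on stdin.
count_distinct() {
    sort -u | grep -c .
}

# Hypothetical helper: print the ceph-common version-release string of every
# host passed as an argument. Assumes passwordless SSH as root.
collect_versions() {
    for host in "$@"; do
        ssh "root@$host" "rpm -q --qf '%{VERSION}-%{RELEASE}\n' ceph-common"
    done
}

# Usage on a real cluster (not run here):
#   collect_versions ceph-box openstack-admin | count_distinct

# Self-contained demonstration with the two versions from the example above.
printf '%s\n' 0.94.1-16.el7cp 0.80.8-15.el7cp | count_distinct   # prints 2
```

Any result other than 1 means the nodes queried are not all on the same package version.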
- Check the tunables profile set on the cluster, and whether the Ceph version and kernel version on the client that logs this message support that profile. The profile below is hammer, which defaults to the STRAW2 algorithm and is reported as the optimal tunables ("optimal_tunables": 1).
[root@ceph-box] # ceph osd crush show-tunables
<partial output>
...
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"straw_calc_version": 1,
"allowed_bucket_algs": 54,
**"profile": "hammer",**
**"optimal_tunables": 1,**
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
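To pull only the profile name out of that output, a small text filter is enough. This is a sketch that assumes the output keeps the "profile": "<name>" layout shown above. The function name tunables_profile is hypothetical.

```shell
# Hypothetical helper: read `ceph osd crush show-tunables` output on stdin
# and print the value of the "profile" key.
tunables_profile() {
    sed -n 's/.*"profile"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage on a cluster node (not run here):
#   ceph osd crush show-tunables | tunables_profile

# Self-contained demonstration using two lines of the sample output above.
printf '%s\n' '"allowed_bucket_algs": 54,' '"profile": "hammer",' | tunables_profile   # prints hammer
```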
- Validate which algorithm the host bucket is using by viewing the CRUSH map.
- The STRAW2 algorithm is only supported on RHCS 1.3 (Hammer) and on kernel version 3.15 or later (or through the built-in support in RHEL 7 with kernel-3.10.0-327.el7).
- To fetch the CRUSH map:
[root@ceph-box] # ceph osd getcrushmap -o {compiled-crushmap-filename} ## this will gather the compiled CRUSH map
[root@ceph-box] # crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename} ## decompile the map to make readable
- The following example shows a host bucket in the CRUSH map using STRAW2:
<partial output>
...
buckets
host hostname {
id -2 # do not change unnecessarily
# weight 3.000
**alg straw2**
hash 0 # rjenkins1
item osd.0 weight 1.000
item osd.1 weight 1.000
item osd.2 weight 1.000
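In a map with many buckets it can help to list the algorithm of each one instead of reading the file by eye. The sketch below assumes the decompiled layout shown above, with a "type name {" line opening each bucket and an "alg" line inside it. The function name bucket_algs is hypothetical.

```shell
# Hypothetical helper: for each bucket in a decompiled CRUSH map ($1),
# print "<type> <name> <alg>".
bucket_algs() {
    awk '
        NF == 3 && $NF == "{"       { bucket = $1 " " $2 }   # e.g. "host hostname {"
        $1 == "alg" && bucket != "" { print bucket, $2 }
        $1 == "}"                   { bucket = "" }
    ' "$1"
}

# Self-contained demonstration on a throwaway file.
map=$(mktemp)
cat > "$map" <<'EOF'
host hostname {
    id -2
    alg straw2
    hash 0
    item osd.0 weight 1.000
}
EOF
bucket_algs "$map"   # prints: host hostname straw2
rm -f "$map"
```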
- If it is not possible to upgrade Ceph to a version that supports the selected tunables profile or the STRAW2 algorithm, you can modify the OSD host bucket in the CRUSH map to use STRAW instead of the newer STRAW2, and then re-inject the CRUSH map into the cluster.
Please note that injecting a new CRUSH map may cause a significant amount of data to migrate within the cluster. Take care when making this change during peak workload periods!
[root@ceph-box] # ceph osd getcrushmap -o {compiled-crushmap-filename} ## this will gather the compiled CRUSH map
[root@ceph-box] # crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename} ## this will decompile the map to make readable
- Change the 'alg' line of the host bucket from straw2 to straw, save the file, then recompile and inject the new CRUSH map.
- The following example shows the host bucket in the CRUSH map edited to use STRAW:
<partial output>
...
buckets
host hostname {
id -2 # do not change unnecessarily
# weight 3.000
**alg straw**
hash 0 # rjenkins1
item osd.0 weight 1.000
item osd.1 weight 1.000
item osd.2 weight 1.000
[root@ceph-box] # crushtool -c {decompiled-crushmap-filename} -o {compiled-crushmap-filename} ## this will recompile the CRUSH map
[root@ceph-box] # ceph osd setcrushmap -i {compiled-crushmap-filename} ## this will inject the new compiled CRUSH map
- The OSD host should now be using an algorithm that is supported by your version of Ceph.
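The manual edit of the 'alg' line described above could also be scripted when many host buckets are involved. The following is a minimal sketch, assuming the decompiled map layout shown in the examples. It rewrites only buckets of type host and leaves every other bucket type untouched. The function name host_buckets_to_straw is hypothetical, and the result should be reviewed before it is recompiled and injected.

```shell
# Hypothetical helper: print the decompiled CRUSH map ($1) with
# "alg straw2" changed to "alg straw" inside host buckets only.
host_buckets_to_straw() {
    awk '
        $1 == "host" && $NF == "{"               { in_host = 1 }
        $1 == "}"                                { in_host = 0 }
        in_host && $1 == "alg" && $2 == "straw2" { sub(/straw2/, "straw") }
        { print }
    ' "$1"
}

# Self-contained demonstration on a throwaway file.
map=$(mktemp)
cat > "$map" <<'EOF'
host hostname {
    id -2
    alg straw2
    hash 0
}
EOF
host_buckets_to_straw "$map"   # the alg line is printed as "    alg straw"
rm -f "$map"
```

Writing the result to a new file, rather than editing in place, keeps the original decompiled map available for comparison or rollback.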
Root Cause
- The error 'feature set mismatch missing 1000000000000' means that the kernel on the client does not support a Ceph feature that the cluster requires. The missing feature, 1000000000000, is CEPH_FEATURE_CRUSH_V4, that is, the STRAW2 algorithm.
- The STRAW2 algorithm is only supported on clients running kernel version 3.15 and above, or RHEL 7 (kernel-3.10.0-327.el7) and above.
- The Ceph library versions required to support the STRAW2 algorithm are those in Red Hat Ceph Storage 1.3 (Hammer); earlier versions do not support it.
- Any new host bucket added to the CRUSH map while the Hammer profile is active is created with the STRAW2 algorithm, because STRAW2 is the default for Hammer.