Inconsistent zonegroup/zone state in Rados GW after upgrade of multizone site to Ceph 2
Environment
- Red Hat Ceph Storage 2
Issue
After upgrading to Ceph 2, Rados GW is sending HTTP error 303 to clients, and the following error can be seen in the logs:
"NOTICE: request for data in a different zonegroup (s3 != be495af3-3378-4592-99cc-41685598a23e)"
Resolution
The migration needs to be done manually. Please contact Red Hat support if you need assistance with these steps or if there are any questions. In the following example we have a single zone "s3-west" in a zonegroup "s3". The pools being used for rgw_region_root_pool and rgw_zone_root_pool are .s3.rgw.root and .s3-west.rgw.root.
- get the zone JSON:
# radosgw-admin zone get --rgw-zone s3-west > s3-west.json
Edit the resulting file and clear the realm_id field (set it to "").
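Clearing the field can also be scripted. The following is a minimal sketch using sed, assuming the realm_id value is a plain string on a single line, as in the radosgw-admin output; a shortened sample file stands in for the real s3-west.json here:

```shell
# Sample standing in for the real "radosgw-admin zone get" output
cat > s3-west.json <<'EOF'
{
    "id": "s3-west",
    "name": "s3-west",
    "realm_id": "7ba2ebe6-473d-42f9-b62a-3a270078f73c"
}
EOF
# Clear the realm_id value in place
sed -i 's/"realm_id": "[^"]*"/"realm_id": ""/' s3-west.json
grep '"realm_id"' s3-west.json
```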
- fetch the zonegroup "s3" JSON output. This needs to be copied from the
radosgw-admin period get output:
{
    "id": "be495af3-3378-4592-99cc-41685598a23e",
    "name": "s3",
    "api_name": "s3",
    "is_master": "true",
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "s3-west",
    "zones": [
        {
            "id": "s3-west",
            "name": "s3-west",
            "endpoints": [],
            "log_meta": "true",
            "log_data": "false",
            "bucket_index_max_shards": 0,
            "read_only": "false"
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": []
        }
    ],
    "default_placement": "default-placement",
    "realm_id": "7ba2ebe6-473d-42f9-b62a-3a270078f73c"
}
Now clear the realm_id field and change the zonegroup "id" to "s3" (very important). Save the result as s3.json.
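Both edits can be sketched with GNU sed, assuming the fields sit on their own lines as in the period output. Only the first "id" in the file (the zonegroup's own) must change; the zone entry inside "zones" keeps its "id": "s3-west". A shortened sample stands in for the real zonegroup JSON:

```shell
# Shortened sample standing in for the zonegroup JSON from "period get"
cat > s3.json <<'EOF'
{
    "id": "be495af3-3378-4592-99cc-41685598a23e",
    "name": "s3",
    "zones": [
        { "id": "s3-west", "name": "s3-west" }
    ],
    "realm_id": "7ba2ebe6-473d-42f9-b62a-3a270078f73c"
}
EOF
# GNU sed: the 0,/regexp/ address limits the first edit to the first match,
# so the zone's own "id" inside the zones array is left untouched
sed -i -e '0,/"id": "[^"]*"/s//"id": "s3"/' \
       -e 's/"realm_id": "[^"]*"/"realm_id": ""/' s3.json
```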
- backup the pools .rgw.root, .s3.rgw.root and .s3-west.rgw.root (using the rados cppool command):
# ceph osd pool create <new-pool> 8 8 replicated
# rados cppool <old-pool> <new-pool>
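The two commands above have to be run once per pool. A small sketch of the loop, printed as a dry run (remove the "echo" to execute against a live cluster); the pool names are from this example and the ".backup" suffix is an arbitrary choice:

```shell
# Dry run: print the backup commands for each RGW root pool
for pool in .rgw.root .s3.rgw.root .s3-west.rgw.root; do
    echo "ceph osd pool create ${pool}.backup 8 8 replicated"
    echo "rados cppool ${pool} ${pool}.backup"
done
```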
- stop all Rados GW instances on all nodes
- delete the pools .rgw.root, .s3.rgw.root and .s3-west.rgw.root (make sure you have a valid backup from step 3 before dropping the pools):
# ceph osd pool delete <pool-name> <pool-name> --yes-i-really-really-mean-it
- add rgw_zonegroup_root_pool = .s3.rgw.root to your ceph.conf configuration. If you want the realm and the period in the same pool, you can add the rgw_period_root_pool and rgw_realm_root_pool parameters with the same value, too.
- make absolutely sure not to start any Rados GW instance or run any radosgw-admin commands during this fix, other than the ones in the following steps
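With the pool names from this example, the ceph.conf additions from the step above might look like the following sketch (the section to place them in depends on your deployment; [global] is used here as an assumption):

```ini
[global]
rgw_zonegroup_root_pool = .s3.rgw.root
# optional: keep the period and realm objects in the same pool
rgw_period_root_pool = .s3.rgw.root
rgw_realm_root_pool = .s3.rgw.root
```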
- create a new realm:
# radosgw-admin realm create --rgw-realm <name> --default
- apply the modified zone group from step 2)
# radosgw-admin zonegroup set --rgw-zonegroup s3 < s3.json
- make sure that the zonegroup "s3" was created correctly: the id and name fields are set to "s3" and the realm_id is the new realm id from the output in step 8)
- set the "s3" zonegroup as the default zonegroup:
# radosgw-admin zonegroup default --rgw-zonegroup s3
- add the "s3-west" zone from step 1) to the new "s3" zonegroup:
# radosgw-admin zone set --rgw-zone s3-west --rgw-zonegroup s3 < s3-west.json
- update the period with the changes
# radosgw-admin period update --commit
- verify that the configuration is correct
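One way to sketch the verification: dump the current period on the cluster with radosgw-admin period get and check the expected fields. A shortened sample stands in for the real dump below, and the exact JSON structure may differ between releases:

```shell
# On the cluster: radosgw-admin period get > period.json
# Shortened sample standing in for the real period dump
cat > period.json <<'EOF'
{
    "period_map": {
        "zonegroups": [
            { "id": "s3", "name": "s3", "master_zone": "s3-west" }
        ]
    }
}
EOF
grep -q '"id": "s3",' period.json &&
grep -q '"master_zone": "s3-west"' period.json &&
echo "zonegroup s3 looks correct"   # prints: zonegroup s3 looks correct
```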
- start the Rados GW instances
Root Cause
The migration of the region and zone information failed because different pools are being used for rgw_region_root_pool and rgw_zone_root_pool.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.