Upgrading RHCS 5 hosts from RHEL 8 to RHEL 9 removes ceph-common package. Services fail to start.
Environment
- Red Hat Ceph Storage (RHCS) 5
- Red Hat Ceph Storage (RHCS) 6
- Red Hat Ceph Storage (RHCS) 7
- Red Hat Enterprise Linux (RHEL) 8
- Red Hat Enterprise Linux (RHEL) 9
Issue
- Ceph services fail to start automatically after rebooting Ceph nodes upgraded with leapp from RHEL 8 to RHEL 9.
- After the upgrade the /etc/ceph directory is missing and keys need to be regenerated.
Resolution
To get the Ceph services starting again after the upgrade, choose one of the following options:
- Wait until the manager checks the host for new disk drives (this happens periodically, normally every 10 minutes). When this Ceph orchestration task runs and does not find a /var/log/ceph/<fsid> directory, it re-creates the directory with the correct permissions.
- Create the directory manually with the correct permissions set. The <fsid> can be found by executing sudo ceph fsid on the management node.
# sudo mkdir -p /var/log/ceph/<fsid>
# sudo chmod 3770 /var/log/ceph
# sudo chmod 0770 /var/log/ceph/<fsid>
# sudo chown -R ceph:ceph /var/log/ceph
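The manual steps can also be wrapped in a small function so they are applied the same way on every affected host. This is only a sketch: the name restore_ceph_log_dir and its optional second argument are hypothetical, and the ownership change is skipped when the script does not run as root or no ceph user exists.

```shell
#!/usr/bin/env bash
# Sketch only: restore_ceph_log_dir is a hypothetical helper, not part of Ceph.
# It applies the same modes and ownership as the manual commands.

restore_ceph_log_dir() {
    local fsid="$1"
    local log_root="${2:-/var/log/ceph}"   # second argument is an assumption, for testing

    if [ -z "$fsid" ]; then
        echo "usage: restore_ceph_log_dir <fsid> [log_root]" >&2
        return 1
    fi

    mkdir -p "$log_root/$fsid" || return 1
    chmod 3770 "$log_root" || return 1
    chmod 0770 "$log_root/$fsid" || return 1

    # Changing ownership needs root and an existing ceph user; skip otherwise.
    if [ "$(id -u)" -eq 0 ] && id ceph >/dev/null 2>&1; then
        chown -R ceph:ceph "$log_root" || return 1
    else
        echo "skipping chown: not root or no ceph user" >&2
    fi
}

# Example, as root on the upgraded host, with the value printed by
# "ceph fsid" on the management node:
#   restore_ceph_log_dir <fsid>
```

The fsid is passed in as an argument because the ceph command itself may be unavailable on the upgraded host while ceph-common is missing.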
After the directory is in place, install the ceph-common package manually.
ⓘ If the services still have problems starting you might need to reset the systemd failed counter first using systemctl reset-failed <service-name> and then try to start it again.
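As a rough illustration, the reset and the start can be chained so the start is only attempted once the failed state has been cleared. The function name restart_failed_unit and the SYSTEMCTL override (handy for a dry run with echo) are assumptions made for this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: restart_failed_unit is a hypothetical helper.
# SYSTEMCTL can be set to "echo" to print the commands instead of running them.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

restart_failed_unit() {
    local unit="$1"
    if [ -z "$unit" ]; then
        echo "usage: restart_failed_unit <service-name>" >&2
        return 1
    fi
    # Clear the failed state first, then try to start the unit again.
    $SYSTEMCTL reset-failed "$unit" && $SYSTEMCTL start "$unit"
}

# Example (run as root):
#   restart_failed_unit <service-name>
```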
ⓘ Installing only the ceph-common package will not help. The /var/log/ceph directory is created, but the <fsid> directory inside it will still be missing.
To avoid this issue, configure LEAPP before the upgrade so that it does not remove libunwind:
# echo libunwind | sudo tee -a /etc/leapp/transaction/to_keep
NOTE: Despite adding libunwind to the to_keep file in the step above, preupgrade will still report that libunwind will be removed during upgrade. However, if it is included in the to_keep file it will not in fact be removed.
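If this step is scripted across several hosts, it may help to make it idempotent so that repeated runs do not add duplicate entries. The following is a minimal sketch under that assumption; keep_package and its optional file argument are hypothetical names, and the default path is the to_keep file from the command above.

```shell
#!/usr/bin/env bash
# Sketch only: keep_package is a hypothetical helper.
# It appends a package name to the to_keep file unless it is already listed.

keep_package() {
    local pkg="$1"
    local keep_file="${2:-/etc/leapp/transaction/to_keep}"

    if [ -z "$pkg" ]; then
        echo "usage: keep_package <package> [to_keep_file]" >&2
        return 1
    fi

    touch "$keep_file" || return 1

    # Nothing to do when the exact package name is already on its own line.
    if grep -qxF -- "$pkg" "$keep_file"; then
        return 0
    fi

    # Make sure the existing content ends with a newline before appending.
    if [ -s "$keep_file" ] && [ -n "$(tail -c1 "$keep_file")" ]; then
        printf '\n' >> "$keep_file"
    fi
    printf '%s\n' "$pkg" >> "$keep_file"
}

# Example (run as root before the upgrade):
#   keep_package libunwind
```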
Root Cause
The problem occurs because the LEAPP upgrade process removes the libunwind package, which is slated for removal from RHEL 9. The ceph-common package depends on libunwind, so it is uninstalled as well. Removing the ceph-common package also removes the /var/log/ceph directory. When podman then tries to start the Ceph containers, it cannot mount /var/log/ceph/<fsid> into the container and fails with an error.
LEAPP can also remove the ceph-common package when the Red Hat Ceph Tools repository has not been enabled as a custom repository for the LEAPP run. The documentation has been updated to reflect that you need to enable this repository.
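Both symptoms described above can be checked after the upgrade and before rebooting. The sketch below assumes an RPM-based host; the function names log_dir_present and ceph_common_installed are hypothetical, and the fsid has to be supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical checks for the two conditions named in the root cause.

# Succeeds when the per-cluster log directory that podman mounts exists.
log_dir_present() {
    local fsid="$1"
    local log_root="${2:-/var/log/ceph}"   # second argument is an assumption, for testing
    [ -n "$fsid" ] && [ -d "$log_root/$fsid" ]
}

# Succeeds when the ceph-common package is still installed.
ceph_common_installed() {
    command -v rpm >/dev/null 2>&1 && rpm -q ceph-common >/dev/null 2>&1
}

# Example:
#   log_dir_present <fsid>   || echo "/var/log/ceph/<fsid> is missing"
#   ceph_common_installed    || echo "ceph-common is not installed"
```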
Artifacts
| Product/Version | Related BZ/Jira | Errata | Fixed Version |
|---|---|---|---|
| RHCS/7 | Bugzilla 2263195 | Errata TBD | 7.1z4 - 7.1.4 |
| RHEL/8-9 | Jira RHEL-34526 | Errata TBD | TBD |