HAProxy does not restart during RHOSP upgrade when RHCS is director-deployed and Alertmanager is enabled
Environment
- Red Hat OpenStack Platform (RHOSP) 17.1
- Red Hat Ceph Storage (RHCS) 5
- Red Hat Ceph Storage (RHCS) 6
- Red Hat Ceph Storage (RHCS) 7
- Red Hat Ceph Storage (RHCS) 8
Issue
- If you deployed Alertmanager in a director-deployed Red Hat Ceph Storage environment, the upgrade from Red Hat Ceph Storage version 4 to version 5 fails. The failure occurs because HAProxy does not restart after you run the following command to configure cephadm on the Red Hat Ceph Storage nodes:

      $ openstack overcloud external-upgrade run \
          --skip-tags ceph_ansible_remote_tmp \
          --stack <stack> \
          --tags cephadm_adopt 2>&1

  After you run the command, the Red Hat Ceph Storage cluster status is HEALTH_WARN.
Resolution
Apply the following workaround to address the issue.
1. Log in to a Controller node and create the following alertmanager_spec file:

       $ cat <<EOF > alertmanager_spec
       ---
       service_type: alertmanager
       service_id: alertmanager
       service_name: alertmanager
       placement:
         count: 3
         label: monitoring
       networks:
       - {storage_network}
       EOF

   Replace {storage_network} with the subnet assigned to the storage network.

2. As the root user on the same Controller node, start the cephadm shell with the spec file mounted. Then remove the adopted Alertmanager daemons and apply the spec created in step 1:

       $ cephadm shell -m alertmanager_spec
       $ ceph orch rm alertmanager
       $ ceph orch apply -i /mnt/alertmanager_spec

3. Verify that the Alertmanager instances are running by using ceph orch ps, and then exit the cephadm shell:

       $ ceph orch ps

4. Finally, restart the HAProxy bundle:

       pcs resource enable haproxy-bundle

See also Reinitializing Ceph Alertmanager and/or RGW - Upgrading Ceph
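
The spec file from step 1 can also be generated from a variable, which avoids editing the placeholder by hand. The following is a minimal sketch and not part of the documented workaround. The function name render_alertmanager_spec is hypothetical, and the layout of the spec (count and label nested under placement) is assumed from the Ceph orchestrator service specification format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Alertmanager service spec for a given
# storage subnet. The function name is illustrative only.
render_alertmanager_spec() {
    local storage_network="$1"
    if [ -z "$storage_network" ]; then
        echo "usage: render_alertmanager_spec <storage_subnet_cidr>" >&2
        return 1
    fi
    # The heredoc body is the same spec as in step 1, with the placeholder
    # replaced by the subnet passed as the first argument.
    cat <<EOF
---
service_type: alertmanager
service_id: alertmanager
service_name: alertmanager
placement:
  count: 3
  label: monitoring
networks:
- ${storage_network}
EOF
}
```

Redirect the output to alertmanager_spec on the Controller node, passing the subnet of your own storage network as the argument, and then continue with step 2.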
Root Cause
The HAProxy bundle cannot start through Pacemaker because a failure occurs when it attempts to bind to the Alertmanager port (9093). The failure occurs because Alertmanager has not been redeployed on the storage network.
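
One way to look for the conflict on a Controller node is to list what already listens on port 9093. The sketch below is an illustration only. It assumes that the bind failure shows up as an existing listener on a wildcard address (*, 0.0.0.0, or [::]), which this article does not state explicitly. The function name listeners_on_port is hypothetical, and it only filters the output of ss -Htln read from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ss -Htln` output on stdin and print the local
# address of every listener on the given port, marking wildcard binds.
listeners_on_port() {
    local port="$1"
    awk -v port="$port" '
        {
            addr = $4                      # Local Address:Port column
            n = split(addr, parts, ":")
            if (parts[n] != port) next     # keep only the requested port
            host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
            if (host == "*" || host == "0.0.0.0" || host == "[::]")
                print addr " wildcard"
            else
                print addr " specific"
        }'
}

# Usage on a Controller node (9093 is the Alertmanager port named above):
#   ss -Htln | listeners_on_port 9093
```

A line marked wildcard means that a process is listening on port 9093 on all interfaces, which under the assumption above would prevent HAProxy from binding the same port on the VIP.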
See also Bugzilla 7041333: Cephadm attempts to bind Grafana to all (::) interfaces when IPv6 networks list is provided.
Artifacts
| Product/Version | Related BZ/Jira | Errata | Fixed Version |
|---|---|---|---|
| RHCS/8 | Bugzilla 2274719 | Errata RHSA-2025:9775 | 8.1 - 8.1 |
| RHCS/8 | Bugzilla 2356569 | Errata TBD | 8.0z4 - 8.0.4 |
| RHCS/7 | Bugzilla 2356570 | Errata TBD | 7.1z7 - 7.1.7 |
| RHCS/6 | Bugzilla 2356571 | Errata TBD | 6.1z10 - 6.1.10 |
| RHCS/8 | Bugzilla 2356355 | Errata RHSA-2025:9775 | 8.1 - 8.1 |
| RHCS/8 | Bugzilla 2356550 | Errata TBD | 8.0z4 - 8.0.4 |
| RHCS/7 | Bugzilla 2356551 | Errata TBD | 7.1z7 - 7.1.7 |
| RHCS/6 | Bugzilla 2356553 | Errata TBD | 6.1z10 - 6.1.10 |
| RHCS/5 | Bugzilla 2356354 | Errata TBD | 5.3z9 - 5.3.9 |
| RHCS/5 | Bugzilla 2269009 | Errata RHBA-2025:1478 | 5.3z8 - 5.3.8 |
| RHCS/5 | Bugzilla 2224351 | Errata RHBA-2023:4760 | 5.3z5 - 5.3.5 |
| RHOSP/17 | Bugzilla 2229931 | Errata RHBA-2024:0209 | 17.1 |
Diagnostic Steps
- In /var/log/containers/stdouts/haproxy-bundle.log, HAProxy fails to start while trying to bind {vip}:{Alertmanager_port}:

      2024-03-11T11:41:23.975991313+01:00 stderr F [ALERT] 070/114123 (7) : Starting proxy ceph_alertmanager: cannot bind socket [192.168.3.213:9093]

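
To pull every such failure out of the log at once, the check above can be sketched as a small filter. This is an illustration only: the function name haproxy_bind_failures is hypothetical, and the pattern assumes that the alert lines have the same form as the example shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<proxy> <address:port>" for each HAProxy
# "cannot bind socket" alert found in the given log file.
haproxy_bind_failures() {
    local log_file="$1"
    # Capture the proxy name and the bracketed socket address from each alert.
    sed -n 's/.*Starting proxy \([^:]*\): cannot bind socket \[\([^]]*\)\].*/\1 \2/p' "$log_file"
}

# Usage on a Controller node:
#   haproxy_bind_failures /var/log/containers/stdouts/haproxy-bundle.log
```

A result that names ceph_alertmanager together with the VIP and port 9093 matches the failure described in this article.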
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.