Procedure to upgrade a RHEL 8 High Availability cluster to RHEL 9

Introduction

This document describes how to perform a rolling upgrade of a RHEL 8.8 High Availability cluster (or later) to RHEL 9.2 (or later). In a rolling upgrade, each cluster node is temporarily removed from the running cluster, its operating system is upgraded, and the node is then put back into the cluster. As long as no constraints require a resource to run only on a specific node, resources remain running throughout.

This document will not describe how to perform the upgrade of the operating system itself. That is covered in the RHEL documentation: Upgrading from RHEL 8 to RHEL 9.

Environment

  • Red Hat Enterprise Linux Server 8.8 (with the High Availability Add On) or later

Limitations

  • This procedure covers only a rolling upgrade from a RHEL 8.8 High Availability cluster (or later) to a RHEL 9.2 High Availability cluster (or later).
  • This procedure is currently NOT supported if the cluster nodes are using any packages that are part of Resilient Storage. Support for upgrading cluster nodes with Resilient Storage packages installed will be added at a future date.
  • The services managed by the cluster may have changed between RHEL 8 and RHEL 9 and need their own upgrade procedure performed. This could involve anything from configuration file changes to database layout changes. Service-specific upgrades are beyond the scope of this document.
  • All applications running on top of a cluster must be tested and validated for RHEL 9 before a production cluster is upgraded.
  • The procedure upgrades only the cluster itself; applications running on top of it may not behave the same on RHEL 9 as they did on RHEL 8.
  • The procedure outlined below is for cluster (member) nodes. Remote nodes will have to be upgraded after cluster (member) nodes are upgraded.

NOTE: If using AWS images, update the leapp RHUI client package before starting the migration so that a leapp upgrade is possible. Update to leapp-rhui-aws-ha-1.0.18-1.el8.noarch, which is available in the rhui-client-config-server-8-ha repository. Once installed, the package allows upgrades using the leapp utility.
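
The check can be scripted. The following is a minimal sketch (the required version and the `version_ge` helper are illustrative, not part of any Red Hat tooling):

```shell
# Sketch: check that the AWS leapp client is at least the required version.
required="1.0.18"
installed="$(rpm -q --qf '%{VERSION}' leapp-rhui-aws-ha 2>/dev/null)" || installed="0"

# version_ge A B: return 0 if version A >= version B (sort -V orders lowest first)
version_ge() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

if version_ge "$installed" "$required"; then
  echo "leapp-rhui-aws-ha $installed is new enough"
elif command -v dnf >/dev/null 2>&1; then
  dnf -y update leapp-rhui-aws-ha   # available in rhui-client-config-server-8-ha
fi
```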

You may also wish to familiarize yourself with the upstream documentation on upgrading a Pacemaker cluster: Upgrading a Pacemaker Cluster — Pacemaker Administration

Procedure to upgrade from RHEL 8 to RHEL 9

  1. Upgrade the CIB on the running cluster. The following command can be performed on any cluster node:

    # pcs cluster cib-upgrade
    

    This will convert the CIB to the latest version supported by the installed RHEL 8 version of pacemaker. Certain clusters may be running the most recent version of the cluster software, but still using an older version of the CIB syntax. Upgrading the CIB will make it more likely that upgrading the system itself will succeed.
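
    The schema version in use can be inspected before and after running cib-upgrade. A minimal sketch (the `cib_schema` helper is illustrative; the schema name is stored in the validate-with attribute of the <cib> element):

```shell
# Sketch: report which CIB schema version the cluster currently uses.
cib_schema() {
  printf '%s\n' "$1" | sed -n 's/.*validate-with="\([^"]*\)".*/\1/p' | head -n1
}

# pcs cluster cib prints the raw CIB XML; guard so the sketch is inert
# on machines without pcs installed.
if command -v pcs >/dev/null 2>&1; then
  echo "CIB schema in use: $(cib_schema "$(pcs cluster cib)")"
fi
```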

  2. Choose a single node to begin the upgrade. To help reduce downtime, we would recommend choosing a cluster node with the fewest number of services or a cluster node that is a passive cluster node for promotable pacemaker managed resources. If any preparations need to be made before stopping or moving the resources or software running on that node, carry out those steps now.

    Put the first node into standby mode:

    # pcs node standby <node name>
    

    Standby mode prevents the cluster from starting any new resources on this node. Relocating the existing resources could take some time, depending on the resources involved; monitor the status with pcs status until the resources have been relocated.
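
    The wait can be automated. A minimal sketch, assuming a node name of node1 and the usual "Started <node>" wording in pcs status output (which varies slightly between pcs versions):

```shell
# Sketch: put a node into standby and wait until its resources have drained.
# NODE is an assumption; substitute the real node name.
NODE="${NODE:-node1}"

# Return 0 if the given `pcs status resources` output still shows a
# resource started on $NODE.
node_has_resources() {
  printf '%s\n' "$1" | grep -Eq "Started[[:space:]]+$NODE"
}

if command -v pcs >/dev/null 2>&1; then
  pcs node standby "$NODE"
  while node_has_resources "$(pcs status resources)"; do
    sleep 10
  done
  echo "all resources moved off $NODE"
fi
```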

  3. Before stopping the cluster node, it is recommended to disable the pacemaker systemd service from starting at boot. This prevents pacemaker from starting after the upgrade until you have verified that all components managed by the cluster still work as expected.

    The cluster stack can be disabled from starting on boot on this chosen node with:

    # pcs cluster disable <node name>
    

    Only re-enable pacemaker to start at boot after verifying that the cluster is still able to manage all of the cluster-managed resources without issues.
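
    You can confirm the result by checking the boot state of both cluster daemons. A minimal sketch (the `check_disabled` helper is illustrative):

```shell
# Sketch: verify the cluster stack will not start at boot on this node.
# check_disabled returns 0 when a unit's reported state is "disabled".
check_disabled() {
  [ "$1" = "disabled" ]
}

for svc in corosync pacemaker; do
  # systemctl is-enabled prints the state but exits non-zero for
  # "disabled", so capture the output regardless of the exit status.
  state=$(systemctl is-enabled "$svc" 2>/dev/null) || true
  [ -n "$state" ] || state=unknown
  if check_disabled "$state"; then
    echo "$svc will not start at boot"
  else
    echo "WARNING: $svc reports '$state'"
  fi
done
```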

  4. On the cluster node that will be updated, run the following command to stop the cluster software:

    # pcs cluster stop
    
  5. Update the operating system on the first node, following the steps in the RHEL upgrade document. This could take a long time, requiring running the leapp preupgrade steps to inspect the system, fixing any problems discovered, downloading and upgrading many packages, relabeling the filesystem for SELinux, and several reboots. In particular, dealing with the problems found by leapp could require a lot of manual intervention. Because this process will have to be repeated on every cluster node, it is recommended that you take detailed notes including complete command lines executed.
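
    The core of that flow looks roughly like the following; this is only a sketch to orient you, and the RHEL upgrade guide is authoritative. It is guarded so it does nothing on systems without leapp installed:

```shell
# Sketch of the leapp flow on the node; run the steps interactively, one
# at a time, and consult the RHEL upgrade guide for the full procedure.
if command -v leapp >/dev/null 2>&1; then
  leapp preupgrade
  # Review /var/log/leapp/leapp-report.txt and resolve every inhibitor, then:
  leapp upgrade
  # The upgrade itself completes across several reboots:
  systemctl reboot
fi
```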

  6. After the first node is updated, perform the post-upgrade tasks to clean up the system, as described in section 7.1 of the RHEL upgrade document.

  7. Perform any service-specific upgrades required by your pacemaker managed resources or other services not managed by pacemaker.

  8. After a cluster node has been updated, manually verify that all cluster-managed components on that node still work as expected. This avoids triggering unnecessary failures once the cluster stack is started again and the node rejoins the cluster as a member managing cluster resources.

  9. Start the cluster software on the first cluster node:

    # pcs cluster start
    

    At this point, the node is part of the cluster again and cluster commands can be run, but it is still in standby mode, so no resources will run on it.
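
    This state can be confirmed from the node list. A minimal sketch, assuming a node name of node1 and the usual "Standby:" wording in pcs status output:

```shell
# Sketch: confirm the node has rejoined the cluster but is still in standby.
# NODE is an assumption; substitute the real node name.
NODE="${NODE:-node1}"

# Return 0 if the node status output lists $NODE as standby.
node_in_standby() {
  printf '%s\n' "$1" | grep -Eiq "standby.*$NODE|$NODE.*standby"
}

if command -v pcs >/dev/null 2>&1; then
  if node_in_standby "$(pcs status nodes)"; then
    echo "$NODE is back in the cluster and still in standby"
  fi
fi
```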

  10. Take the first cluster node out of standby mode:

    # pcs node unstandby <node name>
    

    Resources may move back to the node as required to satisfy load balancing, colocation, or other constraints.

  11. Repeat steps 2-10 above on all other cluster nodes, one at a time. Note that you only need to run step 1 cib-upgrade before upgrading the first cluster node (and again after all the cluster nodes are complete - see step 12 below). It does not need to be run on every cluster node.

  12. After all other cluster nodes have been upgraded, upgrade the CIB on the running cluster again:

    # pcs cluster cib-upgrade
    

    This will convert the CIB from the latest version supported by the RHEL 8 version of pacemaker to the latest version supported by the RHEL 9 version of pacemaker. Doing so enables the latest CIB features.

  13. After verifying that all pacemaker-managed cluster resources are able to run on the upgraded cluster nodes, enable pacemaker to start at boot on all cluster nodes:

    # pcs cluster enable --all
    