OpenShift Data Foundation Upgrade Pre-Checks

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Red Hat OpenShift Data Foundation (RHODF)
    • 4

Issue

ODF upgrades usually occur after OpenShift Container Platform (OCP) upgrades, as shown in the Red Hat OpenShift Data Foundation Supportability and Interoperability Guide. OCP upgrades can sometimes cause clock skews, Pod Disruption Budget issues, PVCs still in use, and similar problems, or in some instances the OCP upgrade did not finish before the user proceeded to upgrade the ODF operator. Performing these checks prior to the upgrade will decrease the likelihood of the ODF operator failing the upgrade.

Resolution

Prerequisites:

To run Ceph commands, please enable the rook-ceph-tools pod by following the steps outlined in the Configuring the Rook-Ceph Toolbox in OpenShift Data Foundation 4.x solution.

Once the OCP upgrade is complete, confirm the cluster operators have been upgraded successfully. Capture the output of $ oc get co to confirm that no error messages are reported on the cluster operators (also shown in the OCP UI under "Administration" -> "Cluster Settings"). Additionally, after allowing some time to pass once the masters/workers have rebooted and upgraded, proceed to upgrade the ODF operator.
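The `oc get co` output can also be screened mechanically. A minimal sketch, not an official Red Hat script (the `unhealthy_cos` helper name is illustrative), that flags any operator that is not Available, is still Progressing, or is Degraded:

```shell
# Illustrative helper: print cluster operators that are not Available, are
# still Progressing, or are Degraded. Assumes the standard `oc get co`
# column order: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
unhealthy_cos() {
  awk 'NR>1 && ($3 != "True" || $4 == "True" || $5 == "True") {print $1}'
}

# Demo on captured sample output; on a live cluster: oc get co | unhealthy_cos
sample='NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.12.10   True        False         False      26h
image-registry   4.12.10   False       True          True       29h
machine-config   4.12.10   True        False         True       23h'
printf '%s\n' "$sample" | unhealthy_cos
```

An empty result suggests the cluster operators have settled; any name printed warrants investigation before proceeding to the ODF upgrade.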

If there are error messages indicating that a cluster operator is failing the OCP upgrade, identify which cluster operator it is. Match the cluster operator to its project/namespace with $ oc projects, then perform the following:

  1. Switch into the identified project with: $ oc project <project-name>

  2. Run $ oc get pods

  3. Identify pods that are having issues, e.g. Terminating, CrashLoopBackOff, Error, ContainerCreating

  4. Delete the pods with: $ oc delete pod <pod-name>
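The four steps above can be sketched as a single pass. This is a hedged example (the status list and the `bad_pods` helper name are illustrative), demoed against captured `oc get pods` output rather than a live cluster:

```shell
# Illustrative sketch of steps 1-4: list pods in a bad state so they can be
# deleted and recreated by their controllers. Assumes the standard
# `oc get pods` column order: NAME READY STATUS RESTARTS AGE
bad_pods() {
  awk 'NR>1 && $3 ~ /CrashLoopBackOff|Error|Terminating|ContainerCreating|ImagePullBackOff/ {print $1}'
}

# Demo on sample output; on a live cluster, after `oc project <project-name>`:
#   oc get pods | bad_pods | xargs -r -n1 oc delete pod
sample='NAME             READY   STATUS             RESTARTS   AGE
app-healthy-1    1/1     Running            0          3h
app-crashing-2   0/1     CrashLoopBackOff   12         3h
app-stuck-3      0/1     Terminating        0          40m'
printf '%s\n' "$sample" | bad_pods
```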

That process usually fixes the issue; however, keep in mind what has just taken place: the issue preventing OCP from upgrading successfully, such as a Pod Disruption Budget (PDB) error or a PVC still in use, has been resolved, and OCP will continue its upgrade path. Please be patient and allow plenty of time post-OCP upgrade, then proceed to the Solution section of this article to perform the ODF operator pre-checks. If OCP continues to fail the upgrade, please contact the Shift Install Upgrade team.

Solution:
Prior to the ODF upgrade, checking Ceph's health is crucial. It can be done with the following:

  1. Go to the OCP console, navigate to Home -> Overview, and look for the green check mark next to Storage. That usually indicates HEALTH_OK in Ceph.

  2. To double-check Ceph's health (highly recommended), run the following command to ensure Ceph is in HEALTH_OK and all PGs are active+clean:

Ceph Status

$ oc exec -n openshift-storage deployment/rook-ceph-tools -- ceph status

Example Output:

    health: HEALTH_OK   <----------------------------------------- CONFIRM HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 3h) <---------------------- CONFIRM MONs ARE IN QUORUM
    mgr: a(active, since 6h)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-b=up:active} 1 up:standby-replay
    osd: 27 osds: 27 up (since 2h), 27 in (since 111m) <--------  !!!CONFIRM ALL OSDs ARE UP/IN!!!
    rgw: 2 daemons active (ocs.storagecluster.cephobjectstore.a, ocs.storagecluster.cephobjectstore.b)
 
  data:
    pools:   10 pools, 1136 pgs
    objects: 5.50M objects, 3.3 TiB
    usage:   9.9 TiB used, 43 TiB / 53 TiB avail
    pgs:     1136 active+clean   <--------------------------------- CONFIRM ALL PG'S ACTIVE+CLEAN
 
  io:
    client:   93 KiB/s rd, 2.0 MiB/s wr, 5 op/s rd, 29 op/s wr

If the above checks are made and everything appears healthy, the likelihood of upgrading successfully with no issues has increased significantly, and the output in the next step should show the storagecluster yielding a phase of Ready.

  3. Check the storagecluster and ODF "Conditions" prior to the upgrade.

    a. $ oc get storagecluster -n openshift-storage

Validate PHASE: Ready

NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   94m   Ready <--- Verify  2023-04-24T14:30:10Z   4.10.0

If the storagecluster is not in phase Ready, oftentimes it is related to NooBaa/Multicloud Object Gateway (MCG) issues. If so, review the NooBaa Troubleshooting Guide Multicloud Object Gateway (MCG) - OpenShift Data Foundation to diagnose and fix the issue.

    b. Navigate to the OCP Console GUI. Click "Installed Operators" -> "OpenShift Data Foundation," then the "Subscriptions" tab. Scroll down and review the subscriptions to confirm AllCatalogSourcesHealthy.

  4. If Disaster Recovery (DR) / multi-cluster is configured, please review the following solutions:

Upgrade sequence of Regional Disaster Recovery (Regional-DR) cluster in OpenShift Data Foundation

Upgrade sequence of Metro Disaster Recovery (Metro-DR) cluster in OpenShift Data Foundation

OpenShift Data Foundation upgrade in disconnected environment with oc-mirror

Known Issues:

  5. VMware/Local Storage Operator (LSO) users only: Describe your local-pv PVs. If you see a device name such as sda, sdb, sdc, etc. in the Path: spec, open a support case with either the Shift-Storage or ODF team to fix the symlinks before hitting the rook-ceph-osd-X Pod Stuck in CLBO/init after Node Reboot/OCP Upgrade - OpenShift Data Foundation issue during an OCP upgrade. DO NOT ATTEMPT AN OCP UPGRADE OR REBOOT A STORAGE NODE UNTIL THIS IS FIXED.
$ oc describe pv -n openshift-storage local-pv-<name>

Example Output:

Name:              local-pv-<name>
<omitted-for-space>
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /mnt/local-storage/lso-volumeset/sdb <----------------------------- BAD SYMLINK, NEEDS TO BE UUID
Events:    <none>

Or, to validate all of the local-pv resources under the SYMLINK column, run the following command:

$ oc get pv -o 'custom-columns=HOST-NAME:.metadata.labels.kubernetes\.io/hostname,PVCNAME:.spec.claimRef.name,SYMLINK:.spec.local.path,DEV-NAME:.metadata.annotations.storage\.openshift\.com/device-name' | grep -v none
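The SYMLINK column from the command above can be screened automatically for device-name paths. A hedged sketch (the `bad_symlinks` helper name and the sample rows, including the UUID, are made up for illustration):

```shell
# Illustrative filter: flag local-pv symlinks whose last path segment is a
# kernel device name (sda, sdb, ...) instead of a stable UUID. Assumes the
# custom-columns order above: HOST-NAME PVCNAME SYMLINK DEV-NAME
bad_symlinks() {
  awk 'NR>1 && $3 ~ /\/sd[a-z]+$/ {print $1, $3}'
}

# Demo on made-up sample rows (the UUID below is hypothetical)
sample='HOST-NAME   PVCNAME                  SYMLINK                                                                  DEV-NAME
node-1      ocs-deviceset-0-data-0   /mnt/local-storage/lso-volumeset/sdb                                     sdb
node-2      ocs-deviceset-1-data-0   /mnt/local-storage/lso-volumeset/ccd7cbfa-0a3e-4e8f-9c11-1a2b3c4d5e6f   sdb'
printf '%s\n' "$sample" | bad_symlinks
```

Any row printed indicates a device-name-backed PV; per the warning above, open a support case before upgrading or rebooting a storage node.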

  6. Check for a known bug. This bug has been linked to the loss of OSDs provisioned on older clusters:
    NOTE: jq and the rook-ceph-tools pod will need to be available prior to running this command.
$ oc exec -n openshift-storage deployment/rook-ceph-tools -- ceph report  | jq -c '.osd_metadata[] | { OSD: .id, bluestore: .bluestore_min_alloc_size  }'

Good Output, 4K blocks are GOOD:

{"OSD":0,"bluestore":"4096"}
{"OSD":1,"bluestore":"4096"}
{"OSD":2,"bluestore":"4096"}

Bad Output, DO NOT UPGRADE, OPEN A SUPPORT CASE:

{"OSD":0,"bluestore":"65536"}
{"OSD":1,"bluestore":"65536"}
{"OSD":2,"bluestore":"65536"}

In the above "Bad Output," the OSDs were initially provisioned writing in 64K blocks. When upgrading, a transition to 4K blocks will occur, causing a mixture of block sizes (corruption). This can cause the loss of OSDs and, in rare instances, data loss. The fix is to redeploy the OSDs one at a time, waiting for rebalancing between replacements. It is recommended to open a Red Hat Support case to facilitate this process; for more context on this issue, review the ODF OSD goes into CrashLoopBackOff due to the following error: "bluefs enospc" solution.
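The jq output above can also be screened mechanically. A minimal sketch (the `bad_alloc_osds` helper name is illustrative):

```shell
# Illustrative filter: print any OSD whose bluestore_min_alloc_size is not
# 4096 (4K). Reads the one-object-per-line jq output shown above on stdin.
bad_alloc_osds() {
  grep -v '"bluestore":"4096"' || true   # `|| true`: an empty result is the good case
}

sample='{"OSD":0,"bluestore":"4096"}
{"OSD":1,"bluestore":"65536"}
{"OSD":2,"bluestore":"4096"}'
printf '%s\n' "$sample" | bad_alloc_osds
```

Empty output is the good case; any line printed matches the "Bad Output" above, so do not upgrade and open a support case.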

  7. Multicloud Object Gateway (MCG) NooBaa Users:

a. If NooBaa is being used and the contents of NooBaa buckets are crucial/Production, follow the "Backup" procedures derived from one of the following solutions:

Perform a One-Time Backup of the noobaa-db-pg-0 Database for the Multicloud Object Gateway (NooBaa) for ODF v4.18 and Below

Perform a One-Time Backup of the Database for the Multicloud Object Gateway (NooBaa) in ODF v4.19+

b. ODF v4.14 -> ODF v4.15 upgrades: During this particular upgrade, NooBaa upgrades the database from postgresql-12 to postgresql-15. When the upgrade finishes, run the following command to confirm the database upgrade succeeded.

Indicates a Failure to Upgrade:

$ oc get noobaa -n openshift-storage -o yaml

status:
  conditions:
<omitted-for-space>
    message: Noobaa is using Postgresql-12 which indicates a failure to upgrade to
<omitted-for-space>

The above message confirms that the database failed to upgrade, and the Recovery NooBaa's PostgreSQL upgrade failure in OpenShift Data Foundation 4.15+ solution will need to be performed. For larger databases, the user will likely need to expand the database PVC prior to the upgrade. Please review the Expanding the db-noobaa-db-pg-0 PVC - OpenShift Data Foundation (ODF) v4.18 and Below solution.
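The condition message can also be checked from a script. A hedged sketch (the `pg_upgrade_failed` helper name is illustrative):

```shell
# Illustrative check: succeed if the NooBaa CR status on stdin contains the
# Postgresql-12 failure message quoted above.
pg_upgrade_failed() {
  grep -q 'using Postgresql-12'
}

# Demo on a captured snippet; on a live cluster:
#   oc get noobaa -n openshift-storage -o yaml | pg_upgrade_failed
snippet='    message: Noobaa is using Postgresql-12 which indicates a failure to upgrade to'
if printf '%s\n' "$snippet" | pg_upgrade_failed; then
  echo "database upgrade failed: follow the recovery solution"
fi
```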

c. ODF v4.18 -> ODF v4.19 upgrades: Please review step 7(a) and perform a backup prior to the upgrade. During this upgrade, an architectural change occurs from the single noobaa-db-pg-0 pod to a High Availability (HA) pair of noobaa-db-pg-cluster-X pods (primary/secondary). At this time, only minor issues have been recorded regarding this upgrade; however, please review the following article prior to upgrading to v4.19:

Standalone MCG Deployed via Quay Documentation Fails ODF v4.19 Upgrade

  8. If ODF Internal mode with Multus is configured
  • To check if ODF is using "multus":

    # oc get storagecluster -o yaml | grep -A1 "network:"
      network:
        provider: multus  <<<----
    
  • 8.1 Before the ODF upgrade to v4.16, make sure each worker node has a "shim" interface on the ocs-public Multus network (a shim NNCP definition for each worker node plus a route on the NAD "ocs-public"), as stated in 8.2.1. Multus prerequisites.
    NOTE: If this step is not completed, pods will fail to connect to the OSD and MON pods, with errors like the following in the journal logs or dmesg on the OCP node where the application pods trying to access Ceph PVs are running:

    example with ocs-public  ==  172.16.8.0/24
     
    kernel: libceph: mon2 (2)172.30.8.110:3300 connect error
    kernel: libceph: ceph_tcp_connect failed: -101
    kernel: libceph: connect (2)172.16.8.126:6800 error -101
    kernel: libceph: ceph_tcp_connect failed: -101
    kernel: libceph: osd1 (2)172.16.8.126:6800 connect error
    
  • 8.2 After the ODF cluster (with Multus enabled) is successfully upgraded to v4.16, administrators must disable holder pods (deprecated since 4.16) by following: OpenShift Data Foundation - Disabling Multus "holder" pods
    NOTE 1: If ODF was upgraded to ODF 4.17 and the holder pods were not yet disabled, the same procedure applies.
    NOTE 2: In any case, holder pods must be disabled before the ODF 4.18 upgrade.

  9. ODF upgrades containing Rook v1.15+: In ODF versions bundled with Rook v1.15 or higher, a change was introduced in the ClusterController logic that strictly enforces nodeAffinity and podAntiAffinity, whereas previous versions were more permissive of placement violations. It has been recorded that the rook-ceph-mon pods can be affected in two different scenarios (see below):

a. If the cluster has experienced a change in node architecture, such as storage node replacement, label changes, role changes, taints/tolerations, etc., ensure the rook-ceph-mon pod placement matches the hosts/IPs in the rook-ceph-mon-endpoints configmap. The NODE column in the pod -o wide output should match the data/mapping in the rook-ceph-mon-endpoints configmap:

NOTE: For PVC-backed monitors, mapping will reflect null.

$ oc get pods -n openshift-storage -l app=rook-ceph-mon -o wide

$ oc get cm -n openshift-storage rook-ceph-mon-endpoints -o yaml
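The two outputs can be compared mechanically. A sketch under stated assumptions: host-backed mons (so Name is populated in the mapping rather than null), the standard `-o wide` column order where NODE is the seventh column, and an illustrative `mapping_nodes` helper name:

```shell
# Illustrative helper: extract the sorted node names from the configmap's
# data.mapping JSON, e.g. {"node":{"a":{"Name":"worker-1",...},...}}
mapping_nodes() {
  tr ',' '\n' | sed -n 's/.*"Name":"\([^"]*\)".*/\1/p' | sort
}

# Demo on a made-up mapping; on a live cluster, compare the two lists:
#   oc get cm -n openshift-storage rook-ceph-mon-endpoints -o yaml | mapping_nodes
#   oc get pods -n openshift-storage -l app=rook-ceph-mon -o wide | awk 'NR>1 {print $7}' | sort
mapping='{"node":{"a":{"Name":"worker-1","Hostname":"worker-1"},"b":{"Name":"worker-2","Hostname":"worker-2"}}}'
printf '%s\n' "$mapping" | mapping_nodes
```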

If the above [step 9a] outputs reflect an inconsistency between the mon placement and the rook-ceph-mon-endpoints configmap mapping/data values, please execute the How to Redeploy rook-ceph-mon Pods in OpenShift Data Foundation (ODF) solution. Once the solution is complete, revalidate this step.

b. A missing topologyKey for the rook-ceph-mon podAntiAffinity can cause the mons to be impacted.

$ oc get cephcluster -n openshift-storage ocs-storagecluster-cephcluster -o yaml | grep -A8 podAntiAffinity
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - rook-ceph-mon
          topologyKey: "" <----------- DO NOT UPGRADE, FIX FIRST!!!

If the above [step 9b] indicates a missing topologyKey for the rook-ceph-mon podAntiAffinity, please execute the topologyKey: "" Missing for the rook-ceph-mon podAntiAffinity in the ocs-cephcluster CR - OpenShift Data Foundation solution. Once the solution is complete, revalidate this step.
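This particular check is easy to script. A minimal sketch (the `missing_topology_key` helper name is illustrative, and the grep is a crude match on any empty topologyKey in the YAML):

```shell
# Illustrative check: succeed if the cephcluster YAML on stdin contains an
# empty topologyKey, as in the bad output shown above.
missing_topology_key() {
  grep -q 'topologyKey: ""'
}

# Demo on a captured snippet; on a live cluster:
#   oc get cephcluster -n openshift-storage ocs-storagecluster-cephcluster -o yaml | missing_topology_key
snippet='          topologyKey: ""'
if printf '%s\n' "$snippet" | missing_topology_key; then
  echo "DO NOT UPGRADE: fix the topologyKey first"
fi
```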

If there are any questions/concerns regarding the above, please open a case with Red Hat Support to address any concerns prior to upgrading ODF.

Root Cause

Since ODF upgrades generally occur following OCP upgrades, in some instances the OCP upgrade has caused clock skews, has failed to complete, or is still in progress, e.g. hung on a failed cluster operator.

Diagnostic Steps

$ ceph time-sync-status
{
    "time_skew_status": {
        "z": {
            "skew": 0,
            "latency": 0,
            "health": "HEALTH_OK"
        },
        "ac": {
            "skew": 0,
            "latency": 0.00051476226577585975,
            "health": "HEALTH_OK"
        },
        "ad": {
            "skew": -5.8758086181640636e-05, <-------------- skew
            "latency": 0.00032825447724227754,
            "health": "HEALTH_OK"
        }
    },
    "timechecks": {
        "epoch": 2042,
        "round": 286,
        "round_status": "finished"
    }
}


$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.12.10   True        False         False      26h
baremetal                                  4.12.10   True        False         False      325d
cloud-controller-manager                   4.12.10   True        False         False      325d
cloud-credential                           4.12.10   True        False         False      325d
cluster-autoscaler                         4.12.10   True        False         False      325d
config-operator                            4.12.10   True        False         False      325d
console                                    4.12.10   True        False         False      40d
control-plane-machine-set                  4.12.10   True        False         False      2d1h
csi-snapshot-controller                    4.12.10   True        False         False      154d
dns                                        4.12.10   True        True          False      325d    DNS "default" reports Progressing=True: "Have 8 available DNS pods, want 9."
etcd                                       4.12.10   True        False         False      325d
image-registry                             4.12.10   False       True          True       29h     NodeCADaemonAvailable: The daemon set node-ca has available replicas...
ingress                                    4.12.10   True        False         False      26h
insights                                   4.12.10   True        False         False      2d1h
kube-apiserver                             4.12.10   True        False         False      325d
kube-controller-manager                    4.12.10   True        False         False      325d
kube-scheduler                             4.12.10   True        False         False      325d
kube-storage-version-migrator              4.12.10   True        False         False      29h
machine-api                                4.12.10   True        False         False      325d
machine-approver                           4.12.10   True        False         False      325d
machine-config                             4.12.10   True        False         True       23h     Failed to resync 4.12.10 because: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool worker is not ready, retrying. Status: (pool degraded: true total: 6, ready 1, updated: 1, unavailable: 1)]
marketplace                                4.12.10   True        False         False      325d
monitoring                                 4.12.10   True        False         False      30h
network                                    4.12.10   True        True          True       325d    DaemonSet "/openshift-multus/multus-additional-cni-plugins" rollout is not making progress - last change 2023-04-13T23:37:18Z
node-tuning                                4.12.10   True        False         False      2d
openshift-apiserver                        4.12.10   True        False         False      60d
openshift-controller-manager               4.12.10   True        False         False      2d
openshift-samples                          4.12.10   True        False         False      2d
operator-lifecycle-manager                 4.12.10   True        False         False      325d
operator-lifecycle-manager-catalog         4.12.10   True        False         False      325d
operator-lifecycle-manager-packageserver   4.12.10   True        False         False      239d
service-ca                                 4.12.10   True        False         False      325d
storage                                    4.12.10   True        False         False      325d

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.