Expanding the db-noobaa-db-pg-0 PVC - OpenShift Data Foundation (ODF) v4.19+

Solution Verified - Updated

Environment

Red Hat OpenShift Container Platform (RHOCP) v4.x
Red Hat OpenShift Data Foundation (ODF) v4.19+
Red Hat Quay (RHQ) v3.x

Issue

The noobaa-db-pg-cluster-<1|2> PVC has become full, preventing the PostgreSQL server from starting.

Related Articles:
Expanding the db-noobaa-db-pg-0 PVC - OpenShift Data Foundation (ODF) v4.18 and Below
Change the Multi-Cloud Object Gateway Database's Collation Locale to C
How to Check the Size/Consumption of the PostgreSQL Database in the db-noobaa-db-pg-0 PVC

Resolution

Before starting, ensure adequate space is available in the ocs-storagecluster-cephblockpool (via ceph df) and that Ceph is reporting HEALTH_OK with all PGs active+clean (via ceph status).

NOTE: If Ceph is NOT reporting HEALTH_OK, or not all PGs are reporting active+clean, please open a support case for further investigation.

$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph status -c /var/lib/rook/openshift-storage/openshift-storage.config

    health: HEALTH_OK            <---------------------- HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 41m)
    mgr: a(active, since 41m)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 41m), 3 in (since 41m)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 92 objects, 138 MiB
    usage:   277 MiB used, 300 GiB / 300 GiB avail
    pgs:     97 active+clean      <---------------------- active+clean
 
  io:
    client:   1.2 KiB/s rd, 9.0 KiB/s wr, 2 op/s rd, 1 op/s wr


$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph df -c /var/lib/rook/openshift-storage/openshift-storage.config

--- RAW STORAGE ---
CLASS   SIZE    AVAIL     USED  RAW USED  %RAW USED
ssd    6 TiB  6.0 TiB  3.6 GiB   3.6 GiB       0.06
TOTAL  6 TiB  6.0 TiB  3.6 GiB   3.6 GiB       0.06
--- POOLS ---
POOL                                        ID  PGS   STORED  OBJECTS      USED  %USED  MAX AVAIL
ocs-storagecluster-cephblockpool             1   32  340 MiB      177  1021 MiB   0.02    1.7 TiB <----- cephblockpool

Procedures:

  1. If this is a standalone NooBaa deployment where the noobaa-db-pg-cluster-<1|2> PVCs are NOT backed by the storageclass ocs-storagecluster-ceph-rbd, validate that volume expansion is supported. For example:
$ oc get sc <storageclass-name> -o yaml | grep -i expansion
allowVolumeExpansion: true <-----
  2. Make note of the noobaa-db-pg-cluster-1 and noobaa-db-pg-cluster-2 PVC capacities:
$ oc -n openshift-storage get clusters.postgresql.cnpg.noobaa.io noobaa-db-pg-cluster -o jsonpath={.spec.storage.size}
50Gi

$ oc get pvc  -n openshift-storage
NAME                     STATUS   VOLUME                                     CAPACITY
noobaa-db-pg-cluster-1   Bound    pvc-6191c66e-18fa-4792-8ccd-4838ded0fd03   50Gi
noobaa-db-pg-cluster-2   Bound    pvc-41af5072-163a-4dca-8827-ba574103209a   50Gi
  3. Scale down the NooBaa services:
oc -n openshift-storage scale deployment noobaa-operator noobaa-endpoint --replicas=0
oc -n openshift-storage scale sts noobaa-core --replicas=0

Example: Expanding volume from 50Gi to 100Gi.

WARNING: ONLY VOLUME EXPANSION IS ALLOWED. ROLLING BACK TO A SMALLER VOLUME SIZE IS NOT SUPPORTED. TAKE EXTRA PRECAUTION TO ENSURE THE DESIRED SIZE IS CORRECT.
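Because a shrink cannot be rolled back, it can help to compare the new size against the current size before patching. A minimal shell sketch (the Gi values are illustrative; read CURRENT_GI from the PVC first):

```shell
# Guard against an accidental shrink before patching.
# CURRENT_GI and NEW_GI are illustrative values, not read from the cluster.
CURRENT_GI=50
NEW_GI=100
if [ "$NEW_GI" -le "$CURRENT_GI" ]; then
  echo "Refusing: new size (${NEW_GI}Gi) must be larger than current (${CURRENT_GI}Gi)" >&2
  exit 1
fi
echo "OK to expand from ${CURRENT_GI}Gi to ${NEW_GI}Gi"
```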

  4. After the NooBaa service pods have terminated (this takes time; do not force delete them), patch the storagecluster using the command below and update the storage size to the desired size. The example below uses 100Gi.
$ oc patch -n openshift-storage storagecluster ocs-storagecluster --type merge --patch '{"spec": {"resources": {"noobaa-db-vol":{"requests":{"storage":"100Gi"}}}}}'

Validate:

  5. For the Ceph RBD storageclass, the PVCs should automatically resize after ~3 minutes. Once confirmed, no further action is needed.
$ oc get storagecluster -n openshift-storage -o yaml | grep -A2 noobaa-db-vol
      noobaa-db-vol:
        requests:
          storage: 100Gi

$ oc get pvc -n openshift-storage
NAME                     STATUS   VOLUME                                     CAPACITY
noobaa-db-pg-cluster-1   Bound    pvc-6191c66e-18fa-4792-8ccd-4838ded0fd03   100Gi
noobaa-db-pg-cluster-2   Bound    pvc-41af5072-163a-4dca-8827-ba574103209a   100Gi

NOTE: In most instances the PVC resize will be successful. However, if the resize appears to be unsuccessful, inspect the PVC; the condition below may appear, indicating that a pod restart is needed to finish the filesystem resize:

$ oc get pvc/noobaa-db-pg-cluster-<1|2> -n openshift-storage -o yaml
...
    lastTransitionTime: "2025-05-30T20:24:05Z"
    message: Waiting for user to (re-)start a pod to finish file system resize of volume on node.
  6. If the NooBaa PVC expansion does not occur after some time, please restart the Cloud Native PostgreSQL manager pod:
$ oc delete pod -n openshift-storage -l app.kubernetes.io/name=cloudnative-pg
  7. Scale up the NooBaa services:
oc -n openshift-storage scale deployment noobaa-operator --replicas=1
oc -n openshift-storage scale deployment noobaa-endpoint --replicas=1 <--- Modify to your HPA minimum pod specification
oc -n openshift-storage scale sts noobaa-core --replicas=1
  8. Once all pods have been in a Running state for ~3 minutes, validate that NooBaa is in a Ready phase:
$ oc get noobaa -n openshift-storage
NAME     S3-ENDPOINTS           STS-ENDPOINTS             IMAGE                          PHASE   AGE
noobaa   ["https://<omitted>"]  ["https://<omitted>"]     registry.redhat.io/<omitted>   Ready   46h


$ oc get backingstore -n openshift-storage
NAME                           TYPE       PHASE             AGE
noobaa-default-backing-store   <omitted>  Ready <----       35h
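Rather than polling manually, oc wait can block until the NooBaa CR reports Ready. A sketch, assuming the default CR name noobaa in the openshift-storage namespace:

```shell
# Block until the NooBaa CR reports the Ready phase (up to 10 minutes).
# The CR name 'noobaa' is the default; adjust if your deployment differs.
oc -n openshift-storage wait noobaa/noobaa \
  --for=jsonpath='{.status.phase}'=Ready --timeout=10m
```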

NOTE: Occasionally, NooBaa may still be in a Connecting phase and/or may not come to a Ready state. If this is observed after the above has been performed, please follow the steps in section 12.1. Restoring the Multicloud Object Gateway of the product documentation. Perform one final restart of the pods in the order shown, which will bring NooBaa back to a Ready phase.

Root Cause

When troubleshooting the noobaa-db resource, the noobaa-db-pg-cluster-<1|2> PVCs may become full, preventing the PostgreSQL server from starting. Expanding the noobaa-db-pg-cluster-1 and noobaa-db-pg-cluster-2 PVCs allows the PostgreSQL server to start again so troubleshooting can be completed.

Diagnostic Steps

Review the pod logs for noobaa-db-pg-cluster-1-<pod-name> and noobaa-db-pg-cluster-2-<pod-name>:

$ oc logs noobaa-db-pg-cluster-1-<pod-name>
waiting for server to start....2022-08-25 19:48:38.185 UTC [22] FATAL:  could not write lock file "postmaster.pid": No space left on device
 stopped waiting
pg_ctl: could not start server
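To confirm the volume is actually out of space, free space can be checked from inside an instance pod (if it is running). A sketch; the mount path is an assumption based on the CloudNativePG default data mount:

```shell
# Check free space on the database volume from inside an instance pod.
# Pod name and mount path are examples and may differ in your cluster.
oc -n openshift-storage exec noobaa-db-pg-cluster-1 -- df -h /var/lib/postgresql/data
```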

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.