Pod csi-addons-controller-manager OOMKilled - OpenShift Data Foundation (ODF)

Solution Verified - Updated

Environment

  • Red Hat OpenShift Data Foundation (RHODF) v4.10+

Issue

In some environments, typically those with a high PVC count, the csi-addons-controller-manager pod managed by the csi-addons-controller-manager CSV is OOMKilled.

If instead the csi-cephfsplugin, csi-rbdplugin, csi-cephfsplugin-provisioner, csi-rbdplugin-provisioner, ocs-operator, ocs-metrics-exporter, odf-console, or odf-operator-controller-manager pods show a high number of restarts, refer to the solution Certain Pods Have High Restarts - OpenShift Data Foundation (ODF).

Resolution

  1. Validate and note the name of the odf-csi-addons-operator subscription.
$ oc get sub -n openshift-storage
  2. Edit the odf-csi-addons-operator subscription with:
$ oc edit sub -n openshift-storage odf-csi-addons-operator-stable-<version>-redhat-operators-openshift-marketplace
  3. Under the config section, increase the memory limit and the memory request to 800Mi:

Example:

$ oc edit sub -n openshift-storage odf-csi-addons-operator-stable-4.18-redhat-operators-openshift-marketplace <--- match name with output from step 1

  config:
    resources:
      limits:
        cpu: "1"
        memory: 512Mi      <---- Increase to 800Mi
      requests:
        cpu: 10m
        memory: 64Mi       <---- Increase to 800Mi
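
For scripted or repeatable changes, the same edit could be applied as a non-interactive merge patch instead of oc edit. This is a minimal sketch and not part of the original procedure: the helper names build_memory_patch and patch_csi_addons_memory are hypothetical, and the patch assumes the memory values sit under spec.config.resources of the subscription, matching the config section shown in the example. The subscription name must still be taken from step 1.

```shell
# Hypothetical helper: build a JSON merge patch that sets both the memory
# limit and the memory request under spec.config.resources.
build_memory_patch() {
  mem="$1"
  printf '{"spec":{"config":{"resources":{"limits":{"memory":"%s"},"requests":{"memory":"%s"}}}}}' "$mem" "$mem"
}

# Hypothetical helper: apply the patch to the named subscription.
# $1 = subscription name as listed by `oc get sub -n openshift-storage`
# $2 = memory value, for example 800Mi
patch_csi_addons_memory() {
  oc patch sub "$1" -n openshift-storage --type merge -p "$(build_memory_patch "$2")"
}
```

With the subscription name from the example, this would be called as patch_csi_addons_memory odf-csi-addons-operator-stable-4.18-redhat-operators-openshift-marketplace 800Mi. A merge patch changes only the keys it names, so the existing cpu values are left as they are.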

Root Cause

The default memory limit of 512Mi can be too low for some installations, typically those with a high PVC count.
See the associated Bugzilla for further information.
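
To see which memory limit is actually in effect, the resources of the manager container can be read from the deployment. This is a minimal sketch: it assumes the deployment is named csi-addons-controller-manager (inferred from the pod name in the diagnostic output) and that the container is named manager; the function name show_manager_resources is hypothetical.

```shell
# Hypothetical helper: print the resources block of the manager container
# from the (assumed) csi-addons-controller-manager deployment.
show_manager_resources() {
  oc get deployment csi-addons-controller-manager -n openshift-storage \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="manager")].resources}'
}
```

Running show_manager_resources before and after changing the subscription gives a quick check of whether the new values have reached the deployment.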

Diagnostic Steps

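  • The csi-addons-controller-manager pod status shows the manager container in CrashLoopBackOff, last terminated with reason OOMKilled: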
    - restartCount: 55
      started: false
      ready: false
      name: manager <-------------------- Container
      state:
        waiting:
          reason: CrashLoopBackOff
          message: >-
            back-off 5m0s restarting failed container=manager
            pod=csi-addons-controller-manager-54594877db-4wm8t_openshift-storage(c0952297-a7dc-4093-ad02-ce1fe3d45b9c)
      imageID: >-
        registry.redhat.io/odf4/odf-csi-addons-rhel8-operator@sha256:8a7dfdfd9e851b0b68481726e9fb2946075d09385782aca5c6e70babc8763234
      image: >-
        registry.redhat.io/odf4/odf-csi-addons-rhel8-operator@sha256:c13cd4dbe18b4888a9be2dc1c94709d33735b4c030119daf50879808a3ab31f0
      lastState:
        terminated:
          exitCode: 137 <------------------------ Exit code for OOMKill 
          reason: OOMKilled <-------------------- OOMKill
          startedAt: '2024-03-22T17:48:59Z'
          finishedAt: '2024-03-22T17:49:26Z'
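
The restart count and last termination reason can also be queried directly rather than read from the full pod YAML. This is a minimal sketch: the function name manager_last_state is hypothetical, and the pod name has to be taken from the output of oc get pods -n openshift-storage.

```shell
# Hypothetical helper: print the restart count and the last termination
# reason of the manager container on one line.
# $1 = pod name, for example csi-addons-controller-manager-54594877db-4wm8t
manager_last_state() {
  oc get pod "$1" -n openshift-storage \
    -o jsonpath='{.status.containerStatuses[?(@.name=="manager")].restartCount}{" "}{.status.containerStatuses[?(@.name=="manager")].lastState.terminated.reason}{"\n"}'
}
```

A reason of OOMKilled together with a growing restart count matches the condition described in this solution.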
