Prometheus pods remain in CrashLoopBackOff

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 3.11
    • 4
  • Prometheus

Issue

  • Prometheus pods cannot start and remain in CrashLoopBackOff.
  • Pod logs show the error "Opening storage failed open /prometheus/01HGTDKP8IYT1ELLOA8BYUYKAQ/meta.json: no such file or directory".
  • Prometheus pods are running out of memory.

Resolution

NFS, or any other storage solution that is not block storage, is not supported for cluster monitoring. Use block storage as the persistence backend for Prometheus.

Root Cause

If cluster monitoring uses NFS or any other storage technology that is not block storage, the time series database can become corrupted.

OpenShift monitoring supports only block storage, as documented for OpenShift 3 and OpenShift 4.
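
To check whether monitoring is affected, one option is to inspect the persistent volumes bound to the monitoring namespace. The following is a minimal sketch, not an official tool. It assumes the PV list was exported with `oc get pv -o json`, that cluster monitoring runs in the `openshift-monitoring` namespace, and that NFS volumes are defined through the `spec.nfs` field of the PersistentVolume. The function name and the sample data are hypothetical. File storage provisioned by other means (for example through a CSI driver) would not be detected by this check.

```python
import json


def nfs_backed_monitoring_pvs(pv_list, namespace="openshift-monitoring"):
    """Return (pv_name, claim_name) pairs for NFS PVs bound to claims in `namespace`.

    `pv_list` is the parsed JSON of a PersistentVolume list.
    """
    matches = []
    for pv in pv_list.get("items", []):
        spec = pv.get("spec", {})
        claim = spec.get("claimRef") or {}
        # Only consider volumes bound to a claim in the monitoring namespace.
        if claim.get("namespace") != namespace:
            continue
        # A PV backed by NFS carries an "nfs" section in its spec.
        if "nfs" in spec:
            matches.append((pv["metadata"]["name"], claim.get("name")))
    return matches


# Hypothetical sample data, shaped like a PersistentVolume list.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "pv-example"},
      "spec": {
        "claimRef": {"namespace": "openshift-monitoring", "name": "claim-example"},
        "nfs": {"server": "nfs.example.com", "path": "/exports/example"}
      }
    }
  ]
}
""")
print(nfs_backed_monitoring_pvs(sample))
```

An empty result does not prove that block storage is in use; it only means that no PV with an `nfs` section is bound in that namespace.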

Diagnostic Steps

  • Prometheus pods cannot start and remain in CrashLoopBackOff.

  • Pod logs show the error "Opening storage failed open /prometheus/01HGTDKP8IYT1ELLOA8BYUYKAQ/meta.json: no such file or directory".

  • The directory /prometheus/01HGTDKP8IYT1ELLOA8BYUYKAQ/ exists but is empty.

  • Cluster monitoring is using NFS storage as its data persistence backend.
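
The check for block directories that lack `meta.json` can be sketched as follows. This is an illustrative helper, not part of Prometheus or OpenShift. It assumes the script can read the Prometheus data directory (for example `/prometheus` inside the pod, or the same volume mounted elsewhere) and that block directories can be recognized by their 26-character names, like the one in the error message. The function name is hypothetical.

```python
from pathlib import Path


def find_blocks_missing_meta(data_dir):
    """Return the names of block-like directories under `data_dir` without meta.json."""
    broken = []
    for entry in sorted(Path(data_dir).iterdir()):
        # Block directories have 26-character names; entries with other
        # names, such as "wal", are skipped.
        if not entry.is_dir() or len(entry.name) != 26:
            continue
        # An empty or partially written block has no meta.json file.
        if not (entry / "meta.json").is_file():
            broken.append(entry.name)
    return broken
```

Calling `find_blocks_missing_meta("/prometheus")` returns the list of affected directory names. The function only reports them and does not modify or delete anything.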
