Prometheus pods remain in CrashLoopBackOff
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 3.11
  - 4
- Prometheus
Issue
- Prometheus pods cannot start and remain in CrashLoopBackOff.
- Pod logs show the error "Opening storage failed open /prometheus/01HGTDKP8IYT1ELLOA8BYUYKAQ/meta.json: no such file or directory".
- Prometheus pods are running out of memory.
Resolution
NFS, or any other storage solution that is not block storage, is not supported for cluster monitoring. Use block storage for Prometheus persistent data.
Root Cause
If cluster monitoring uses NFS or any other storage technology that is not block storage, the time series database can become corrupted.
OpenShift monitoring supports only block storage, as documented for OpenShift 3 and OpenShift 4.
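One way to check whether monitoring volumes are NFS-backed is to inspect the PersistentVolume objects bound to claims in the monitoring namespace. The following is a minimal sketch, not an official tool: it assumes the JSON comes from `oc get pv -o json`, that the monitoring claims live in the `openshift-monitoring` namespace, and that NFS volumes use the in-tree `nfs` volume source. The function name `nfs_backed_claims` is hypothetical. NFS served through a CSI driver, or other non-block file storage, would not be detected by this check.

```python
import json


def nfs_backed_claims(pv_list_json, namespace="openshift-monitoring"):
    """Return (pv_name, claim_name) pairs for NFS-backed PVs bound in `namespace`.

    `pv_list_json` is the text produced by `oc get pv -o json`.
    The default namespace is an assumption; adjust it if monitoring runs elsewhere.
    """
    doc = json.loads(pv_list_json)
    hits = []
    for pv in doc.get("items", []):
        spec = pv.get("spec", {})
        # Unbound volumes have no claimRef, so fall back to an empty dict.
        claim = spec.get("claimRef") or {}
        if claim.get("namespace") != namespace:
            continue
        # Only the in-tree `nfs` volume source is checked here (assumption).
        if "nfs" in spec:
            hits.append((pv["metadata"]["name"], claim.get("name")))
    return hits
```

For example, save the output of `oc get pv -o json` to a file, read it into a string, and pass it to `nfs_backed_claims`. Any pair returned points to a monitoring volume that should be moved to block storage.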
Diagnostic Steps
- Prometheus pods cannot start and remain in CrashLoopBackOff.
- Pod logs show the error "Opening storage failed open /prometheus/01HGTDKP8IYT1ELLOA8BYUYKAQ/meta.json: no such file or directory".
- The directory /prometheus/01HGTDKP8IYT1ELLOA8BYUYKAQ/ exists but is empty.
- Cluster monitoring is using NFS storage as its data persistence backend.
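The check for empty block directories described above can be sketched in a few lines of Python. This is an illustrative helper only: the function name `blocks_missing_meta` is hypothetical, and it assumes that block directories are the 26-character uppercase alphanumeric names seen in the error message and that the data directory is reachable from a host where Python is available (for example, where the volume is exported or mounted).

```python
import os
import re

# Assumption: block directories are named with 26 uppercase letters or digits,
# as in the error message. Other entries in the data directory are skipped.
BLOCK_NAME = re.compile(r"^[0-9A-Z]{26}$")


def blocks_missing_meta(data_dir):
    """Return sorted names of block-like directories that lack a meta.json file."""
    missing = []
    for entry in os.scandir(data_dir):
        if not entry.is_dir() or not BLOCK_NAME.match(entry.name):
            continue
        if not os.path.isfile(os.path.join(entry.path, "meta.json")):
            missing.append(entry.name)
    return sorted(missing)
```

Calling `blocks_missing_meta("/prometheus")` on a copy or mount of the data directory lists the blocks that would trigger the "Opening storage failed" error. The sketch only reports them and does not modify anything.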