What's making the etcd database so big in OpenShift?
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 3
Issue
- The etcd database has to be compacted and defragmented frequently
- The etcd database reaches the 4G high water mark too often
- The master API and etcd are slow because the database is too large
Resolution
Note: For OpenShift 4, please refer to the solution "etcd_object_counts metric only show OpenShift resources".
Use the following Prometheus query to check the etcd object count:
sort_desc( sum(etcd_object_counts) by (resource, instance))
The data cannot be arranged by project/namespace in Prometheus, so to find a specific namespace with a lot of resources, refer to How to list number of objects in etcd?.
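The query can also be run outside the Prometheus console through the Prometheus HTTP API (`/api/v1/query`). The following is a minimal sketch, not a supported tool: the base URL and bearer token passed to `fetch` are placeholders for your own cluster's Prometheus route and a token that is allowed to query it, and the function names are illustrative only.

```python
import json
import urllib.parse
import urllib.request

# Same query as in the Resolution section.
QUERY = "sort_desc(sum(etcd_object_counts) by (resource, instance))"


def build_query_url(base_url, query=QUERY):
    """Return the instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def top_resources(payload, limit=10):
    """Return (resource, instance, count) tuples, largest count first."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % payload.get("error"))
    rows = []
    for sample in payload["data"]["result"]:
        labels = sample["metric"]
        # Each instant-vector sample is [timestamp, "value-as-string"].
        count = int(float(sample["value"][1]))
        rows.append((labels.get("resource", ""), labels.get("instance", ""), count))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


def fetch(base_url, token):
    """Run the query. base_url and token are placeholders (assumptions)."""
    request = urllib.request.Request(
        build_query_url(base_url),
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `top_resources(fetch(base_url, token), limit=5)` would return the five largest resource counts per etcd instance. If the Prometheus route uses a certificate that the client does not trust, the request fails until the CA is made available to Python.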
Root Cause
The etcd_object_counts metric can be used in OpenShift 3 to check the etcd object count.
Diagnostic Steps
To review the contents of etcd without Prometheus, please refer to the following Solution: How to list number of objects in etcd?.
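As a rough illustration of a per-namespace breakdown, the sketch below counts etcd keys by resource and namespace from a keys-only listing. It is not the procedure from the linked Solution. It assumes keys follow the layout `/<prefix>/<resource>/<namespace>/<name>` (for example under `/kubernetes.io` or `/openshift.io`), and it skips any key with a different number of segments, such as cluster-scoped objects. The function name is hypothetical.

```python
from collections import Counter


def count_by_namespace(keys):
    """Count etcd keys per (resource, namespace), largest count first."""
    counts = Counter()
    for key in keys:
        parts = key.strip().strip("/").split("/")
        # Assumed layout: <prefix>/<resource>/<namespace>/<name>.
        # Keys with any other shape (blank lines, cluster-scoped
        # objects, grouped resources) are ignored by this sketch.
        if len(parts) != 4:
            continue
        _prefix, resource, namespace, _name = parts
        counts[(resource, namespace)] += 1
    return counts.most_common()
```

The input can be any iterable of key strings, for example the lines produced by `ETCDCTL_API=3 etcdctl get / --prefix --keys-only` run with the endpoint and certificate options appropriate for your cluster.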
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.