Elasticsearch alert AggregatedLoggingSystemCPUHigh in OCP 4

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Red Hat OpenShift Logging
    • 5

Issue

  • The Elasticsearch alert AggregatedLoggingSystemCPUHigh fires even though limits.cpu is set and the pods are not using all of the available CPU.
  • The AggregatedLoggingSystemCPUHigh alert for Elasticsearch is visible in Alertmanager, although the metrics show that not all of the CPU is in use.

Resolution

Set requests.cpu to a value close to the actual CPU usage of the Elasticsearch pods.

The monitoring stack or the Elasticsearch dashboard can be used to check the CPU usage of the Elasticsearch pods.
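
The adjustment described above could be sketched as follows. This is a minimal sketch, not a verified procedure: it assumes the ClusterLogging resource is named "instance", that the Elasticsearch pods carry the label component=elasticsearch, and that the cluster serves pod metrics to oc adm top. The function names are hypothetical helpers introduced only for illustration.

```shell
# Sketch only. Assumptions: the ClusterLogging resource is named "instance"
# and the Elasticsearch pods are labelled component=elasticsearch.

# Show the current CPU usage of the Elasticsearch pods.
show_es_cpu_usage() {
  oc -n openshift-logging adm top pods -l component=elasticsearch
}

# Build a JSON merge patch that sets requests.cpu for the log store.
# $1: CPU request, for example "2" or "1500m" (pick a value near the observed usage).
build_cpu_request_patch() {
  printf '{"spec":{"logStore":{"elasticsearch":{"resources":{"requests":{"cpu":"%s"}}}}}}\n' "$1"
}

# Apply the patch to the ClusterLogging resource.
# $1: CPU request, passed through to build_cpu_request_patch.
set_es_cpu_request() {
  oc -n openshift-logging patch clusterlogging instance --type merge \
    -p "$(build_cpu_request_patch "$1")"
}
```

A possible sequence is to run show_es_cpu_usage first, then set_es_cpu_request with a value close to what was observed, for example set_es_cpu_request 1500m. The value 1500m is only a placeholder, not a recommendation.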

Root Cause

This issue is related to the behavior described in the article "CPU Throttling even when the container does not reach its CPU Limit".
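
To see whether the Elasticsearch containers are being throttled, the ratio of throttled CFS periods to total CFS periods can be inspected in Prometheus. The sketch below only prints a PromQL expression to paste into the console's metrics view. It assumes the cAdvisor metrics container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total are scraped by the cluster monitoring stack and that the container is named "elasticsearch". The function name is a hypothetical helper.

```shell
# Sketch only. Assumption: the container is named "elasticsearch" and runs in
# the openshift-logging namespace; both can be overridden with $1 and $2.

# Print a PromQL expression for the fraction of CFS periods in which each
# pod's container was throttled over the last 5 minutes.
throttling_ratio_query() {
  ns="${1:-openshift-logging}"
  container="${2:-elasticsearch}"
  sel="namespace=\"${ns}\",container=\"${container}\""
  printf 'sum by (pod) (rate(container_cpu_cfs_throttled_periods_total{%s}[5m])) / sum by (pod) (rate(container_cpu_cfs_periods_total{%s}[5m]))\n' "$sel" "$sel"
}

throttling_ratio_query
```

A ratio well above zero while CPU usage stays below limits.cpu would be consistent with the throttling behavior described in the referenced article.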

Diagnostic Steps

  1. Check that the AggregatedLoggingSystemCPUHigh alert is firing for Elasticsearch.

  2. Verify that the log store component of the cluster logging stack is configured with limits.cpu:

        $ oc -n openshift-logging get clusterlogging -o jsonpath="{.items[*].spec.logStore.elasticsearch.resources.limits.cpu}"

     A non-empty value confirms that a CPU limit is set.
    
  3. Verify in Prometheus or the OCP console that the Elasticsearch pods are not using all of the CPU available to them.
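
The comparison in step 3 can also be approximated from the command line. The following is a rough sketch under the same assumptions as before: the pods are labelled component=elasticsearch and the container is named "elasticsearch". The helper names are hypothetical, and the percentage helper only handles CPU quantities written as whole or decimal cores ("2", "0.5") or as millicores ("1500m").

```shell
# Sketch only. Assumptions: pod label component=elasticsearch and container
# name "elasticsearch".

# List the CPU request and limit of the elasticsearch container in each pod.
show_es_cpu_resources() {
  oc -n openshift-logging get pods -l component=elasticsearch \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[?(@.name=="elasticsearch")].resources.requests.cpu}{"\t"}{.spec.containers[?(@.name=="elasticsearch")].resources.limits.cpu}{"\n"}{end}'
}

# Convert a Kubernetes CPU quantity ("2", "0.5", "1500m") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 }' ;;
  esac
}

# Print the observed usage as an integer percentage of the configured limit.
# $1: observed usage (for example from "oc adm top pods"), $2: limits.cpu.
usage_percent() {
  echo $(( $(to_millicores "$1") * 100 / $(to_millicores "$2") ))
}

# Example with made-up numbers: 800m used against a limit of 2 cores.
usage_percent 800m 2   # prints 40
```

If the alert fires while this percentage stays well below 100, the situation matches the one described in this solution.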

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.