Backend performance requirements for OpenShift etcd
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 3.11
- 4
- etcd
Issue
etcd performance can be impacted by poor storage and network performance, causing multiple errors:

$ oc logs --follow=true etcd-ocp4-9wwcf-master-0 -c etcd -n openshift-etcd
...
etcdserver: failed to send out heartbeat on time (exceeded the 100ms timeout for xxx ms)
etcdserver: server is likely overloaded
etcdserver: read-only range request "key:\"xxxx" count_only:true " with result "xxxx" took too long (xxx s) to execute
etcdserver: read-only range request "key:\"xxxx" count_only:true " with result "xxxx" took too long (xxxx ms) to execute
etcdserver: read-only range request "xxxx" with result "xxxx" took too long (xxx ms) to execute
wal: sync duration of xxxx s, expected less than 1s
Resolution
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
Most commonly, issues with etcd occur as a result of one (or several) of the following:
- Slow storage
- CPU overload
- etcd database size growth
Applying a request should normally take less than 50 milliseconds. If the average apply duration exceeds 200 milliseconds, etcd warns that entries are taking too long to apply ("took too long" messages in the logs).
etcd metrics
The recommended way to check etcd performance behavior over time is to check the exposed etcd metrics. Some examples:
- To rule out a slow disk as the cause of etcd warnings, monitor the metrics etcd_disk_backend_commit_duration_seconds_bucket (99th percentile (p99) duration should be less than 25ms) and etcd_disk_wal_fsync_duration_seconds_bucket (p99 duration should be less than 10ms) to confirm the storage is reasonably fast.
- The overall etcd cluster latency comes from two sources:
  - Network RTT latency: high network latency and packet drops can leave the etcd cluster in an unreliable state, so network health values (RTT and packet drops) should be monitored. Monitor the metric etcd_network_peer_round_trip_time_seconds_bucket (p99 duration should be less than 50ms).
  - Storage I/O latency: investigate the storage performance using the fio tool. See below for additional information.
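The p99 thresholds above refer to Prometheus histogram metrics. As an illustration of what that check means, the sketch below estimates a p99 from cumulative histogram buckets the same way `histogram_quantile()` does; the bucket counts are hypothetical sample values, not real cluster output.

```python
# Sketch: estimate the p99 from Prometheus-style cumulative histogram buckets.
# The bucket data below is hypothetical, for illustration only.

def histogram_quantile(q, buckets):
    """buckets: sorted list of (upper_bound_seconds, cumulative_count)."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # Linear interpolation inside the bucket, as Prometheus does.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Hypothetical etcd_disk_wal_fsync_duration_seconds_bucket samples.
wal_fsync = [(0.001, 400), (0.002, 900), (0.004, 980), (0.008, 998), (0.016, 1000)]
p99 = histogram_quantile(0.99, wal_fsync)
print(f"wal fsync p99: {p99 * 1000:.1f} ms (target: < 10 ms)")
```

The same calculation applies to the backend commit (< 25ms) and peer round-trip (< 50ms) buckets, only the threshold changes.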
- Refer to how to graph etcd metrics using Prometheus to gauge etcd performance in OpenShift.
- Additional information about these and other etcd metrics can be found in recommended etcd practices.
- More information about general OpenShift metrics can be found in the documentation, section: Cluster Monitoring.
etcd database
For database size-related issues please refer to:
- Defragmenting etcd data (OCP 4 documentation)
- How to defrag etcd to decrease DB size in OpenShift 3
- How to defrag etcd to decrease DB size in OpenShift 4
Additional information
Note: performance measurement may have a significant impact on cluster health if performance issues already exist, so proceed with these tests with care on production workloads. Non-intrusive measurements can be obtained from the exposed etcd metrics.
Refer to the article for etcd guidelines with OpenShift Container Platform 4 for additional information. More details about etcd performance can be found in the upstream etcd performance FAQ.
Disk performance troubleshooting with fio
Detailed information about using the fio tool for etcd performance investigation can be found in the following articles:
IMPORTANT NOTE: The fio test is a short test executed at a specific moment. It can show whether the disk is fast enough to support the etcd requirements, but other loads on the same disk can still degrade etcd performance in the long term and cause it to misbehave, so it is not recommended to trust fio results alone. Instead, check the etcd metrics for several hours or even days to learn the real etcd behavior over a longer period, as explained in how to graph etcd metrics using Prometheus to gauge etcd performance in OpenShift.
- Using fio to tell whether storage is fast enough for etcd (external article on www.ibm.com)
- How to Use fio to Check etcd Disk Performance in OCP
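What the fio test measures for etcd is essentially fdatasync latency on small sequential writes. The sketch below is a minimal, illustrative stand-in for that check: it mirrors the block size commonly cited for the fio etcd test (2300-byte writes, each followed by fdatasync) and reports the p99 sync latency. It is an assumption-laden toy, not a replacement for fio or for long-term etcd metrics.

```python
# Sketch: minimal fdatasync latency probe, loosely mirroring the fio etcd
# test (sequential 2300-byte writes, fdatasync after each write).
# Illustrative only; use fio and the etcd metrics for real measurements.
import os
import tempfile
import time

def fdatasync_p99(block_size=2300, writes=1000, directory=None):
    """Return the p99 fdatasync latency in seconds.

    Point directory= at the filesystem backing etcd to probe it;
    the default uses the system temporary directory.
    """
    latencies = []
    payload = b"\0" * block_size
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        fd = f.fileno()
        for _ in range(writes):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fdatasync(fd)  # etcd WAL durability depends on this call
            latencies.append(time.perf_counter() - start)
    latencies.sort()
    return latencies[int(0.99 * len(latencies))]

if __name__ == "__main__":
    p99 = fdatasync_p99()
    print(f"fdatasync p99: {p99 * 1000:.2f} ms (etcd target: < 10 ms)")
```

As with fio, a single run only captures one moment in time; combine it with the histogram metrics discussed above before drawing conclusions.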
Root Cause
Clustered etcd is extremely sensitive to storage and network backend performance, and can be easily disrupted by any underlying bottlenecks.
Diagnostic Steps
Check etcd logs for the following messages:
$ oc logs --follow=true etcd-ocp4-9wwcf-master-0 -c etcd -n openshift-etcd
...
etcdserver: failed to send out heartbeat on time
etcdserver: server is likely overloaded
wal: sync duration of xxxx s, expected less than 1s
etcd logs can be viewed either from the OpenShift web console or using the oc logs command-line tool.
- OpenShift Container Platform 3.11: etcd is located in the kube-system project.
- OpenShift Container Platform 4.x: etcd is located in the openshift-etcd project.
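When scanning large amounts of etcd log output for the messages quoted above, a small filter script can help. The patterns below come from the log excerpts in this article; the sample lines are illustrative stand-ins for real `oc logs` output.

```python
# Sketch: flag etcd log lines matching the slow-storage/overload warnings
# quoted in this article. Sample input lines are illustrative.
import re

WARNING_PATTERNS = [
    re.compile(r"failed to send out heartbeat on time"),
    re.compile(r"server is likely overloaded"),
    re.compile(r"took too long .* to execute"),
    re.compile(r"sync duration of .* expected less than"),
]

def etcd_warnings(log_lines):
    """Return the lines matching any known performance-warning pattern."""
    return [line for line in log_lines
            if any(p.search(line) for p in WARNING_PATTERNS)]

sample = [
    "etcdserver: failed to send out heartbeat on time (exceeded the 100ms timeout for 25 ms)",
    "etcdserver: start to snapshot (applied: 12345)",
    "wal: sync duration of 2.1s, expected less than 1s",
]
for line in etcd_warnings(sample):
    print("WARNING:", line)
```

In practice the input would be piped from `oc logs <etcd-pod> -c etcd -n openshift-etcd` rather than a hard-coded list.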
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.