Disable automated etcd defragmentation in OpenShift Container Platform 4

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP) 4.9 and later

Issue

  • Starting with OCP 4.9, automated etcd defragmentation is enabled, causing short API interruptions and thus violating the API error budget.
  • Automated etcd defragmentation is affecting OCP availability because the constant defragmentation activity triggers etcd leader elections.
  • Is it possible to disable automated etcd defragmentation in OCP 4.9 and newer?

Resolution

Starting with Red Hat OpenShift Container Platform 4.9.35 and 4.10.14, automated etcd defragmentation can be disabled by following the procedure below.

Disable etcd defragmentation

  • Check whether a ConfigMap called etcd-disable-defrag exists in namespace openshift-etcd-operator.
    • oc get cm etcd-disable-defrag -n openshift-etcd-operator
    • If it exists, automated etcd defragmentation is already disabled.
  • If the ConfigMap called etcd-disable-defrag does not exist in namespace openshift-etcd-operator, create it to disable automated etcd defragmentation.
    • oc create configmap etcd-disable-defrag -n openshift-etcd-operator
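
The two steps above can be combined into a single check-then-create helper. The following is a minimal sketch, not an official procedure: the function name disable_etcd_defrag is hypothetical, and it assumes that oc is already logged in with sufficient privileges. It treats any failure of oc get as "ConfigMap missing", which is a simplification.

```shell
# Hypothetical helper: create the marker ConfigMap only when it is missing.
# The oc commands are the ones from the procedure above.
disable_etcd_defrag() {
  ns="openshift-etcd-operator"
  cm="etcd-disable-defrag"
  if oc get cm "$cm" -n "$ns" >/dev/null 2>&1; then
    # ConfigMap already present: defragmentation is already disabled.
    echo "already disabled"
  else
    # ConfigMap absent (or lookup failed): try to create it.
    oc create configmap "$cm" -n "$ns" >/dev/null && echo "disabled"
  fi
}
```

Running disable_etcd_defrag twice should be harmless, since the second call only reports that the ConfigMap is already in place.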

Diagnostic Steps

  • Check the etcd-operator logs in namespace openshift-etcd-operator to see whether the defragmentation controller is running.
I0428 05:35:59.043851       1 defragcontroller.go:244] etcd member "master-0" backend store fragmented: 3.49 %, dbSize: 65781760
I0428 05:36:25.121044       1 defragcontroller.go:244] etcd member "master-1" backend store fragmented: 1.90 %, dbSize: 65847296
I0428 05:36:25.121148       1 defragcontroller.go:244] etcd member "master-2" backend store fragmented: 2.18 %, dbSize: 65978368
I0428 05:36:25.121393       1 defragcontroller.go:244] etcd member "master-0" backend store fragmented: 1.76 %, dbSize: 65781760
I0428 05:36:51.076194       1 defragcontroller.go:244] etcd member "master-1" backend store fragmented: 10.23 %, dbSize: 65847296
I0428 05:36:51.076213       1 defragcontroller.go:244] etcd member "master-2" backend store fragmented: 10.37 %, dbSize: 65978368
I0428 05:36:51.076217       1 defragcontroller.go:244] etcd member "master-0" backend store fragmented: 10.02 %, dbSize: 65781760
  • The above logs indicate that the defragmentation controller is currently running and that no ConfigMap called etcd-disable-defrag exists in namespace openshift-etcd-operator.
  • The absence of messages like etcd member "master-0" backend store fragmented: 10.02 %, dbSize: 65781760 indicates that the defragmentation controller is likely disabled and that a ConfigMap called etcd-disable-defrag may exist in namespace openshift-etcd-operator.
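
The log check described above can be scripted as a simple filter. This is a rough sketch under stated assumptions: the function name defrag_controller_status is hypothetical, and the match pattern is taken from the sample log lines in this article, so it may need adjusting if the message format differs in other releases.

```shell
# Hypothetical helper: read etcd-operator log text on stdin and report
# whether any defragmentation controller messages are present.
defrag_controller_status() {
  if grep -q 'defragcontroller\.go.*backend store fragmented'; then
    echo "defrag controller active"
  else
    echo "no defrag messages found"
  fi
}
```

It could be fed with something like oc logs deployment/etcd-operator -n openshift-etcd-operator | defrag_controller_status, where the deployment name etcd-operator is an assumption and should be verified with oc get deployments -n openshift-etcd-operator. As noted above, missing messages only suggest that the controller is disabled; they do not prove it.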

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.