How to defrag etcd to decrease DB size in Red Hat OpenShift 3?
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 3.11
- etcd
Issue
- The etcd DB is large; how can one defrag etcd?
- The following error message can be seen from the etcd pods:
    etcdserver: mvcc: database space exceeded
- The master API fails to start after hitting this issue.
Resolution
For OpenShift 4
See the solution on how to defrag etcd to decrease DB size in OpenShift 4.
For OpenShift 3.11
- Maintenance should be done monthly or bi-monthly to ensure etcd health.
- When encountering the above message, the etcd cluster may enter maintenance mode. The first step is to try pruning objects to free up storage used by etcd.
- If that doesn't work, compacting/defragmenting etcd may help to free additional space.
- Before doing anything, ensure you take a snapshot (backup) of the etcd data.
- To take a snapshot/backup, run the following on your etcd host:
    # Backup ETCD Configuration
    $ mkdir -p /etc/etcd/etcd-config-$(date +%Y%m%d)/
    $ cp -R /etc/etcd/ /etc/etcd/etcd-config-$(date +%Y%m%d)/
    # Backup ETCD Data
    $ etcdctl3 snapshot save /var/lib/etcd/etcd-$(date +%Y%m%d)-backup.db
    # Confirm backup
    $ ETCDCTL_API=3 etcdctl3 --write-out=table snapshot status /var/lib/etcd/etcd-$(date +%Y%m%d)-backup.db
- Gather endpoint status and source/set variables:
    # source /etc/etcd/etcd.conf
    # export ETCDCTL_API=3
    # ETCD_ALL_ENDPOINTS=`etcdctl3 --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'`
    # etcdctl3 --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_ALL_ENDPOINTS --write-out=table endpoint status
- Compact etcd using the same rev value for all members.
    # rev=$(etcdctl3 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*' -m1)
    # etcdctl3 --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_ALL_ENDPOINTS compact $rev
- Then run the defrag command once for each etcd member.
  - This command should be run against each member individually, setting one member at a time using the --endpoints argument; also set the flag --command-timeout=30s.
  - It is preferred to run the command against the leader last.
  - If a timeout occurs, increase the --command-timeout value until the command succeeds.
  - Replace $MEMBER_IP with the etcd IP:
      # MEMBER_IP=<etcd IP>
      # etcdctl3 --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --command-timeout=30s --endpoints=https://$MEMBER_IP:2379 defrag
      # etcdctl3 alarm disarm
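The compact and defrag steps above can be sketched as two small helper functions. This is only an illustration, not part of the supported procedure: the function names extract_rev and order_leader_last, the sample JSON (trimmed) and the 10.0.0.x endpoints are all hypothetical, and the leader is assumed to have been read from the IS LEADER column of the endpoint status table. Nothing in the sketch contacts etcd; the final loop only prints the defrag commands in the order they would be run.

```shell
# Sketch only: helpers for the compact/defrag steps. Names and sample
# data are hypothetical. No etcd command is executed here.

# Read `endpoint status --write-out=json` output on stdin and print the
# first revision number found, so one value is used for every member.
extract_rev() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n1
}

# Print endpoints one per line with the leader moved to the end.
# $1: comma-separated endpoint list, $2: leader endpoint.
# The body runs in a subshell so IFS and variables do not leak.
order_leader_last() (
  leader=$2
  found=0
  IFS=','
  set -f
  for ep in $1; do
    if [ "$ep" = "$leader" ]; then
      found=1
    else
      printf '%s\n' "$ep"
    fi
  done
  # Only print the leader if it was actually in the list.
  if [ "$found" -eq 1 ]; then printf '%s\n' "$leader"; fi
)

# Hypothetical, trimmed sample of the JSON status output.
sample_json='[{"Endpoint":"https://10.0.0.1:2379","Status":{"header":{"revision":12345}}}]'
rev=$(printf '%s\n' "$sample_json" | extract_rev)
echo "would compact at revision $rev"

# Kept literal on purpose: the variables come from /etc/etcd/etcd.conf.
tls_flags='--cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE'
all='https://10.0.0.1:2379,https://10.0.0.2:2379,https://10.0.0.3:2379'
leader='https://10.0.0.1:2379'

# Dry run: print one defrag command per member, leader last.
for ep in $(order_leader_last "$all" "$leader"); do
  echo "etcdctl3 $tls_flags --command-timeout=30s --endpoints=$ep defrag"
done
```

Printing the commands instead of running them keeps the one-member-at-a-time order visible for review before anything touches the cluster.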
Root Cause
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
The OpenShift installation Ansible playbook (3.7-3.11) sets "ETCD_QUOTA_BACKEND_BYTES" to 4294967296 (4 GiB) by default. If one of the etcd members exceeds the quota, all cluster members will enter maintenance mode and only accept key reads and deletes.
Setting an etcd space quota allows for better performance. Without it, etcd may suffer from poor performance when key-spaces grow excessively large, or storage is depleted. If your cluster contains a large number of objects or you create/delete a large number of objects, you should consider increasing this value and/or compacting and defragmenting the database periodically.
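As a rough illustration of the quota arithmetic, the sketch below computes the default quota in bytes and the share of it a member is using. The function name quota_pct and the 3 GiB database size are hypothetical; in practice the size would be taken from the DB SIZE column of the endpoint status output and converted to bytes.

```shell
# Sketch only: quota arithmetic with hypothetical sample values.

# Print the integer percentage of the backend quota in use.
# $1: DB size in bytes, $2: quota in bytes.
quota_pct() {
  echo $(( $1 * 100 / $2 ))
}

# 4 GiB expressed in bytes, matching the playbook default.
quota_bytes=$(( 4 * 1024 * 1024 * 1024 ))
echo "quota: $quota_bytes bytes"

# Hypothetical member whose DB has grown to 3 GiB.
db_bytes=$(( 3 * 1024 * 1024 * 1024 ))
echo "used: $(quota_pct "$db_bytes" "$quota_bytes")%"
```

A check like this could be run as part of the periodic maintenance to decide whether compaction and defragmentation are due before the quota is reached.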
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.