How to list the number of objects and size in etcd on OpenShift?

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
      • 3.11
      • 4
  • etcd

Issue

  • We observe performance issues with etcd and would like to know what kinds of objects are stored in it.
  • How can the objects in etcd be listed in OpenShift?
  • What is a normal number of objects in etcd for an OpenShift cluster?
  • How can the size of each kind of object in etcd be checked?

Resolution

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

Use the following commands to check the number and size of the objects in the etcd database, when they were created, and which namespaces they belong to:

NOTE: The following commands use oc exec against an etcd pod. When oc commands cannot be used but an etcd pod is running, run crictl exec -ti $(crictl ps --label "io.kubernetes.container.name=etcdctl" -q) on a control plane node instead, followed by the part of the command that starts at sh -c (inclusive) and runs to the end.
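
As a minimal sketch of the crictl variant described in the note, the wrapper below passes a shell snippet to the etcdctl container on a control plane node. The function name etcdctl_on_node is hypothetical and not part of OpenShift; the crictl invocation is the one quoted in the note.

```shell
# Hypothetical helper (the name is an assumption for this sketch).
# Runs a shell snippet inside the running etcdctl container on the
# current control plane node, mirroring the "sh -c '...'" part of the
# oc exec commands in this article.
etcdctl_on_node() {
  crictl exec -ti "$(crictl ps --label "io.kubernetes.container.name=etcdctl" -q)" sh -c "$1"
}

# Example (run as root on a control plane node):
#   etcdctl_on_node 'etcdctl get / --prefix --keys-only | sed "/^$/d" | cut -d/ -f3 | sort | uniq -c | sort -rn'
```

The snippet passed as the first argument is whatever follows sh -c in the oc exec form of the command.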

  • To check the size and number of occurrences of each kind of object in the etcd database, use the following commands (the numfmt command is part of the coreutils package):

        $ export ETCD_POD_NAME=$(oc get pods -n openshift-etcd -l app=etcd --field-selector="status.phase==Running" -o jsonpath="{.items[0].metadata.name}")
    
        $ oc exec -n openshift-etcd -c etcdctl ${ETCD_POD_NAME} -- sh -c 'etcdctl get / --prefix --keys-only  | grep -oE "^/[a-z|.]+/[a-z|.|8]*" | sort | uniq -c | sort -rn | while read KEY; do printf "$KEY\t" && etcdctl get ${KEY##* } --prefix --print-value-only | wc -c | numfmt --to=iec ; done | sort -k3 -hr | column -t'
        419   /kubernetes.io/operators.coreos.com                 24M
        1172  /kubernetes.io/secrets                              19M
        446   /kubernetes.io/configmaps                           11M
        5171  /kubernetes.io/events                               6.1M
        [...]
    

    This command retrieves all keys from the etcd database and shows the number of occurrences and the total size of each object type, ordered by size. A size of 0 is shown if the command fails for a specific key; it does not mean that the real size is 0.
    Note: For OpenShift Container Platform 3.11, run the above etcdctl command using etcdctl3, as root on the OpenShift Container Platform master nodes.
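
To illustrate how the pipeline above groups keys, the following sketch runs the same grep expression over a few made-up keys, without needing a cluster. The helper names count_by_prefix, key_of and sample_keys, and the sample keys themselves, are assumptions for illustration only.

```shell
# Count keys per object-type prefix, reading etcd keys from stdin.
# The grep expression is the one used in the command above; the final
# awk only trims the padding that uniq -c adds.
count_by_prefix() {
  grep -oE "^/[a-z|.]+/[a-z|.|8]*" | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Strip the leading count from a "COUNT KEY" line, as ${KEY##* } does
# in the loop above, so that only the key prefix is passed to etcdctl.
key_of() {
  printf '%s\n' "${1##* }"
}

# Made-up sample keys (not taken from a real cluster). The blank lines
# imitate the separators printed by etcdctl --keys-only.
sample_keys() {
  cat <<'EOF'
/kubernetes.io/secrets/ns-a/token-1

/kubernetes.io/secrets/ns-b/token-2

/kubernetes.io/configmaps/ns-a/settings

/kubernetes.io/operators.coreos.com/subscriptions/ns-a/sub-1
EOF
}

sample_keys | count_by_prefix
key_of "419 /kubernetes.io/operators.coreos.com"
```

For the sample, the first output line is 2 /kubernetes.io/secrets, and key_of prints /kubernetes.io/operators.coreos.com. Because the expression stops after the second path segment, custom resources are grouped by API group rather than by resource.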

  • To check the size of selected object types by namespace in the etcd database, use the following commands:

    $ export ETCD_POD_NAME=$(oc get pods -n openshift-etcd -l app=etcd --field-selector="status.phase==Running" -o jsonpath="{.items[0].metadata.name}")
    
    $ oc exec -n openshift-etcd -c etcdctl ${ETCD_POD_NAME} -- sh -c 'etcdctl get / --prefix --keys-only  | grep -oE -e "^/kubernetes.io/secrets/[-a-z|.0-9]*/" -e "^/kubernetes.io/configmaps/[-a-z|.0-9]*/" -e "^/kubernetes.io/events/[-a-z|.0-9]*/" | sort -u | while read KEY; do printf "$KEY\t" && etcdctl get ${KEY##* } --prefix --print-value-only | wc -c | numfmt --to=iec ; done | sort -k2 -hr' | head -50 | awk -F'/' 'BEGIN{print "NAMESPACE TYPE SIZE"}{print $4" "$3" "$5}'| column -t
    

    The above example shows the total size of secrets, configmaps and events per namespace, limited to the 50 largest entries. Other object types can be included by adding another -e "^/kubernetes.io/[object_name]/[-a-z|.0-9]*/" expression (replacing [object_name]).
    To check the size of individual objects, refer to How to review the size of the largest secrets and configmaps in etcd on OpenShift?.
    Note: For OpenShift Container Platform 3.11, run the above etcdctl command using etcdctl3, as root on the OpenShift Container Platform master nodes.
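
As a sketch of how additional -e expressions extend the selection, the hypothetical helper below builds the same kind of per-namespace prefix list for any set of namespaced object types passed as arguments. The function name namespace_prefixes and the sample keys are assumptions, not part of the original command.

```shell
# Hypothetical helper: read etcd keys on stdin and print the unique
# "/kubernetes.io/<kind>/<namespace>/" prefixes for each object type
# given as an argument (equivalent to one -e expression per type).
namespace_prefixes() {
  keys=$(cat)
  for kind in "$@"; do
    # "|| true" keeps the loop going when a type has no matching keys.
    printf '%s\n' "$keys" |
      grep -oE "^/kubernetes.io/${kind}/[-a-z|.0-9]*/" || true
  done | sort -u
}

# Example with made-up keys:
printf '%s\n' \
  /kubernetes.io/secrets/ns-a/s1 \
  /kubernetes.io/secrets/ns-a/s2 \
  /kubernetes.io/pods/ns-b/p1 \
  /kubernetes.io/events/ns-a/e1 |
  namespace_prefixes secrets pods
```

For the sample keys this prints /kubernetes.io/pods/ns-b/ followed by /kubernetes.io/secrets/ns-a/. Each resulting prefix would still have to be sized with etcdctl get ... --prefix --print-value-only | wc -c, as in the command above.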

  • To check only the number of occurrences for each kind of object in the etcd database, use the following commands:

        $ export ETCD_POD_NAME=$(oc get pods -n openshift-etcd -l app=etcd --field-selector="status.phase==Running" -o jsonpath="{.items[0].metadata.name}")
    
        $ oc exec -n openshift-etcd ${ETCD_POD_NAME} -- sh -c "etcdctl get / --prefix --keys-only | sed '/^$/d' | cut -d/ -f3 | sort | uniq -c | sort -rn"
           5171 events
           1172 secrets
            446 configmaps
            318 serviceaccounts
            295 rolebindings
            [...]
    

    This command retrieves all keys from the etcd database and lists how many objects exist for each object type.
    Note: For OpenShift Container Platform 3.11, run the above etcdctl command using etcdctl3, as root on the OpenShift Container Platform master nodes.
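
Because cut -d/ -f3 keeps only the third path segment, custom resources are counted under their API group (for example operators.coreos.com) rather than per resource. The sketch below breaks one group down further. It assumes that custom resource keys follow the layout /kubernetes.io/<group>/<resource>/..., and the helper name count_group_resources is hypothetical.

```shell
# Hypothetical helper: read etcd keys on stdin and count the keys of one
# API group per resource (fourth path segment).
# Assumes keys look like /kubernetes.io/<group>/<resource>/...
count_group_resources() {
  sed '/^$/d' |
    awk -F'/' -v group="$1" '$3 == group {print $4}' |
    sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Example with made-up keys:
printf '%s\n' \
  /kubernetes.io/operators.coreos.com/subscriptions/ns-a/sub-1 \
  /kubernetes.io/operators.coreos.com/subscriptions/ns-b/sub-2 \
  /kubernetes.io/operators.coreos.com/clusterserviceversions/ns-a/csv-1 \
  /kubernetes.io/secrets/ns-a/token-1 |
  count_group_resources operators.coreos.com
```

For the sample keys this prints 2 subscriptions followed by 1 clusterserviceversions. On a cluster, the key list from etcdctl get / --prefix --keys-only can be piped into the helper instead.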

  • To check when objects of a specific type were created (by month or, in more detail, by day or by hour):

        $ export OBJECT=[name_of_the_object_to_check]
    
        #### by month
        $ oc get ${OBJECT} -A -o jsonpath='{range .items[*]}{.metadata.creationTimestamp}{"\n"}{end}' | grep -oE "[0-9]{4}-[0-9]{2}" | sort | uniq -c
           1035 2024-11
        [...]
        #### by day
        $ oc get ${OBJECT} -A -o jsonpath='{range .items[*]}{.metadata.creationTimestamp}{"\n"}{end}' | grep -oE "[0-9]{4}-[0-9]{2}-[0-9]{2}" | sort | uniq -c
            613 2024-11-17
             81 2024-11-18
        [...]
        #### by hour
        $ oc get ${OBJECT} -A -o jsonpath='{range .items[*]}{.metadata.creationTimestamp}{"\n"}{end}' | grep -oE "[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}" | sort | uniq -c
             39 2024-11-17T06
            394 2024-11-17T07
        [...]
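
The three commands above differ only in the grep expression. As a minimal sketch, they can be folded into one helper that reads creation timestamps on stdin; the function name bucket_timestamps and the sample timestamps are assumptions for illustration.

```shell
# Hypothetical helper: count creation timestamps on stdin per month,
# day or hour, using the same grep expressions as the commands above.
bucket_timestamps() {
  case "$1" in
    month) pattern='[0-9]{4}-[0-9]{2}' ;;
    day)   pattern='[0-9]{4}-[0-9]{2}-[0-9]{2}' ;;
    hour)  pattern='[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}' ;;
    *)     echo "usage: bucket_timestamps month|day|hour" >&2; return 1 ;;
  esac
  # The final awk only trims the padding that uniq -c adds.
  grep -oE "$pattern" | sort | uniq -c | awk '{print $1, $2}'
}

# On a cluster:
#   oc get "${OBJECT}" -A -o jsonpath='{range .items[*]}{.metadata.creationTimestamp}{"\n"}{end}' | bucket_timestamps day

# Example with made-up timestamps:
printf '%s\n' 2024-11-17T06:12:45Z 2024-11-17T07:01:02Z 2024-11-18T09:30:00Z |
  bucket_timestamps day
```

For the sample timestamps this prints 2 2024-11-17 followed by 1 2024-11-18.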
    
  • To find the namespace(s) that hold most of the objects of a given type, use the following commands (replace [name_of_the_object_to_check] with the object type(s) that stood out in the output of the above commands):

        $ export OBJECT=[name_of_the_object_to_check]
    
        $ for NS in $(oc get ns --no-headers -o custom-columns=NAME:metadata.name); do echo -e "$(oc get ${OBJECT} -n ${NS} --no-headers | wc -l)\t${OBJECT} in ${NS}" ; done | sort -nr
        [...]
        673     secrets in argo-test
        171     secrets in openshift-kube-apiserver
        110     secrets in kube-system
        87      secrets in openshift-monitoring
        [...]
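
The loop above issues one oc get call per namespace, which can be slow on clusters with many namespaces. As an alternative sketch (not the method used in this article), the per-namespace counts can be derived from a single oc get ${OBJECT} -A --no-headers call, assuming the object type is namespaced so that the first output column is the namespace. The helper name count_by_namespace and the sample rows are hypothetical.

```shell
# Hypothetical helper: count rows per namespace, reading the output of
# "oc get <type> -A --no-headers" on stdin (namespace in column 1).
count_by_namespace() {
  awk '{print $1}' | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# On a cluster:
#   oc get "${OBJECT}" -A --no-headers | count_by_namespace

# Example with made-up rows:
printf '%s\n' \
  "argo-test    secret-1  Opaque  1  3d" \
  "argo-test    secret-2  Opaque  1  3d" \
  "argo-test    secret-3  Opaque  2  2d" \
  "kube-system  secret-4  Opaque  1  9d" |
  count_by_namespace
```

For the sample rows this prints 3 argo-test followed by 1 kube-system. Unlike the loop, namespaces without any object of that type do not appear in the output.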
    
  • kube-state-metrics exposes a fairly complete set of metrics that can be queried with PromQL to count most object types. Examples are available in the kube-state-metrics upstream documentation.

    Note: The number of objects varies between clusters depending on usage and cluster version. The output is useful for spotting object types with too many objects (for example, millions of events) and for seeing whether most of them are in specific namespaces.

Root Cause

The etcd database can be queried with the etcdctl command from the etcd pods to retrieve the number and size of objects.

Diagnostic Steps

To review the same data using Prometheus metrics, refer to the solution What's making etcd database so big in OpenShift?.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.