When a node has DiskPressure events, how do I find which Pod is using the most local disk space?

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP) 4.*

Issue

  • Where are Pod and container files stored?
  • How do I find the disk usage of a Pod?
  • How do I find the disk usage of a container?

Resolution

  • Pods use storage on the node they are scheduled on in three places:
  • emptyDir volumes, which are backed on the host by a directory in /var/lib/kubelet/pods/${POD_ID}/volumes/kubernetes.io~empty-dir/${VOLUME_NAME}
  • the writable layer of running containers, which is in /var/lib/containers/storage/overlay/${long_container_id}
  • container logs, which are in /var/log/containers/*.log
  • Together, these storage locations make up what is called ephemeral storage.
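
The diagnostic steps in this article cover emptyDir volumes and writable layers but not container logs. The following is a minimal sketch for the log location, under two assumptions that are not stated in this article: that the entries in /var/log/containers are symbolic links (hence du -L), and that they are named <pod>_<namespace>_<container>-<64 hex character container id>.log. The function name log_owners is hypothetical.

```shell
# log_owners: read `du -sk` lines for /var/log/containers/*.log on stdin and
# print "<KiB> <namespace>/<pod> <container>" for each line that matches the
# assumed file name layout:
#   <pod>_<namespace>_<container>-<64 hex character container id>.log
# Lines that do not match this layout are silently dropped.
log_owners() {
    sed -n 's|^\([0-9][0-9]*\)[[:space:]][[:space:]]*.*/\([^_/]*\)_\([^_/]*\)_\(.*\)-[0-9a-f]\{64\}\.log$|\1 \3/\2 \4|p'
}

# Usage sketch (run as root on the node; -L follows the assumed symlinks):
#   du -skL /var/log/containers/*.log | sort -n | tail -5 | log_owners
```

If the file names on the node follow a different layout, the function prints nothing, and the plain du output can be read directly instead.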

Diagnostic Steps

Get an overview of images and ephemeral storage consumption

  • review the DISK column of crictl stats
# (unfortunately the human-friendly size output prevents numeric sorting)
# crictl stats
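
One possible way around the sorting limitation is to convert the human-friendly sizes to bytes before sorting. The sketch below assumes that sizes are printed with decimal suffixes (B, kB, MB, GB, TB) and that the caller knows which whitespace-separated field holds the DISK value; both can vary between crictl versions and should be checked against the actual output. The function name sort_by_size is hypothetical.

```shell
# sort_by_size FIELD: read whitespace-separated rows on stdin and print them
# sorted in ascending order by the human-friendly size found in column FIELD.
# Rows whose FIELD is not a recognised size (such as the header) are dropped.
sort_by_size() {
    awk -v f="$1" '
    BEGIN { m["B"]=1; m["kB"]=1e3; m["MB"]=1e6; m["GB"]=1e9; m["TB"]=1e12 }
    {
        num = $f; unit = $f
        sub(/[A-Za-z]+$/, "", num)     # keep only the numeric part
        sub(/^[0-9.]+/, "", unit)      # keep only the unit suffix
        if (unit in m) printf "%.0f\t%s\n", num * m[unit], $0
    }' | sort -n | cut -f2-            # sort on the byte count, then remove it
}

# Usage sketch (assumes DISK is the 5th field of each data row):
#   crictl stats | sort_by_size 5
```

The byte count is only used as a temporary sort key, so the rows are printed exactly as crictl produced them.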

Finding which Pod has the largest emptyDir volume on a node

  • find the per-pod emptyDir disk usage
# du -sk /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/ | sort -n | tail -1
61538052	/var/lib/kubelet/pods/162435e6-3ce1-493e-ad80-5485b288eb43/volumes/kubernetes.io~empty-dir/
  • find the pod with uid 162435e6-3ce1-493e-ad80-5485b288eb43
$ oc get pods -A -o custom-columns=PodName:.metadata.name,PodUid:.metadata.uid | grep 162435e6-3ce1-493e-ad80-5485b288eb43
hellokube-7c647c66c-wrn6p                                         162435e6-3ce1-493e-ad80-5485b288eb43
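
When more than one Pod is a candidate, it can help to list the top few UIDs in one pass rather than only the largest. This is a minimal sketch that post-processes the du output shown above; the function name top_emptydir_pods is hypothetical, and the default of 5 entries is an arbitrary choice.

```shell
# top_emptydir_pods [N]: read `du -sk` lines for emptyDir directories on stdin
# and print the N largest (default 5) as "<KiB> <pod uid>", largest first.
top_emptydir_pods() {
    sort -rn | head -n "${1:-5}" |
        sed -n 's|^\([0-9][0-9]*\)[[:space:]]*/var/lib/kubelet/pods/\([^/]*\)/.*|\1 \2|p'
}

# Usage sketch (run as root on the node):
#   du -sk /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/ | top_emptydir_pods 5
# Each UID can then be looked up with the oc get pods command shown above.
```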

Finding which Pod has a container with the largest writable layer

  • find the per-container writable layer disk usage
# note that the kilobyte value may be inflated: merged is the combined overlayfs view, so it includes the read-only image layers as well as the writable layer
# du -sk /var/lib/containers/storage/overlay/*/merged | sort -n | tail -1
87604880	/var/lib/containers/storage/overlay/467350bcc719c64933654e9393422de9134f576c1e56bf40eedc5d7d115658d7/merged
  • find the container which is writing to this directory and its parent Pod
# crictl inspect  --output go-template --template '{{.status.metadata.name}}   {{index .status.labels "io.kubernetes.pod.name"}} {{.info.runtimeSpec.root.path}}' $(crictl ps -q)  | grep /var/lib/containers/storage/overlay/467350bcc719c64933654e9393422de9134f576c1e56bf40eedc5d7d115658d7

my-lovely-pi   hellokube-7c647c66c-wrn6p /var/lib/containers/storage/overlay/467350bcc719c64933654e9393422de9134f576c1e56bf40eedc5d7d115658d7/merged
  • the output above shows that the my-lovely-pi container, which belongs to the hellokube-7c647c66c-wrn6p Pod, has the largest writable layer disk usage
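
The grep in the previous step matches the layer path anywhere in the line. A slightly stricter variant, sketched below, compares the path only against the third column and labels the result. It assumes the three-column "container, pod, root path" lines produced by the crictl inspect template above; the function name owner_of_layer is hypothetical.

```shell
# owner_of_layer LAYER_DIR: read "<container> <pod> <root path>" lines on
# stdin and print the container and Pod whose root path lies under LAYER_DIR.
# A trailing /merged on LAYER_DIR is accepted and ignored.
owner_of_layer() {
    awk -v layer="${1%/merged}" '
        index($3, layer "/") == 1 { printf "container=%s pod=%s\n", $1, $2 }'
}

# Usage sketch (run as root on the node, LAYER_DIR taken from the du output):
#   crictl inspect --output go-template \
#     --template '{{.status.metadata.name}} {{index .status.labels "io.kubernetes.pod.name"}} {{.info.runtimeSpec.root.path}}' \
#     $(crictl ps -q) | owner_of_layer "$LAYER_DIR"
```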

Finding which container images are taking the most space

See How to check the storage used by images in /var/lib/containers/storage/

See Also

How to set usage limits for ephemeral storage in OpenShift 4
Monitor ephemeral storage consumption by individual pod in RHOCP
