Troubleshooting OpenShift Container Platform 4.x: Storage
This article is part of the OpenShift Container Platform 4.x troubleshooting series.
Index
- Basic data to gather and questions to ask
- Tests
  - dd
  - fio
- Tools
  - oc rsync
  - collectl
  - sysstat
Basic data to gather and questions to ask:
- Depending on the storage provisioner used, gather the provisioner pod logs from the storage provider project:
$ oc adm inspect ns/<project-name>
- If there are attach/detach controller events, check the kube-controller-manager and kube-apiserver logs.
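The log collection above can be done with `oc adm inspect`; the namespaces below are the standard homes for those control-plane components in OCP 4.x, and the destination directory names are arbitrary:

```shell
# Collect logs and resources for the control-plane components involved
# in attach/detach; requires cluster-admin access to a live cluster
oc adm inspect ns/openshift-kube-controller-manager --dest-dir=inspect.kcm
oc adm inspect ns/openshift-kube-apiserver --dest-dir=inspect.kas
```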
- An ocs-must-gather if the "OpenShift Container Storage" or "OpenShift Data Foundation" product is in use.
- With OCS version 4.8 or below:
# oc adm must-gather --image=registry.redhat.io/ocs4/ocs-must-gather-rhel8:v4.8 --dest-dir=<directory-name>
Note: Replace v4.8 with the respective version.
- With ODF version 4.9 or higher:
# oc adm must-gather --image=registry.redhat.io/odf4/ocs-must-gather-rhel8:v4.11 --dest-dir=<directory-name>
Note: Replace v4.11 with the respective version.
- sosreports from the nodes where the storage is being mounted / the pod(s) are being scheduled.
- Identify client messages (search the node journal for the mounting of the storage to the node).
- Identify kubelet messages (the kubelet leverages client utilities to mount storage):
$ oc adm node-logs --role=master -u kubelet
$ oc adm node-logs --role=worker -u kubelet
- Collect the component(s) namespace events from during that time period.
- Is the issue affecting workloads in all namespaces or only a subset?
Tests
- Emulate the mounting of the storage. The PV object contains all of the information that OCP needs to pass to the provisioner/cloud provider to mount the storage.
- Perform a touch test from either within the problem pod or from a test pod with the same/similar PVC and PV configured and mounted. This is commonly used for permission issues, to identify whether the configured uid/gid has permission to create a file on the filesystem path specified in the spec.containers[].volumeMounts of the pod.
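As a minimal sketch of the touch test: inside the real pod (`oc rsh <pod-name>`) you would target the actual mount path from the pod's volumeMounts; here a temporary directory stands in for that path so the sequence can be run anywhere:

```shell
# Touch test sketch; the temp directory is a stand-in for the PV mount
# path (e.g. a path listed under the pod's volumeMounts)
vol=$(mktemp -d)
id                        # shows the uid/gid attempting the write
touch "$vol/.write-test" && result="write OK" || result="write DENIED"
echo "$result"            # "write DENIED" indicates a permission/fsGroup issue
rm -rf "$vol"
```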
dd
- dd is a basic tool for testing reads from and writes to a file. It gives very general performance data about throughput and is typically only used as a starting point for gauging read/write performance. The dd command reads one block of input, processes it, and writes it to the specified output. Example: write an ISO file to a USB drive (/dev/sdb).
- NOTE: The ISO is written to sdb and not sdbN. The ISO file contains partitions! When the ISO is written to /dev/sdb (a device file), the partition metadata in the ISO is written to the sdb device as well. Run ls /dev after running the dd command and it will return sdb1, sdb2, etc.
$ dd if=/path/to/ISO of=/path/to/device
Ex. $ dd if=/home/username/fedora34.iso of=/dev/sdb bs=512 status=progress
$ dd bs=1M count=400 if=/dev/zero of=test.dd conv=fsync
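The write test above can be paired with a read-back pass to sanity-check both directions. In this sketch /tmp stands in for the volume mount path, and 4 MiB is deliberately tiny for illustration; a meaningful test should use a file larger than the page cache:

```shell
# Write a small test file, forcing data to disk with fsync, then read it back
dd bs=1M count=4 if=/dev/zero of=/tmp/dd-test.bin conv=fsync
dd bs=1M if=/tmp/dd-test.bin of=/dev/null
size=$(stat -c %s /tmp/dd-test.bin)   # confirm the expected 4 MiB were written
echo "$size bytes"
rm -f /tmp/dd-test.bin
```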
- fio
fio (https://linux.die.net/man/1/fio) is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user. The typical use of fio is to write a job file matching the I/O load one wants to simulate.
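A minimal example job file, assuming a hypothetical mount path /mnt/data for the volume under test; run it with `fio <jobfile>` and adjust size/runtime for real measurements:

```ini
; seq-write job: sequential 1 MiB direct writes against the volume under test
; (directory is an assumed mount path)
[seq-write]
rw=write
bs=1M
size=100M
directory=/mnt/data
ioengine=libaio
direct=1
numjobs=1
```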
Tools
- oc rsync
In some cases, workloads in OCP use rsync to read/write data from the backing storage. Rsync is special because it uses delta functions that minimize the amount of data sent across the network.
- Rsync is a fast and versatile file-copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. Rsync finds the files that need to be transferred using an algorithm that (by default) looks for files that have changed in size or in last-modified time. Any changes in the other preserved attributes (as requested by options) are made on the destination file directly when the quick check indicates that the file's data does not need to be updated.
- Very useful for backing up (or migrating) the data from persistent volumes!
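For example, a PV's contents can be pulled down for backup with `oc rsync` (the pod name, namespace, and mount path below are placeholders):

```shell
# Copy the volume's contents from the pod to a local directory; the trailing
# slash on the source copies the directory's contents rather than the
# directory itself
oc rsync <pod-name>:/mnt/data/ ./pv-backup/ -n <namespace>
```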
- collectl
Gathers disk statistics over time. Not as granular as fio, but provides insight into whether time-scoped issues are related to spikes in specific values collected by collectl.
- The default directory where collectl logs are stored is /var/log/collectl. Do an rsh to the pod or node and gather these logs as they are generated.
- sysstat/sar
In RHEL sosreports there is plaintext sar data in /var/log/sa/sarN. These files are written at midnight, so to inspect the current day's plaintext sar data, generate it with sa1/sa2 (or read the binary saN file with sar directly). In future Red Hat CoreOS releases, sysstat will be configured by default.
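For example, today's binary sa file can be rendered as a plaintext report with sar itself; the day-of-month suffix and path follow the sysstat defaults on RHEL:

```shell
# Render today's binary sysstat data file as a full plaintext report;
# /var/log/sa/saDD is the sysstat default location and requires sysstat
# to be installed and collecting
sar -A -f /var/log/sa/sa"$(date +%d)"
```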