Pods failing with "error loading seccomp filter into kernel: errno 524" in OpenShift 4
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4.12
  - 4.13
Issue
- There are pods in CreateContainerError, ContainerStatusUnknown or Pending status.
- Checking the events in the namespace where the pod is failing, or running the oc describe pod command, shows the following errors:

    Readiness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:38Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    Error: container create failed: time="2023-08-25T12:24:32Z" level=error msg="runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
    Liveness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:43Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    Readiness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:34Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    Readiness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:33Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    Error: container create failed: time="2023-08-25T12:01:54Z" level=error msg="runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
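When the event list is long, a small filter can help isolate the messages quoted above. This is only a sketch: the function name filter_seccomp_524 is made up for illustration, and it simply matches the literal error text from the events shown in this article.

```shell
# Hypothetical helper: reads text on stdin and prints only the lines that
# report the seccomp errno 524 failure described in this article.
filter_seccomp_524() {
    grep -F 'error loading seccomp filter: errno 524'
}

# Possible usage (not run here; <namespace> is a placeholder):
#   oc get events -n <namespace> | filter_seccomp_524
```

An empty result suggests the failing pods are affected by a different problem.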
Resolution
This bug is fixed in OpenShift 4.13.10, which is based on RHEL 9.2, by errata RHSA-2023:5069, and also in OpenShift 4.12.47 and 4.11.57, which are based on RHEL 8.6, by errata RHSA-2024:0412.
IMPORTANT NOTE: If a similar issue occurs in newer releases that include the above fixes, refer to "high load and a lot of zombie processes in OpenShift 4" for additional troubleshooting.
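To compare a cluster version against the fixed releases listed above, something like the following sketch could be used. The function names are hypothetical, the thresholds are taken directly from the Resolution text, and the comparison assumes a sort implementation that supports -V (as GNU coreutils does). Minor releases not mentioned in this article are reported as "unknown" rather than guessed.

```shell
# Hypothetical helper: prints the first fixed z-stream for a given version,
# using only the releases named in the Resolution section.
min_fixed_version() {
    case $1 in
        4.11.*) echo 4.11.57 ;;
        4.12.*) echo 4.12.47 ;;
        4.13.*) echo 4.13.10 ;;
        *) return 1 ;;
    esac
}

# Prints "fixed", "affected" or "unknown" for a version string such as 4.12.30.
seccomp_fix_status() {
    ver=$1
    if ! min=$(min_fixed_version "$ver"); then
        echo unknown
        return 0
    fi
    # sort -V orders version strings; if the threshold sorts first (or is
    # equal to the given version), the given version includes the fix.
    lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
    if [ "$lowest" = "$min" ]; then
        echo fixed
    else
        echo affected
    fi
}

# Possible usage (not run here):
#   seccomp_fix_status "$(oc get clusterversion version -o jsonpath='{.status.desired.version}')"
```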
Workaround (only for old 4.12 and 4.13 versions that do not include the fix)
The bug can still be hit while upgrading to a version that includes the fix. As a workaround, increase the value of net.core.bpf_jit_limit, which can be done in two different ways. It is recommended to revert the workaround after upgrading to a version that includes the fix.
Note: In RHEL 8 kernels, even specifying 2^64 - 1 == 18446744073709551615 (i.e. the maximum value of an unsigned 64-bit integer) does not result in -EINVAL; the value is silently ignored. It is advisable to assume that the maximum allowable value may be capped at 0x3F000000 == 1056964608 (approximately 1 GB), as future kernels may enforce this cap.
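Because an out-of-range value is silently ignored rather than rejected, it may be worth validating a candidate value before applying it. The following is a minimal sketch, assuming the 0x3F000000 cap from the note above is the right upper bound; the variable and function names are made up for illustration.

```shell
# Upper bound suggested in the note above (0x3F000000).
BPF_JIT_LIMIT_CAP=1056964608

# Hypothetical helper: succeeds (exit 0) only if the argument is a positive
# decimal integer at or below the cap.
bpf_jit_limit_is_safe() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely digits
    esac
    # The cap has 10 digits; anything longer is above it. Rejecting it here
    # also avoids integer overflow in the shell comparison below.
    [ "${#1}" -le 10 ] || return 1
    [ "$1" -gt 0 ] && [ "$1" -le "$BPF_JIT_LIMIT_CAP" ]
}
```

For example, bpf_jit_limit_is_safe 364241152 succeeds, while bpf_jit_limit_is_safe 18446744073709551615 fails.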
First method
Change net.core.bpf_jit_limit directly on a node:
- Connect to the impacted node using a debug pod (or using ssh if available):

    $ oc debug node/[node_name]
    sh-4.4# chroot /host bash

- Increase the value of net.core.bpf_jit_limit:

    [root@node_name /]# sysctl net.core.bpf_jit_limit=364241152

  Note: This change does not persist across a node reboot. The default limit value should be 264241152, so the suggestion is to increase its leading digit by 1 (to 364241152). However, some situations may require a larger value.
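The suggested value can also be derived from whatever limit a node currently reports, instead of assuming the default. This sketch adds one unit at the position of the leading digit, which turns 264241152 into 364241152; the function name is hypothetical.

```shell
# Hypothetical helper: adds 1 to the leading digit of a decimal number by
# adding 10^(number of digits - 1).
bump_leading_digit() {
    n=$1
    case $n in
        ''|*[!0-9]*) return 1 ;;   # accept only plain decimal digits
    esac
    step=1
    rest=${n#?}                    # everything after the first digit
    while [ -n "$rest" ]; do
        step=$((step * 10))
        rest=${rest#?}
    done
    echo $((n + step))
}

# Possible usage on the node (not run here):
#   sysctl net.core.bpf_jit_limit="$(bump_leading_digit "$(cat /proc/sys/net/core/bpf_jit_limit)")"
```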
Second method
Change the value of net.core.bpf_jit_limit on all nodes via a MachineConfig. The MachineConfig should be removed after upgrading to a version that includes the fix.
- Create a MachineConfig resource including the following lines:

    spec:
      kernelArguments:
        - sysctl.net.core.bpf_jit_limit=364241152

  Note: Follow the OpenShift documentation about using MachineConfig to add kernel arguments to nodes.
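For reference, a complete manifest around those lines might be generated as sketched below. The resource name 99-<role>-bpf-jit-limit and the function name are assumptions made for illustration; the role label follows the usual MachineConfig convention, but the OpenShift documentation remains the authority on the exact resource layout.

```shell
# Hypothetical helper: prints a MachineConfig manifest for the given role
# (for example worker or master) and an optional limit value.
render_bpf_jit_machineconfig() {
    role=$1
    limit=${2:-364241152}   # value used in this article unless overridden
    cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: ${role}
  name: 99-${role}-bpf-jit-limit
spec:
  kernelArguments:
    - sysctl.net.core.bpf_jit_limit=${limit}
EOF
}

# Possible usage (not run here):
#   render_bpf_jit_machineconfig worker | oc apply -f -
```

Keeping the manifest in a single named resource makes it straightforward to delete once the cluster has been upgraded to a fixed version.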
Root Cause
This is a known kernel bug that introduced a seccomp memory leak.
Diagnostic Steps
- Connect to the impacted node using a debug pod (or using ssh if available):

    $ oc debug node/[node_name]
    sh-4.4# chroot /host

- Check whether bpf_jit is enabled:

    [root@node_name /]# cat /proc/sys/net/core/bpf_jit_enable

- Check the bpf_jit limit value:

    [root@node_name /]# cat /proc/sys/net/core/bpf_jit_limit

- Check whether the bpf_jit value is hitting the bpf_jit_limit:

    [root@node_name /]# cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'

- If the bpf_jit value is not hitting the bpf_jit_limit, refer to "high load and a lot of zombie processes in OpenShift 4" for additional troubleshooting.
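The last two checks can be combined into a single comparison. The sketch below uses hypothetical function names and treats the sum of the bpf_jit entries in /proc/vmallocinfo as an approximation of the memory counted against the limit, as the steps above do.

```shell
# Hypothetical helper: sums the size column (field 2) of the bpf_jit entries
# read from stdin in /proc/vmallocinfo format and prints the total in bytes.
bpf_jit_usage_bytes() {
    awk '/bpf_jit/ {s += $2} END {printf "%.0f\n", s}'
}

# Hypothetical helper: prints usage as an integer percentage of the limit.
bpf_jit_usage_pct() {
    used=$1
    limit=$2
    echo $((used * 100 / limit))
}

# Possible usage on the node (not run here):
#   used=$(bpf_jit_usage_bytes < /proc/vmallocinfo)
#   limit=$(cat /proc/sys/net/core/bpf_jit_limit)
#   echo "bpf_jit usage: ${used} of ${limit} bytes ($(bpf_jit_usage_pct "$used" "$limit")%)"
```

A percentage close to 100 indicates the limit is being hit; a low percentage points to the other troubleshooting article mentioned above.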