Pods failing with error loading seccomp filter into kernel: errno 524 in OpenShift 4

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.12
    • 4.13

Issue

  • There are pods in CreateContainerError, ContainerStatusUnknown or Pending status.

  • Checking the events in the namespace where the pod is failing, or running the oc describe pod command, shows the following errors:

        Readiness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:38Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    
        Error: container create failed: time="2023-08-25T12:24:32Z" level=error msg="runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
    
        Liveness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:43Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    
        Readiness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:34Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    
        Readiness probe errored: rpc error: code = Unknown desc = command error: time="2023-08-25T13:00:33Z" level=error msg="exec failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524", stdout: , stderr: , exit code -1
    
        Error: container create failed: time="2023-08-25T12:01:54Z" level=error msg="runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
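
To gauge how widespread the problem is, the event text can be filtered for the error string. The following is a minimal sketch, assuming the event output is piped in on stdin; count_seccomp_524 is a hypothetical helper name, not part of the oc CLI.

```shell
# Sketch: count lines on stdin that contain the seccomp errno 524 error.
# count_seccomp_524 is a hypothetical helper name used for illustration.
count_seccomp_524() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the
  # helper usable in scripts that run with "set -e".
  grep -c 'error loading seccomp filter: errno 524' || true
}

# Example usage on a live cluster (requires oc and cluster access):
#   oc get events -n [namespace] | count_seccomp_524
```

A non-zero count in a namespace suggests that the pods there are affected by this issue.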
    

Resolution

This bug is fixed in OpenShift 4.13.10, which is based on RHEL 9.2, by errata RHSA-2023:5069, and in OpenShift 4.12.47 and 4.11.57, which are based on RHEL 8.6, by errata RHSA-2024:0412.

IMPORTANT NOTE: if a similar issue occurs in newer releases that already include the above fixes, refer to high load and a lot of zombie processes in OpenShift 4 for additional troubleshooting.

Workaround (only for older 4.12 and 4.13 versions that do not include the fix)

The bug can also be hit during the upgrade to a version that includes the fix. As a workaround, increase the value of net.core.bpf_jit_limit using one of the two methods below. It is recommended to revert the workaround after upgrading to a version that includes the fix.

Note: on RHEL 8 kernels, setting a value as large as 2^64 - 1 == 18446744073709551615 (the maximum unsigned 64-bit integer) does not return -EINVAL; the value is silently ignored. It is advisable to treat 0x3F000000 == 1056964608 (approximately 1 GB) as the maximum allowable value, since future kernels may enforce that cap.

First method

Changing net.core.bpf_jit_limit directly on a node:

  • Connect to the impacted node using a debug pod (or using ssh if available):

    $ oc debug node/[node_name]
    sh-4.4# chroot /host bash
    
  • Increase the value of net.core.bpf_jit_limit:

    [root@node_name /]# sysctl net.core.bpf_jit_limit=364241152
    

    Note: This change does not persist across node reboots. The default limit value should be 264241152, so the suggestion is to increase the leading digit by 1 (from 264241152 to 364241152). However, some situations may require a larger value.
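
When a larger value is needed, it can help to check it against the 0x3F000000 cap mentioned in the workaround note before applying it. The following is a minimal sketch; clamp_bpf_jit_limit and BPF_JIT_LIMIT_CAP are hypothetical names introduced for illustration, and treating the cap as a hard limit is an assumption taken from that note.

```shell
# Cap suggested in the workaround note: 0x3F000000 bytes (about 1 GB).
# Shell arithmetic converts the hexadecimal constant to decimal.
BPF_JIT_LIMIT_CAP=$((0x3F000000))

# clamp_bpf_jit_limit is a hypothetical helper: it prints the requested
# value, or the cap when the requested value exceeds it.
clamp_bpf_jit_limit() {
  requested=$1
  if [ "$requested" -gt "$BPF_JIT_LIMIT_CAP" ]; then
    echo "$BPF_JIT_LIMIT_CAP"
  else
    echo "$requested"
  fi
}

# Example usage as root on the node:
#   sysctl net.core.bpf_jit_limit="$(clamp_bpf_jit_limit 364241152)"
```

The helper only validates the number; the sysctl command itself is unchanged from the step above.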

Second method

Changing the value of net.core.bpf_jit_limit on all nodes via MachineConfig. The MachineConfig should be removed after upgrading to a version that includes the fix.

  • Create a MachineConfig resource including the following lines:

    spec:
      kernelArguments:
        - sysctl.net.core.bpf_jit_limit=364241152
    

    Note: Follow the OpenShift documentation on using MachineConfig to add kernel arguments to nodes.
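
For reference, the spec fragment above could be placed in a complete manifest along the following lines. This is a sketch under stated assumptions: the object name 99-worker-bpf-jit-limit and the worker role label are hypothetical choices, and a separate MachineConfig would be needed for each MachineConfigPool (for example, master).

```shell
# Write a complete MachineConfig manifest to a local file.
# The name 99-worker-bpf-jit-limit and the worker role label are
# assumptions; adjust them for the target MachineConfigPool.
cat > 99-worker-bpf-jit-limit.yaml <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-bpf-jit-limit
spec:
  kernelArguments:
    - sysctl.net.core.bpf_jit_limit=364241152
EOF

# Apply on a live cluster (the nodes in the pool reboot one by one):
#   oc create -f 99-worker-bpf-jit-limit.yaml
# Remove after upgrading to a version that includes the fix:
#   oc delete machineconfig 99-worker-bpf-jit-limit
```

Because kernel arguments are applied at boot, expect a rolling reboot of the affected nodes both when the MachineConfig is created and when it is removed.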

Root Cause

This is a known kernel bug that introduced a seccomp memory leak.

Diagnostic Steps

  • Connect to the impacted node using a debug pod (or using ssh if available):

    $ oc debug node/[node_name]
    sh-4.4# chroot /host
    
  • Check if the bpf_jit is enabled:

    [root@node_name /]# cat /proc/sys/net/core/bpf_jit_enable
    
  • Check the bpf_jit limit value:

    [root@node_name /]# cat /proc/sys/net/core/bpf_jit_limit
    
  • Check whether the memory currently allocated to bpf_jit is hitting the bpf_jit_limit. The following command sums the size in bytes of all bpf_jit allocations:

    [root@node_name /]# grep bpf_jit /proc/vmallocinfo | awk '{s+=$2} END {print s}'
    
  • If the bpf_jit value is not hitting the bpf_jit_limit, please refer to high load and a lot of zombie processes in OpenShift 4 for additional troubleshooting.
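
The comparison in the last two steps can be combined into one command. The following is a minimal sketch; bpf_jit_usage is a hypothetical helper name, and its optional file arguments exist only so the function can be tried against sample files instead of the real /proc entries.

```shell
# bpf_jit_usage is a hypothetical helper. With no arguments it reads the
# real /proc files (root is required to see sizes in /proc/vmallocinfo).
bpf_jit_usage() {
  vmallocinfo=${1:-/proc/vmallocinfo}
  limit_file=${2:-/proc/sys/net/core/bpf_jit_limit}
  limit=$(cat "$limit_file")
  # Column 2 of vmallocinfo is the allocation size in bytes.
  awk -v limit="$limit" '
    /bpf_jit/ { used += $2 }
    END {
      pct = (limit > 0) ? used * 100 / limit : 0
      printf "used=%d limit=%d percent=%d\n", used, limit, pct
    }
  ' "$vmallocinfo"
}

# Example usage as root on the node:
#   bpf_jit_usage
```

A percentage close to 100 indicates that the limit is being hit and that the workaround or the fixed release applies.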
