Cgroups v2 in OpenJDK container in Openshift 4

Solution Verified - Updated

Environment

  • Red Hat Build of OpenJDK
    • 8
    • 11
    • 17
    • 21
  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.13+

Issue

  • Cgroups v2 in OpenJDK container in Openshift 4
  • Cgroups v2 in Red Hat containers
  • OOMKill in OCP 4.14, although it worked on OCP 4.13 and earlier

Resolution

Solution:

OpenJDK detects cgroups v2 in the following versions:

  • OpenJDK 8u372 and later.
  • OpenJDK 11.0.16 and later.
  • Any OpenJDK 17 or later release (such as OpenJDK 21).

Failure to detect cgroups v2 means the container's settings/boundaries are not detected and the host's resources are used as boundaries instead, which eventually leads to an OOMKill when the container exceeds its cgroup memory limit.
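A quick way to observe this from inside the container is to check which maximum heap the JVM computes. This is a minimal sketch, assuming a JDK is on the PATH; the function name is illustrative. When container limits are detected, the heap is sized from the cgroup memory limit; on an undetected cgroups v2 host it is sized from the node's total RAM, which is what eventually triggers the OOMKill.

```shell
# Sketch: show the maximum heap the JVM computes (assumes a JDK on PATH).
check_jvm_heap() {
  if command -v java >/dev/null 2>&1; then
    # -XshowSettings:vm prints the estimated Max. Heap Size
    java -XshowSettings:vm -version 2>&1 | grep -i "heap" \
      || echo "no heap settings printed"
  else
    echo "java not found on PATH"
  fi
}
check_jvm_heap
```

Compare the printed heap size against the container's memory limit: if it is a fraction of the node's RAM rather than the container's limit, detection has failed.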

The OpenJDK container currently detects cgroups via upstream OpenJDK native (C/C++) code, not via scripts. That differs from the JBoss EAP 7 images, which rely on scripts for detection.

This means other products, such as JBoss EAP and AMQ, do not necessarily rely on OpenJDK for cgroups v2 detection and may rely on their own scripts for detection and customization. For details on the cgroups version in JBoss EAP 7 images, see EAP 7 images cgroups version. To detect cgroups v2, see Verifying Cgroup v2 Support in OpenJDK Images.

Root Cause

The cgroup version is a kernel feature, i.e. it is inherited from the node (host system) the container (or the application) is running on. In containerized applications, the container inherits the cgroup version; it does not impose it, and the container itself cannot change it. All it can do is detect it. If OpenJDK is not at level 8u372 or higher and the host system uses cgroups v2, it fails to detect any settings, i.e. the cgroups provider is neither cgroupv2 nor cgroupv1, presenting the following output:

Operating System Metrics:
    No metrics available for this platform

So if you are seeing cgroupv1 for some Java application, it means either:
a- the node the container is running on uses cgroups v1, or
b- the OCP node is set to cgroups v2, but cgroups v2 is not being detected.
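The two cases above can be told apart by checking which cgroup filesystem the container actually sees. The following is a sketch only; the function name and messages are illustrative, and the filesystem-type values are the ones listed in the diagnostic table below.

```shell
# Sketch: determine which cgroup version the container sees.
detect_cgroup_version() {
  case "$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null)" in
    cgroup2fs) echo "v2" ;;      # unified hierarchy: cgroups v2
    tmpfs)     echo "v1" ;;      # legacy hierarchy: cgroups v1
    *)         echo "unknown" ;;
  esac
}

case "$(detect_cgroup_version)" in
  v1) echo "case (a): the node uses cgroups v1" ;;
  v2) echo "case (b) suspected: node uses cgroups v2; if the JVM still reports cgroupv1 or no metrics, the OpenJDK version is too old" ;;
  *)  echo "could not determine the cgroup version" ;;
esac
```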

Cgroups v1 vs v2

Cgroups v2 changed the hierarchy from cgroups v1. For instance, cpu_cpuset_cpus is removed; in its place is the cpuset family of values. So instead of having a cpu controller group as in cgroups v1, there is now a separate controller group called cpuset to accomplish the same task. (Reference content from www.kernel.org is not included here.)

Meaning cgroups v2 has more modular controllers: it now has cgroup, cpu, cpuset, io, irq, memory, misc, and pids controllers, among others. Cgroups v2 now has a modular hierarchy like so:

  • cgroupsv2
  • modules
    • module values

So cpu_cpuset_cpus, which was one file in cgroups v1, is now the cpuset module with the various options in that module.
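On a cgroups v2 host this modularity is visible directly in the filesystem: the cpuset controller's options are separate files under the unified hierarchy. A minimal check, assuming the container's root cgroup is mounted at /sys/fs/cgroup:

```shell
# Sketch: list the cpuset module's option files on a cgroups v2 host.
ls /sys/fs/cgroup/cpuset.* 2>/dev/null \
  || echo "no cpuset.* files (cgroups v1 host, or controller not enabled)"
```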

There are also changes in the reporting of the data itself; see the solution Java's memory consumption inside an OpenShift 4 container for more details.

Diagnostic Steps

  1. For detection cgv2, see Verifying Cgroup v2 Support in OpenJDK Images.
  2. Run Java with -Xlog:os+container=trace instead of -XshowSettings:system for more details.
  3. For the OCP node, the following command outputs help distinguish the versions:

Command                          Cgroups v1 output    Cgroups v2 output
grep cgroup /proc/filesystems    nodev cgroup         nodev cgroup, nodev cgroup2
stat -fc %T /sys/fs/cgroup/      tmpfs                cgroup2fs

Example:

### CGV2
$ stat -fc %T /sys/fs/cgroup/
cgroup2fs
### CGV1
$ stat -fc %T /sys/fs/cgroup/
tmpfs
### CGv2:
grep cgroup /proc/filesystems
nodev   cgroup
nodev   cgroup2
### CGv1:
grep cgroup /proc/filesystems
nodev   cgroup
  4. The jcmd VM.info output will include the following:
container_type: cgroupv2 <-------------- cgv2
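To collect that line from a running JVM, jcmd can be pointed at the Java process. This is a sketch, assuming jcmd from the same JDK is on the PATH and that PID 1 is the Java process (common in containers, but adjust the PID for your deployment):

```shell
# Sketch: query a running JVM's container detection via jcmd.
# On a detected cgroups v2 host, VM.info reports container_type: cgroupv2.
if command -v jcmd >/dev/null 2>&1; then
  jcmd 1 VM.info 2>/dev/null | grep container_type \
    || echo "no container_type line (JVM too old, or PID 1 is not a JVM)"
else
  echo "jcmd not found on PATH"
fi
```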

Cgroups details:

Statistic                               Meaning
/sys/fs/cgroup/memory.current           Current memory usage in bytes
/sys/fs/cgroup/memory.max               Memory limit in bytes ("max" if unlimited)
/sys/fs/cgroup/cgroup.stat              nr_descendants and nr_dying_descendants
/sys/fs/cgroup/memory.stat              Memory statistics, including allocation breakdown
/sys/fs/cgroup/cpuset.cpus.effective    Set of CPUs the task can run on
/sys/fs/cgroup/cpu.max                  CPU bandwidth limit (quota and period)

Example

$ cat /sys/fs/cgroup/memory.stat 
anon 37994496 <--------------
…
sh-4.4$ cat /sys/fs/cgroup/cgroup.stat 
nr_descendants 0
nr_dying_descendants 0
...
sh-4.4$ cat /sys/fs/cgroup/memory.current 
40001536
sh-4.4$ cat /sys/fs/cgroup/memory.max  <--- provides the max container size limit in bytes
2147483648
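The two memory files above can be combined to report how close the container is to its limit. This is a sketch only; the function name is illustrative, and it accounts for memory.max containing the literal string "max" when no limit is set:

```shell
# Sketch: report container memory usage against the cgroups v2 limit.
report_memory_usage() {
  cur_file=/sys/fs/cgroup/memory.current
  max_file=/sys/fs/cgroup/memory.max
  if [ -r "$cur_file" ] && [ -r "$max_file" ]; then
    cur=$(cat "$cur_file")
    max=$(cat "$max_file")
    if [ "$max" = "max" ]; then
      echo "usage: $cur bytes (no memory limit set)"
    else
      # integer percentage of the limit currently in use
      echo "usage: $cur / $max bytes ($((cur * 100 / max))%)"
    fi
  else
    echo "cgroups v2 memory files not found (cgroups v1 host?)"
  fi
}
report_memory_usage
```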

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.