Long G1 young pause collection due to external root scanning
Environment
- OpenJDK G1 collector
Issue
- Many long G1 young pause collections, one after another, due to external root scanning. For example:
12345.678: [GC pause (G1 Evacuation Pause) (young), 8.9540383 secs]
[Parallel Time: 8902.8 ms, GC Workers: 18]
[GC Worker Start (ms): Min: 42422201.9, Avg: 42422202.9, Max: 42422204.0, Diff: 2.1]
[Ext Root Scanning (ms): Min: 8881.9, Avg: 8887.2, Max: 8899.2, Diff: 17.4, Sum: 159969.9]
[Update RS (ms): Min: 0.0, Avg: 3.1, Max: 5.5, Diff: 5.5, Sum: 56.5]
[Processed Buffers: Min: 0, Avg: 8.4, Max: 37, Diff: 37, Sum: 152]
[Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.0]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 2.3, Avg: 9.0, Max: 11.5, Diff: 9.3, Sum: 161.1]
[Termination (ms): Min: 0.0, Avg: 1.4, Max: 1.8, Diff: 1.8, Sum: 24.7]
[Termination Attempts: Min: 1, Avg: 6.2, Max: 13, Diff: 12, Sum: 112]
[GC Worker Other (ms): Min: 0.1, Avg: 0.3, Max: 0.4, Diff: 0.3, Sum: 4.9]
[GC Worker Total (ms): Min: 8900.0, Avg: 8901.0, Max: 8902.1, Diff: 2.1, Sum: 160218.1]
[GC Worker End (ms): Min: 42431103.7, Avg: 42431103.9, Max: 42431104.1, Diff: 0.3]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 1.3 ms]
[Other: 49.9 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 1.5 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 4.8 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.7 ms]
[Eden: 748.0M(748.0M)->0.0B(744.0M) Survivors: 61440.0K->65536.0K Heap: 5553.3M(16192.0M)->4807.5M(16192.0M)]
[Times: user=10.50 sys=0.09, real=8.96 secs]
Root Cause
At the beginning of a young collection, the parallel GC worker threads scan external roots (objects allocated outside the Java heap) to find any that reach into the current collection set (CSet)[1].
Objects in the CSet referenced by an external root are "live" and not eligible to be collected.
External roots:
- Thread stack roots
- JNI
- Global variables
- System dictionary
synchronizedmonitors that have been inflated to an OS level monitor (e.g.BiasedLockingdisabled and there is contention on the lock)
The following are known issues that could cause the number of external roots to increase dramatically, causing scanning to take an unexpectedly long amount of time:
- Classloader generation (e.g. a leak or churn due to lack of reuse).
- Huge numbers of threads (e.g. thread leak).
Reference:
[1] This content is not included.Collecting and reading G1 garbage collector logs - part 2
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.