What do the multiple reported times mean when a Full GC occurs as the result of a concurrent mode failure?
Environment
- Java
- Concurrent Mark Sweep (CMS) collector GC logs
Issue
-
How to interpret concurrent mode failure timestamps in Garbage Collection (GC) logs
-
In the following log snippet, 1.264, is the CPU, while 72740 is what is classified as wall time', but at the end of the gc log entry, have another two elements about cpu time, that would be user and sys, do they have any relation with '1.264' I can find from the gc log entry(so does '72740.163' and '72729.22')? And now I have '1.264/72740.163' and 'Times: user=3.86 sys=0.00, real=2.86 secs' two block about, which of the block is more meaningful to me?
8477579.922: [Full GC 8477579.922: [CMS8477580.934: [CMS-concurrent-mark: 1.264/72740.163 secs] [Times: user=393.00 sys=43.13, real=72729.22 secs] (concurrent mode failure): 3075148K->594654K(3145728K), 2.8537840 secs] 4018892K->594654K(4089472K), [CMS Perm : 160302K->159744K(393216K)] icms_dc=22 , 2.8540090 secs] [Times: user=3.86 sys=0.00, real=2.86 secs] -
Need more information to understand / interpret the GC logs where concurrent mode failure is concerned
Resolution
The reason for there being more than one set of times in the log line is because there were multiple related events that occurred:
- There was a collection occurring in the background
- That collection was in the concurrent-mark phase
- It was aborted due to a concurrent mode failure
- That caused a Full Garbage Collection to occur
This means that you need to look at the times in terms of which particular event it was in reference to:
[Full GC 8477579.922:
[CMS8477580.934:
[CMS-concurrent-mark: 1.264/72740.163 secs]
[Times: user=393.00 sys=43.13, real=72729.22 secs]
(concurrent mode failure): 3075148K->594654K(3145728K), 2.8537840 secs
]
4018892K->594654K(4089472K),
[CMS Perm : 160302K->159744K(393216K)] icms_dc=22 , 2.8540090 secs
] [Times: user=3.86 sys=0.00, real=2.86 secs]
- The concurrent-mark phase used 1.264 seconds of CPU time over the period of 72740.163 seconds of real-time.
- The CMS collection (which that concurrent-mark phase was a part of) used 393 seconds of user time and 43.13 seconds of kernel time over a period of 72729.22 seconds of real time.
- That CMS collection was aborted during the concurrent-mark phase because of a concurrent mode failure (i.e. we ran our of free memory before it finished).
- The Full GC that resulted took 3.86 seconds of user time over the period of 2.86 seconds of real time. The reason for user > real is that it was using multiple CPU cores to do the work.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.