JVM crashes in cfree+0x1c with GC log rotation enabled

Solution Unverified - Updated

Environment

  • OpenJDK 1.8u144
  • JBoss Enterprise Application Platform (EAP) 7.x

Issue

  • Our JVM is configured with GC log rotation (-XX:+UseGCLogFileRotation). It recently experienced a crash and it seems that occurred specifically when a GC log file filled and rotation occurred.
  • The hs_err_pid file shows a crash in cfree+0x1c from a ConcurrentGCThread:
Current thread (0x00007f936009b000):  ConcurrentGCThread [stack: 0x00007f93503f5000,0x00007f93504f5000] [id=12646]
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f92c1a75ff8
Stack: [0x00007f93503f5000,0x00007f93504f5000],  sp=0x00007f93504f3c28,  free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x8571c]  cfree+0x1c
  • A core dump backtrace shows the crash occurs in GC log activity:
#0  0x00007f26815e02c7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f26815e19b8 in __GI_abort () at abort.c:90
#2  0x00007f2680e7b8e9 in os::abort (dump_core=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1575
#3  0x00007f2681087056 in VMError::report_and_die (this=this@entry=0x7f267c764510) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1107
#4  0x00007f2680e858b5 in JVM_handle_linux_signal (sig=11, info=0x7f267c7647b0, ucVoid=0x7f267c764680, abort_if_unrecognized=<optimized out>)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
#5  0x00007f2680e78aa8 in signalHandler (sig=11, info=0x7f267c7647b0, uc=0x7f267c764680) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4555
#6  <signal handler called>
#7  __GI___libc_free (mem=0x7f26821e3000) at malloc.c:2941
#8  0x00007f268162682a in __GI__IO_free_backup_area (fp=fp@entry=0x7f2638025b30) at genops.c:202
#9  0x00007f2681625f95 in _IO_new_file_overflow (f=0x7f2638025b30, ch=-1) at fileops.c:828
#10 0x00007f2681624900 in _IO_new_file_xsputn (f=0x7f2638025b30, data=<optimized out>, n=7) at fileops.c:1326
#11 0x00007f2681619692 in __GI__IO_fwrite (buf=0x7f267c764d20, size=1, count=7, fp=0x7f2638025b30) at iofwrite.c:39
#12 0x00007f2680e8822c in gcLogFileStream::write (this=0x7f2678003320, s=0x7f267c764d20 "416.724", len=7)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:830
#13 0x00007f2680e89c4e in print_raw (str=0x7f267c764d20 "416.724", this=0x7f2678003320) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/ostream.hpp:88
#14 stamp (this=0x7f2678003320) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:215
#15 stamp (suffix=0x7f26810c4307 ": ", prefix=0x7f26810e10d1 "", guard=<optimized out>, this=0x7f2678003320)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:225
#16 stamp (guard=<optimized out>, this=0x7f2678003320) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/ostream.hpp:103
#17 outputStream::gclog_stamp (this=0x7f2678003320, gc_id=...) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:251
#18 0x00007f2680a96fef in ConcurrentMarkThread::run (this=0x7f2678072000)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/g1/concurrentMarkThread.cpp:152
#19 0x00007f2680e7a832 in java_start (thread=0x7f2678072000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:846
#20 0x00007f2681fa7dd5 in start_thread (arg=0x7f267c765700) at pthread_create.c:307
#21 0x00007f26816a802d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Resolution

  • Upgrade to OpenJDK 1.8u262+
  • It seems most GC log rotation events will not crash for typical java application workloads. You may consider the following as mitigations to these rarer occurrences:
    • Disable GC logging altogether and accept one large GC log file (or manage log files externally with logrotate, splunk, etc.).
    • Set a larger -XX:GCLogFileSize to reduce the number of rotation events and the number of rare opportunities for such a crash.
    • Set -XX:NumberOfGCLogFiles=1 and accept logging loss if the log file reaches its max size.

Root Cause

A race condition between ConcurrentMarkThread and VMThread (which does gc file rotation):

Category

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.