Java application crashes with 'Segmentation fault'
Environment
- Red Hat JBoss Enterprise Application Platform (EAP)
- Linux
- CentOS
- Red Hat Enterprise Linux (RHEL)
- Sun/Oracle JDK
Issue
- JBoss crashes with this error in the console output:
  run.sh: line 283: 19837 Segmentation fault "$JAVA" $JAVA_OPTS -Djava.endorsed.dirs="$JBOSS_ENDORSED_DIRS" -classpath "$JBOSS_CLASSPATH" org.jboss.Main "$@"
- We have had serious problems since we migrated our production environment from EAP 4.3.0 to EAP 5.0.0 last night. One or two of our cluster nodes are dying with the following message in console.log ("Speicherzugriffsfehler" is the German-locale text for "Segmentation fault"):
  run.sh: line 283: 9076 Speicherzugriffsfehler "$JAVA" $JAVA_OPTS -Djava.endorsed.dirs="$JBOSS_ENDORSED_DIRS" -classpath "$JBOSS_CLASSPATH" org.jboss.Main "$@"
- The following error appears in the Linux /var/log/messages file:
  kernel: java[17601]: segfault at 00000000498a2ca8 rip 00002aac2144459d rsp 00000000498a2c90 error 6
- After one node in the cluster crashes, the other nodes crash in quick succession.
Resolution
- In some cases, the segmentation fault caused by a StackOverflowError has been avoided by setting a thread stack size of 5120k or higher (for example, -Xss5120k). With the larger stack, the JVM printed the actual error and stack trace instead of crashing with a segmentation fault.
- Increase the StackShadowPages JVM setting so that more stack space is reserved for native code:
  -XX:StackShadowPages=20
  Note that this leaves less stack for Java-level code, so the entire thread stack (-Xss) may also need to be increased so that the amount of stack available to Java-level code does not decrease as a consequence.
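The effect of a larger thread stack can be observed on a healthy JVM with a small experiment. The following is a minimal sketch and not part of the original solution: the names StackDepthDemo, Probe, and maxDepth are hypothetical, and it assumes a JVM that throws StackOverflowError rather than crashing (on an affected JVM, this code could itself trigger the segmentation fault). It uses the Thread constructor that accepts a stack size, which the Javadoc describes as a hint that some platforms may ignore, so it only approximates what -Xss does for all threads.

```java
// Sketch: measure how deep an unbounded recursion gets before
// StackOverflowError, for a requested per-thread stack size.
class StackDepthDemo {

    /** Counts recursion depth until the stack is exhausted. */
    static final class Probe implements Runnable {
        int depth = 0;

        private void descend() {
            depth++;
            descend(); // unbounded recursion on purpose
        }

        @Override
        public void run() {
            try {
                descend();
            } catch (StackOverflowError expected) {
                // A healthy JVM throws the error, so it can be caught here.
            }
        }
    }

    /** Runs the probe in a new thread that requests the given stack size. */
    static int maxDepth(long stackSizeBytes) {
        Probe probe = new Probe();
        // The stack size argument is a hint; the platform may round or ignore it.
        Thread t = new Thread(null, probe, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() makes the probe's final depth visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("depth with 512k stack:  " + maxDepth(512L * 1024));
        System.out.println("depth with 5120k stack: " + maxDepth(5120L * 1024));
    }
}
```

The absolute depths vary between runs and JVM versions, because JIT compilation changes the frame size, so only the relative difference between the two stack sizes is meaningful.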
Root Cause
- The Java thread stack size is being exceeded, and the JVM crashes instead of throwing java.lang.StackOverflowError. This behavior seems to be specific to the Sun JDK on Linux: with OpenJDK on Linux and the Sun JDK on Windows, the JVM does not crash and java.lang.StackOverflowError is thrown.
- When all nodes in a cluster crash one after another, it is because the issue is request related (for example, a use case is executed that results in deep recursion or an endless loop) and failover propagates the request to the other nodes. The request brings down the first node, failover sends it to the next node, which is brought down in turn, and so on until all nodes are down.
- See java.lang.StackOverflowError.
- Java code has to share the stack with native code such as socketWrite. A portion of the stack is reserved specifically for native code, as determined by the JVM's StackShadowPages setting. If StackShadowPages is too small, VM or native code calls can overflow the stack when the StackShadowPages space is all that is left after Java-level recursion has used the rest. If the stack overflow occurs in a native-layer call, the JVM crashes (potentially without an hs_err log) instead of throwing a Java-level StackOverflowError.
- Java crashes in SocketOutputStream.socketWrite0 from libnet.so
- [JVM Bug] JDK-7059899: Stack overflows in Java code cause 64-bit JVMs to exit due to SIGSEGV
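The cascade described above, where failover replays one bad request on node after node, can be illustrated with a toy model. This is only a sketch under stated assumptions: Node, newCluster, and dispatch are hypothetical names, the "crash" is a boolean flag standing in for the JVM segmentation fault, and the retry loop is a simplified stand-in for failover, not JBoss clustering code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: a request that crashes any node it reaches is retried on
// each remaining node until the whole cluster is down.
class FailoverCascadeDemo {

    /** Hypothetical cluster node that goes down on a "poison" request. */
    static final class Node {
        final String name;
        boolean alive = true;

        Node(String name) {
            this.name = name;
        }

        /** Returns true if the request was served, false if the node crashed. */
        boolean handle(boolean poisonRequest) {
            if (poisonRequest) {
                alive = false; // stands in for the JVM segmentation fault
                return false;
            }
            return true;
        }
    }

    static List<Node> newCluster(String... names) {
        List<Node> cluster = new ArrayList<>();
        for (String name : names) {
            cluster.add(new Node(name));
        }
        return cluster;
    }

    /** Naive failover: retry the same request on the next live node. */
    static List<String> dispatch(List<Node> cluster, boolean poisonRequest) {
        List<String> crashed = new ArrayList<>();
        for (Node node : cluster) {
            if (!node.alive) {
                continue; // skip nodes that are already down
            }
            if (node.handle(poisonRequest)) {
                break; // request served, no failover needed
            }
            crashed.add(node.name); // node went down, fail over to the next
        }
        return crashed;
    }

    public static void main(String[] args) {
        List<Node> cluster = newCluster("node1", "node2", "node3");
        // prints "crashed: [node1, node2, node3]"
        System.out.println("crashed: " + dispatch(cluster, true));
    }
}
```

In this model an ordinary request is served by the first live node and the crashed list stays empty, which is why the pattern of sequential crashes points to a specific request rather than to the nodes themselves.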
Diagnostic Steps
- Verify a fatal error log was not created.
- Get a core dump when the issue happens and analyze the backtrace and/or jstack output to see what the JVM was doing at the time of the crash. See Java application down due to JVM crash.
- If a core dump is not being created, check that the output of ulimit -c is not 0. If it is 0, core files cannot be created. To enable creation of core files, do one of the following:
  - Run the command ulimit -c unlimited from the terminal. This applies only to that user session/terminal and does not persist across a server reboot.
  - Configure the core size in /etc/security/limits.conf. Consult the system administrator about how to set this for the user running the Java application.
- If jstack runs on the core dump for a long time or seemingly without end, this further suggests that the JVM crashed due to a StackOverflowError stemming from very deep or endless recursion.
- If jstack returns an InvocationTargetException or VMVersionMismatchException, use the strace command to confirm which jstack version is being run; it should match the Java version of the application. For example: strace jstack.
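The core file limit that matters is the one inherited by the Java process itself, which can differ from the limit in an administrator's login shell. As a rough sketch, the check can be run from Java by starting a child shell, which inherits the JVM's limits. The class and method names here (CoreLimitCheck, coreFileLimit, coreDumpsDisabled) are hypothetical, and the code assumes a POSIX sh is available on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Sketch: report the core file size limit in effect for this JVM process.
class CoreLimitCheck {

    /** Runs "ulimit -c" in a child shell and returns its trimmed output. */
    static String coreFileLimit() throws IOException, InterruptedException {
        // ulimit is a shell builtin, so it has to run through sh -c.
        Process p = new ProcessBuilder("sh", "-c", "ulimit -c")
                .redirectErrorStream(true)
                .start();
        String line;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            line = r.readLine();
        }
        p.waitFor();
        return line == null ? "" : line.trim();
    }

    /** A limit of 0 means the kernel will not write a core file. */
    static boolean coreDumpsDisabled(String limit) {
        return "0".equals(limit);
    }

    public static void main(String[] args) throws Exception {
        String limit = coreFileLimit();
        System.out.println("core file limit: " + limit);
        if (coreDumpsDisabled(limit)) {
            System.out.println("core dumps are disabled for this process");
        }
    }
}
```

Running this as the same user and through the same startup path as the application gives the most representative result, since limits are inherited from the launching process.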
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.