JBoss EAP and JBoss Fuse unexpected shutdown

Solution Verified - Updated

Environment

  • Red Hat JBoss Enterprise Application Platform (EAP)
    • 8.x
    • 7.x
    • 6.x
    • 5.x
    • 4.3.0
  • JBoss Fuse
    • 6.x

Issue

  • Server shuts down frequently or unexpectedly
  • Mysterious shutdown
  • Unexpected auto restart
  • All JBoss instances in our app environment shut down automatically.
  • JBoss automatically stops without any changes.
  • JBoss EAP stops without any reason
  • JBoss EAP stops by itself

Resolution

The following are known resolutions:

  • Add the -Xrs JVM option to cause the Sun JVM to ignore SIGHUP and selected other OS signals. For a list of ignored signals, see https://www.ibm.com/docs/en/ztpf/2019?topic=signals-used-by-jvm

  • Run JBoss with nohup to prevent it from being killed when the user logs out.

  • Increase available system memory and/or decrease memory usage to avoid triggering the OOM killer:

    • Decrease max heap size (-Xmx)
    • Decrease perm size (-XX:MaxPermSize)
    • Decrease thread stack size (-Xss)
  • Don't make shutdown calls programmatically over HA-JNDI, or at least disable discovery so that unintended nodes are not found and called for shutdown. For example:

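    import java.util.Properties;
    import javax.naming.Context;
    import javax.naming.InitialContext;

    Properties props = new Properties();
    props.put(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
    props.put(Context.URL_PKG_PREFIXES, "jboss.naming:org.jnp.interfaces");
    // Point at one specific server rather than relying on discovery
    props.put(Context.PROVIDER_URL, "server:1099");
    // Disable HA-JNDI discovery so that no other node is found and called
    props.put("jnp.disableDiscovery", "true");
    InitialContext initialContext = new InitialContext(props);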
  • Don't press Ctrl-Z after starting EAP with your own start script.
    • If you do, EAP shuts down unexpectedly after the terminal times out.
    • Moreover, EAP logging stops suddenly once Ctrl-Z is pressed.
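
The memory-related options listed above (-Xmx, -XX:MaxPermSize, -Xss) can be weighed against each other with a back-of-the-envelope estimate. The following is a minimal sketch only: the class name, method name, and all figures are hypothetical, and the estimate deliberately ignores the code cache, direct buffers, GC structures, and other native overhead, so the real footprint will be higher.

```java
// Rough estimate of JVM memory demand from the sizing options named above.
// Assumption: heap + perm + thread stacks only; native overhead is ignored.
final class JvmFootprintEstimate {

    /** Returns an approximate footprint in megabytes. */
    static long estimateMb(long maxHeapMb, long maxPermMb, int threadCount, long threadStackKb) {
        // -Xss applies to every thread, so stack usage scales with the thread count
        long stacksMb = (threadCount * threadStackKb) / 1024;
        return maxHeapMb + maxPermMb + stacksMb;
    }

    public static void main(String[] args) {
        // Hypothetical server: -Xmx4096m -XX:MaxPermSize=512m -Xss1024k, 500 threads
        long before = estimateMb(4096, 512, 500, 1024);
        // Same server after lowering -Xmx to 3072m and -Xss to 512k
        long after = estimateMb(3072, 512, 500, 512);
        System.out.println("before=" + before + " MB, after=" + after + " MB");
    }
}
```

Comparing such an estimate with the physical memory left over for the JVM gives a first indication of whether the OOM killer is a plausible cause.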

Root Cause

Diagnostic Steps

  • Check the JBoss server log for the following:

        INFO  [org.jboss.system.server.Server] Runtime shutdown hook called, forceHalt: true
    
  • This is the most common type of unexpected shutdown. It means that JBoss has received an explicit request to shut itself down. It is not possible to tell where the shutdown request originated; however, the most common explanations are:

    • Accidental: Someone is shutting down JBoss without realizing they're doing it on the production system.
    • Logout: Someone is logging out of a console that they may not realize is hosting the JBoss process.
    • Odd interaction when running as a service: On some operating systems and service methods, the shutdown hook can be called in unintuitive ways.
  • Check for the following log message at the beginning of the shutdown:

        [org.jboss.system.server.Server] - Server exit(0) called
    
    • This is an indication that JBoss was stopped by a shutdown command from the JMX console accessed over the http port. The following message logged by JBossWeb http/ajp request threads from the ServerImpl class would clearly indicate that someone accessed the jmx-console and invoked the shutdown operation:

       INFO  [org.jboss.bootstrap.microcontainer.ServerImpl] (http-127.0.0.1-8080-3) Shutting down the server, blockingShutdown: false
      

      You can use access logging (enabled in your server.xml) to clarify who sent the JMX console shutdown request.

  • The following message indicates that JBoss was stopped by a shutdown command issued over RMI (through the shutdown.sh or twiddle.sh scripts):

          2013-11-11 14:07:00,573 INFO  [org.jboss.bootstrap.microcontainer.ServerImpl] (RMI TCP Connection(2)-127.0.0.2) Shutting down the server, blockingShutdown: false
    
    • The IP in the thread name is the client IP from which the shutdown request was received.
    • If you see web or RMI shutdowns coming from unknown or improper IPs, consider blocking them with a firewall or changing the admin credentials required for shutdown (defined in `conf/props/jmx-console-roles.properties` and `jmx-console-users.properties` by default). Also consider whether the shutdown is invoked via `ha-jndi`; if so, disable discovery to ensure nodes are not improperly found and targeted for shutdown calls.
    
  • If it is suspected the shutdown is happening on logout, test adding the -Xrs JVM option to cause the Sun JVM to ignore SIGHUP.

  • If starting JBoss via the run.sh script, test running with nohup to prevent the process from being terminated when an SSH session or X window is closed or times out.

  • If running JBoss through a Tanuki service wrapper, check the wrapper.log for the following:

        JVM appears hung: Timed out waiting for signal from JVM.
    
    • This indicates that the wrapper ping timed out, so the wrapper shut JBoss down automatically and restarted it.
  • If running JBoss on RHEL, check /var/log/messages from around the time of JBoss's death for messages like the following:

    kernel: Out of memory: Kill process 8916 (java) score 139 or sacrifice child
    kernel: Killed process 8916, UID 4051, (java)

    • This indicates that available system memory became too low, which triggered RHEL's OOM killer. The OOM killer generally targets the largest process in order to free the most memory, so when JBoss is running and the OOM killer is triggered, JBoss is commonly its victim.
  • Check whether any Java source code in the application calls System.exit() or Shutdown.exit() to shut down intentionally.
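
The log signatures described in the steps above can be checked mechanically when there are many log files to review. The following is a minimal sketch, not a supported tool: the class name, method name, and result labels are hypothetical, and the patterns are derived only from the sample lines quoted in this article, so other JBoss versions may log differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Classifies a single log line against the shutdown signatures quoted above.
final class ShutdownLogClassifier {

    // RMI thread name, e.g. "(RMI TCP Connection(2)-127.0.0.2)"; group 1 is the client IP
    private static final Pattern RMI = Pattern.compile(
            "\\(RMI TCP Connection\\(\\d+\\)-([^)]+)\\) Shutting down the server");

    // JBossWeb request thread name, e.g. "(http-127.0.0.1-8080-3)"
    private static final Pattern WEB = Pattern.compile(
            "\\((?:http|ajp)-[^)]+\\) Shutting down the server");

    static String classify(String line) {
        Matcher rmi = RMI.matcher(line);
        if (rmi.find()) {
            return "RMI shutdown request from " + rmi.group(1);
        }
        if (WEB.matcher(line).find()) {
            return "JMX console shutdown over HTTP/AJP";
        }
        if (line.contains("Server exit(0) called")) {
            return "Shutdown command (exit(0))";
        }
        if (line.contains("Runtime shutdown hook called")) {
            return "Explicit shutdown request, origin unknown";
        }
        if (line.contains("JVM appears hung")) {
            return "Service wrapper ping timeout";
        }
        if (line.contains("Out of memory: Kill process")) {
            return "Kernel OOM killer";
        }
        return "unclassified";
    }

    public static void main(String[] args) {
        // Sample line taken from this article
        System.out.println(classify(
                "INFO  [org.jboss.system.server.Server] Runtime shutdown hook called, forceHalt: true"));
    }
}
```

The more specific thread-name patterns are tested first, because a single shutdown can produce several of these messages.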

Other Debugging Options

DEPLOYMENTS MISSING DEPENDENCIES:
  Deployment "example:service=LoggingSignalHandler" is missing the following dependencies:
    Dependency "jboss:name=TomcatConnector,type=Barrier" (should be in state "Start", but is actually in state "Create")
  • If you scroll a few lines further down you should see a message that says "Installing LoggingSignalHandler..." which will confirm that the LoggingSignalHandler is working properly.

  • If it looks like some code is calling System.exit(), test the attached exit-permission-revoker.sar, which will prevent it from being called. NOTE this has only been tested on EAP 4.3, and has not been reviewed for security implications. It will also not work if a security manager is already being used.

  • Add the following to JAVA_OPTS "-Djava.security.manager -Djava.security.policy==<full path>/exit-permission-revoker.policy". Having no value on the first, and a double equals on the second, is not a typo, and the full path must be given.

  • If SystemTap is available on the OS, use the attached parenttrace.stp to find out what is sending a signal to JBoss. See "What is SystemTap and how do I use it?". Run the script as root. When the process is terminated, the relevant information will be displayed.

  • Configure the audit log to track who or what is sending a signal to the process.

  • Using Byteman

    1. Create an "exit.btm" file:

      RULE system.exit
      CLASS java.lang.System
      METHOD exit
      AT ENTRY
      IF TRUE
      DO traceStack()
      ENDRULE
      
      RULE shutdown.exit
      CLASS java.lang.Shutdown
      METHOD exit
      AT ENTRY
      IF TRUE
      DO traceStack()
      ENDRULE
      
    2. Download Byteman from http://www.jboss.org/byteman/downloads

    3. Add the following to standalone.conf/domain.conf, setting BYTEMAN_HOME to the directory where you decompressed Byteman and using the correct path for the "script:" part:

      BYTEMAN_HOME=/path/to/byteman-dir
      JAVA_OPTS="$JAVA_OPTS -javaagent:$BYTEMAN_HOME/lib/byteman.jar=script:/path/to/exit.btm,boot:$BYTEMAN_HOME/lib/byteman.jar,prop:org.jboss.byteman.transform.all=true"
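
Where attaching the Byteman agent is not practical, a similar hint can sometimes be obtained from plain Java. The sketch below is not part of the attached tooling and is an assumption of this example: it registers a shutdown hook that prints the stack of any thread currently inside System.exit() or Runtime.exit(), relying on the fact that the calling thread waits in exit() while shutdown hooks run. The class and method names are hypothetical. It shows nothing for Runtime.halt() or for a kill by signal, and Thread.getAllStackTraces() may be refused when a security manager is active.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Shutdown hook that reports which thread called exit(); a sketch, not a supported tool.
final class ExitCallerHook {

    /** True if the stack contains a java.lang.System.exit or java.lang.Runtime.exit frame. */
    static boolean calledExit(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            String cls = frame.getClassName();
            boolean exitClass = cls.equals("java.lang.System") || cls.equals("java.lang.Runtime");
            if (exitClass && frame.getMethodName().equals("exit")) {
                return true;
            }
        }
        return false;
    }

    /** Names of the threads whose stacks show an exit() call. */
    static List<String> findExitCallers(Map<Thread, StackTraceElement[]> stacks) {
        List<String> names = new ArrayList<String>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
            if (calledExit(entry.getValue())) {
                names.add(entry.getKey().getName());
            }
        }
        return names;
    }

    /** Registers the hook; call once during application startup. */
    static void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
                for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
                    if (!calledExit(entry.getValue())) {
                        continue;
                    }
                    System.err.println("exit() called by thread " + entry.getKey().getName());
                    for (StackTraceElement frame : entry.getValue()) {
                        System.err.println("    at " + frame);
                    }
                }
            }
        }, "exit-caller-hook"));
    }
}
```

Calling ExitCallerHook.install() from a startup class of the deployed application would be one way to enable it; the output goes to standard error rather than to the JBoss logging system, which may already be shutting down at that point.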
      

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.