Java application unresponsive

Environment

  • Java
  • Red Hat JBoss Enterprise Application Platform (EAP)
    • 8.x
    • 7.x
    • 6.x
    • 5.x
    • 4.x
  • Red Hat JBoss Fuse
    • 6.0
  • Tomcat
    • 6.0
    • 5.5

Issue

  • JBoss freezes and brings down our production environment.
  • JBoss unresponsive.
  • JBoss crashing.
  • JBoss hangs.
  • JBoss CPU usage 100%.
  • GC thrashing.
  • High load.
  • High CPU.
  • CPU 100% utilization.
  • Slow performance.
  • Serious performance issues with API calls.
  • Hung JBoss threads.
  • JBoss goes to hung state.
  • Server stuck.
  • Load on server.
  • Slow response from application.
  • Throughput is quite low (about 30-40 MBps) when the application runs in JBoss domain mode.
  • Applications deployed on EAP become slow; restarting the service restores normal performance, but after a while the application slows down again.

Resolution

  • Rewrite application code to use less memory.
  • Resolve delays upstream or in external systems.
  • Remove instrumentation.
  • Decrease memory pressure to avoid swapping.
  • Ensure the -XX:+TieredCompilation JVM option is not used.
  • See Resolutions for the documents listed in the Root Cause section.
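
Whether a running JVM was started with -XX:+TieredCompilation can be checked with jinfo when a JDK is available. A minimal sketch, assuming a hypothetical pid of 12345:

```shell
JVM_PID=12345   # hypothetical pid of the JVM to inspect
if command -v jinfo >/dev/null 2>&1; then
  # jinfo -flag prints the current value of a single JVM flag
  RESULT=$(jinfo -flag TieredCompilation "$JVM_PID" 2>/dev/null \
             || echo "could not query pid $JVM_PID")
else
  RESULT="jinfo not available; inspect arguments with: ps -o args= -p $JVM_PID"
fi
echo "$RESULT"
```

When no JDK tooling is installed, the process arguments reported by ps show the flag as well.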

Root Cause

Diagnostic Steps

  • Review application logging at the time of the issue.

  • Determine if there is anything upstream of the application that could be the main cause (e.g. JBoss fronted by Apache and mod_jk). Test if the issue goes away when accessing the application directly (e.g. JBoss on port 8080).

  • Determine if the application is waiting on an external system or data store to become available.
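
Thread dumps usually make waits on external systems visible: threads blocked on a remote call typically sit in a native socket read. A minimal sketch using a fabricated thread-dump fragment (the thread name and stack frames are illustrative only):

```shell
# Sample thread-dump fragment for illustration (not from a real system)
cat > sample_dump.txt <<'EOF'
"ajp-127.0.0.1-8009-5" daemon prio=10 tid=0x0a1b2c3d runnable
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
EOF
# Count threads blocked in a native socket read (often a database or remote call)
grep -c 'socketRead0' sample_dump.txt
```

A large, stable count of such threads across several dumps points at the external system rather than the application itself.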

  • Verify that the JVM is really unresponsive and did not in fact crash. Check that the Java process is still alive (e.g. using jps -lv, ps, etc.). Sometimes the term "unresponsive" is used when JBoss is actually down because the JVM crashed. See Java application down due to JVM crash.

  • Verify that the Java process is still running (R or S state) with the ps aux command. Note that some tools stop the target process: for example, jstack -F <pid> puts the target Java process into the trace-stopped (T) state.
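
The state check can be sketched as follows; the current shell's pid is used purely for illustration, and a real check would substitute the JVM's pid:

```shell
# Check the process state reported by ps. R (running) and S (sleeping) are
# normal; T means the process is stopped, e.g. while jstack -F is attached.
JVM_PID=$$   # illustration only: the current shell; substitute the JVM's pid
STATE=$(ps -o state= -p "$JVM_PID" | tr -d ' ' | cut -c1)
case "$STATE" in
  R|S) echo "process $JVM_PID is running (state $STATE)" ;;
  T)   echo "process $JVM_PID is stopped (state $STATE)" ;;
  *)   echo "process $JVM_PID is in state $STATE" ;;
esac
```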

  • Check the JVM options to see if any instrumenting agents are being specified with the -javaagent option. For example:

          -javaagent:/opt/jboss/wily/Agent.jar
    

Agents can add overhead and cause issues. Test with the agent removed, if possible.
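
A quick way to spot an agent is to grep the JVM's arguments for -javaagent. The command line below is a hypothetical example; against a live process, the arguments can be obtained with ps -o args= -p <pid>:

```shell
# Hypothetical JVM argument string used for illustration
JVM_ARGS='-Xmx4g -javaagent:/opt/jboss/wily/Agent.jar com.example.Main'
# '--' stops option parsing so the leading '-' in the pattern is not an option
if printf '%s\n' "$JVM_ARGS" | grep -q -- '-javaagent:'; then
  echo "instrumentation agent detected"
else
  echo "no -javaagent option found"
fi
```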

  • Check the JVM options to see if any debugging instrumentation is enabled. For example:

          -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n
    

Debugging adds a lot of overhead and should not be enabled in production. Test with debugging removed.

  • Determine if the bottleneck is the result of garbage collection activity. Enable garbage collection logging, and analyze it for the period when the issue happens.
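
As a sketch, GC logging can be enabled with flags along these lines (the log path is an example, and the exact flags depend on the JDK version):

```shell
# JDK 8 and earlier: classic GC logging flags
GC_OPTS='-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/jboss/gc.log'
# JDK 9 and later use unified logging instead, e.g.:
#   -Xlog:gc*:file=/var/log/jboss/gc.log:time,uptime
echo "JAVA_OPTS would include: $GC_OPTS"
```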

  • Determine if there is high CPU usage when the issue happens. If CPU is high, or this is unknown, see Java application high CPU. By "high CPU" we mean sustained usage above 80%.

  • When the issue happens, get a series of thread dumps.
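
Collecting the series can be scripted roughly as below; the pid, count, and interval are placeholders, and in practice dumps are typically taken about 10 seconds apart:

```shell
JVM_PID=99999   # placeholder pid; substitute the real JVM pid
COUNT=5         # number of dumps; 10 or more is common in practice
INTERVAL=1      # seconds between dumps; ~10s is typical in practice
i=1
while [ "$i" -le "$COUNT" ]; do
  if command -v jstack >/dev/null 2>&1; then
    # Write each dump to its own file; errors (e.g. no such pid) are ignored here
    jstack "$JVM_PID" > "threaddump.$i.txt" 2>/dev/null || true
  else
    echo "jstack not found; run: kill -3 $JVM_PID (dump goes to the JVM's stdout)"
  fi
  echo "dump $i requested"
  [ "$i" -lt "$COUNT" ] && sleep "$INTERVAL"
  i=$((i + 1))
done
```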

  • Analyze the thread dumps (How do I analyze a Java thread dump or javacore?):

    • Are there any monitors without a locking thread? This indicates the monitor is locked in native code, typically due to GC activity.
    • Is there monitor contention or deadlocks?
  • On JBoss/Tomcat, determine if the http/ajp thread pools used are exhausted.

    • Check for messages like the following in the JBoss log, indicating a maxed-out thread pool:

                 INFO  [JIoEndpoint] Maximum number of threads (10) created for connector with address /127.0.0.1 and port 8080
      
  • Check the number of currently busy threads in the http/ajp pool being used. Log in to the jmx-console and look for a link like the following for the connector in question:

          name=ajp-127.0.0.1-8009,type=ThreadPool
    

Follow the instructions in How to Access JBossWeb mbeans and their Related statistics in JBoss EAP6? and check the currentThreadsBusy attribute.

How many threads from this connector appear in the dumps? Does the count reach the maxThreads value?

  • If it does, either more threads are needed for the load, or bugs (deadlocks, lock contention, etc.) are stalling the connector threads and exhausting the pool.
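
One rough way to gauge pool usage is to count connector threads by name prefix in a thread dump. The fragment below uses a fabricated sample dump; the prefix would match the ThreadPool name seen in the jmx-console:

```shell
# Fabricated thread-dump fragment for illustration
cat > dump.txt <<'EOF'
"ajp-127.0.0.1-8009-1" daemon prio=10 tid=0x01 waiting on condition
"ajp-127.0.0.1-8009-2" daemon prio=10 tid=0x02 runnable
"http-127.0.0.1-8080-1" daemon prio=10 tid=0x03 runnable
EOF
# Count threads belonging to the ajp connector pool
echo "ajp threads: $(grep -c '^"ajp-127.0.0.1-8009-' dump.txt)"
```

Comparing that count against maxThreads across several dumps shows whether the pool is pinned at its limit.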

  • When does the unresponsiveness happen? After a period of time, after a particular use case, or does it appear random?

  • Has the application been in production for a long time, with this issue only appearing now, or was it deployed to production fairly recently?

  • On JBoss Enterprise Application Platform (EAP) 5, check whether symbolic links are used; a known Virtual File System (VFS) bug causes large disk usage when symlinks are used. See Use of symbolic links leads to large disk usage in JBoss EAP.

  • Test with logging set to ERROR to rule out logging issues.

  • If the issue results in slow http responses, log request time on the front end and back end to "see" the delay or to narrow it down to a layer. See How do I enable friendly access log times in Apache and JBoss?.
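
On the Apache front end, request duration can be added to the access log with the %D directive (microseconds); JBoss Web/Tomcat's AccessLogValve supports a %D pattern as well (milliseconds). A sketch of an httpd configuration fragment, with example format and file names:

```
# %D appends the time taken to serve the request, in microseconds
LogFormat "%h %l %u %t \"%r\" %>s %b %D" timed
CustomLog logs/access_log timed
```

Comparing the front-end and back-end times for the same slow request shows which layer is adding the delay.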
