High CPU due to rapid looping in TransactionReaper.check

Solution Verified - Updated

Environment

  • JBoss Enterprise Application Platform (EAP)
    • 5.1.0 and earlier

Issue

  • We're seeing increased CPU usage on our JBoss server, leading to performance degradation and unresponsiveness.

  • Thread dumps and CPU data indicate that the high CPU consumer is a Timer ReaperThread, apparently stuck in TransactionReaper.check().

  • We are seeing our log fill up rapidly with messages like the following:

      DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] (Thread-16) TransactionReaper::check ()
      DEBUG [com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-16) [com.arjuna.ats.arjuna.coordinator.TransactionReaper_2] - TransactionReaper::check - comparing 1323173300843
    

Resolution

  • To resolve this bug, apply the fix in one of the following ways:
    • Upgrade to EAP 5.1.1, 5.1.2, or 5.2.0; the transactions code base in these versions already appears to include the fix. If you upgrade, 5.2.0 is preferable because it is the latest of these versions.
    • Apply the JBPAPP-5193 one-off for EAP 5.1.0. This patch is a backport of the JBTM-794 fix.
    • Apply the JBPAPP-5834 one-off for EAP 5.1.0. This patch is a backport of the JBTM-794 fix bundled along with several other JBTM fixes. Here are the other fixes provided by this one-off:
      • JBTM-770 - incorrect cleanup registration causes memory leak
      • JBTM-804 - BA participant soapFault method log messages saying cancelling instead of compensating
      • JBTM-818 - Order in which JBoss nodes are restarted affects the outcome of XA transactions recovery
      • JBTM-821 - Cope With Zero Length Transaction Log Files
      • JBTM-823 - Xid recovery scans assume same list ordering
      • JBTM-824 - suppress orb shutdown warning

Root Cause

  • The ReaperThread calls the TransactionReaper.checkingPeriod() method to determine how long it should pause before invoking the TransactionReaper.check() method. Under certain circumstances, checkingPeriod() returns a negative period for a sustained time, so the thread executes continuously without pausing. This occurs when the check() method is invoked while there are existing transactions in the ReaperElementManager.
  • JBTM-794 (issues.jboss.org)
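
The looping behaviour described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the JBoss Transactions source: the class ReaperLoopSketch, the two-argument checkingPeriod() signature, pauseFor(), simulate(), and the minimum-pause guard are all hypothetical names introduced for illustration, and the guard is not claimed to be the actual JBTM-794 change. Time is simulated with plain counters so the example runs instantly.

```java
final class ReaperLoopSketch {

    /** Hypothetical stand-in for checkingPeriod(): time left until the earliest timeout. */
    static long checkingPeriod(long earliestTimeoutMillis, long nowMillis) {
        // Negative whenever the earliest queued timeout is already in the past.
        return earliestTimeoutMillis - nowMillis;
    }

    /** Hypothetical guard: never pause for less than minPauseMillis (0 means no guard). */
    static long pauseFor(long periodMillis, long minPauseMillis) {
        return Math.max(periodMillis, minPauseMillis);
    }

    /**
     * Simulates the reaper loop for a fixed number of passes and returns the
     * total simulated time spent pausing. The overdue element is assumed to
     * stay queued, so the period stays negative on every pass.
     */
    static long simulate(long earliestTimeoutMillis, long startMillis,
                         int iterations, long minPauseMillis) {
        long now = startMillis;
        long totalPaused = 0;
        for (int i = 0; i < iterations; i++) {
            long period = checkingPeriod(earliestTimeoutMillis, now);
            long pause = pauseFor(period, minPauseMillis);
            if (pause > 0) {
                totalPaused += pause;   // a real thread would wait here
                now += pause;
            }
            // check() would run here; with pause <= 0 it runs back to back.
        }
        return totalPaused;
    }

    public static void main(String[] args) {
        long overdueTimeout = 1_000L;   // earliest timeout, already in the past
        long start = 5_000L;            // simulated current time
        System.out.println("unguarded pause total (ms): "
                + simulate(overdueTimeout, start, 1000, 0L));
        System.out.println("guarded pause total (ms):   "
                + simulate(overdueTimeout, start, 1000, 10L));
    }
}
```

With these inputs the unguarded run pauses for 0 ms across all 1000 passes, which is the busy loop seen as high CPU, while the guarded run pauses for 10000 ms in total.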

Diagnostic Steps

  • Troubleshoot with thread dumps and CPU utilization data, as described in "Java application high CPU", to identify high CPU consumers.
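
Where operating system tools are not convenient, per-thread CPU time can also be read from inside the JVM with the standard java.lang.management API. The following is a minimal sketch, not part of the referenced troubleshooting procedure: the class ThreadCpuSketch and the method threadsByCpuTime() are hypothetical names, and the code only sees threads of the JVM it runs in, so it would have to be executed inside the affected server process to be useful there.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ThreadCpuSketch {

    /** Returns (thread name, CPU milliseconds) pairs, highest CPU time first. */
    static List<Map.Entry<String, Long>> threadsByCpuTime() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Map.Entry<String, Long>> rows = new ArrayList<Map.Entry<String, Long>>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows; // CPU timing is unavailable or disabled on this JVM
        }
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id);  // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id);   // null if the thread is no longer alive
            if (cpuNanos < 0 || info == null) {
                continue;
            }
            rows.add(new AbstractMap.SimpleEntry<String, Long>(
                    info.getThreadName(), cpuNanos / 1_000_000L));
        }
        // Sort descending so the heaviest CPU consumers come first.
        rows.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return rows;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> row : threadsByCpuTime()) {
            System.out.println(row.getValue() + " ms  " + row.getKey());
        }
    }
}
```

A thread that stays at the top of this list across repeated samples, and whose stack in a thread dump shows TransactionReaper.check(), would match the symptom described in this article.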
