Transaction abort and timeout messages in JBoss EAP

Solution Verified - Updated

Environment

  • Red Hat JBoss Enterprise Application Platform (EAP)
    • 4
    • 5
    • 6
    • 7

Issue

The following warnings are reported during transaction processing:

  • EAP 6/7:
... WARN ... ARJUNA012117: TransactionReaper::check processing TX <xid> in state  RUN
... WARN ... ARJUNA012095: Abort of action id <xid> invoked while multiple threads active within it.
... WARN ... ARJUNA012108: CheckedAction::check - atomic action <xid> aborting with 1 threads active!
... WARN ... ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX <xid>

NOTE: In EAP versions before 7.4.0, ARJUNA012117 message will be output as TransactionReaper::check timeout for TX <xid> in state RUN.

  • EAP 5:
WARN  ... - TransactionReaper::check timeout for TX <xid> in state  RUN
WARN  ... - Abort of action id <xid> invoked while multiple threads active within it.
WARN  ... - CheckedAction::check - atomic action <xid> aborting with 1 threads active!
WARN  ... - TransactionReaper::doCancellations worker Thread[Thread-10,5,jboss] successfully canceled TX <xid>
  • EAP 4:
WARN  ... - Abort of action id <xid> invoked while multiple threads active within it.
WARN  ... - CheckedAction::check - atomic action <xid> aborting with 1 threads active!

Note: <xid> in the above represents the transaction id which will have the form of something like 7f000001:ece6:52ce0808:71 or 0:ffff7f000001:-1dde6ae2:52ccdf8c:b

  • Following or nearly concurrent with the WARN messages, application code may encounter exceptions similar to one or more of the following:
    • Could not enlist in transaction on entering meta-aware object!
    • java.lang.IllegalStateException: Transaction TransactionImple < ... status: ActionStatus.ABORTED > is not active STATUS_ROLLEDBACK
    • See also Unable to enlist resource ...

Resolution

These messages indicate that a transaction is timing out. The WARN is issued by an asynchronous JBossTS transaction "reaper" thread that aborts the timed-out transaction. There are a few ways to address this issue:

  • If a transaction is not required for a particular workflow, avoid starting a transaction
    • Container managed transaction (CMT) methods may be marked with @TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
      • Note that if a transaction is already in progress when such a method is called, suspending it will not stop the "clock" for the transaction and if not completed before the configured timeout period, it will fail.
  • Follow the JBoss EAP documentation examples for datasource configuration, being sure to include fault tolerance
  • Long running transactions are an anti-pattern and should be addressed by identifying and addressing the cause of long runtimes
    • Ensure that suitable memory, network bandwidth, etc. are available for the system load and traffic
    • Perform database tuning per vendor recommendations (apply indexes, address networking issues, etc.)
    • Address database lock contention
    • Refactor/optimize transaction logic to reduce work performed and runtime.
  • For transactions which cannot be optimized, increase the per-transaction timeout. This option should be considered preferable to changing the global configuration.

Diagnostic Steps

  • Analyze CPU, memory, disk usage, etc. for the application server and database
  • Analyze network traffic and bandwidth (e.g. between the application server and database)
  • Involve the DBA to investigate general database performance
  • Use runtime profiling tools to evaluate application server performance and identify costly application logic
  • Collect thread dumps before and after the timeout warnings and review the callstack for the thread(s) owning the transaction.
  • Enable transaction trace to identify and analyze long running transactions

Analyzing Transaction Logging

In EAP 7 (and later), ARJUNA012381 WARN messages at the point of reaper-initiated transaction abort (added as part of Content from issues.jboss.org is not included.JBTM-2476) may supply a thread stack for the application thread engaged in the transaction at the point the transaction was interrupted by the transaction reaper. This may provide insight into the cause of the long running transaction (e.g. blocking on external or internal resources, long running queries, etc.).

If enabled, DEBUG logging can provide insight into transaction history as discussed below.

Note: the following log snippets were taken from EAP 4.3, log statements from current EAP releases will vary.

Transaction is begun at:

DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] BasicAction::Begin() for action-id <xid>

Immediately after that, the "reaper" inserts it into its list. The reaper is a class to record transactions with non-zero timeout values and terminate them once their timeout lapses.

DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] TransactionReaper::insert ( BasicAction: <xid> status: ActionStatus.RUNNING, 20 )

The reaper check happens at:

DEBUG [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.TransactionReaper_2] - TransactionReaper::check - comparing <absoluteTime>

Just over X seconds later (the number <absoluteTime> is the absolute value of the timeout (e.g. 1246501057055 would be 2009-06-02 21:17:37,055)).

Immediately after which it terminates the transaction:

DEBUG [com.arjuna.ats.arjuna.logging.arjLogger] BasicAction::Abort() for action-id <xid>
WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.BasicAction_58] - Abort of action id <xid> invoked while multiple threads active within it.
WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action <xid> aborting with 1 threads active!

Note: In case that timeout happens while a transaction is suspended (TransactionManager.suspend()), the transaction will be rolled back after resume but those two WARN messages won't be logged.

An application developer will commonly want to identify the particular thread that was running the transaction which timed out in order to debug the actual cause of the timeout. To do this, ensure that the Log4J appender writing to the server's log is using "%t" in its ConversionPattern so that the thread name is recorded with each log entry. Once the transaction times out and the corresponding XID is identified that XID can be used to search back up through the log to see which thread started the transaction. It might also be useful to take regular thread dumps (see above) during this process which can be correlated with the thread name in the logs to get even more detail about what the thread was doing when the timeout occurred.

Components
Category

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.