JBoss shutdown stalls on XARecoveryModule.waitForScanState

Solution Unverified - Updated

Environment

  • JBoss Enterprise Application Platform (EAP) 6.3.3 and later

Issue

  • JBoss shutdown operations hang in XARecoveryModule.waitForScanState, as shown in the following thread dump:
"ServerService Thread Pool -- 15" prio=10 tid=0x00007fbc84096000 nid=0x3ee in Object.wait() [0x00007fbde4372000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:503)
	at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.waitForScanState(XARecoveryModule.java:935)
	at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.removeXAResourceRecoveryHelper(XARecoveryModule.java:92)
	- locked <0x000000078cbc3b28> (a java.util.concurrent.atomic.AtomicInteger)
	at com.arjuna.ats.jbossatx.jta.RecoveryManagerService.removeXAResourceRecovery(RecoveryManagerService.java:117)
	at org.hornetq.jms.server.recovery.HornetQRegistryBase.stop(HornetQRegistryBase.java:59)
	at org.hornetq.ra.recovery.RecoveryManager.stop(RecoveryManager.java:90)
	at org.hornetq.ra.HornetQResourceAdapter.stop(HornetQResourceAdapter.java:298)
	at org.jboss.as.connector.services.resourceadapters.deployment.AbstractResourceAdapterDeploymentService.unregisterAll(AbstractResourceAdapterDeploymentService.java:188)
	at org.jboss.as.connector.services.resourceadapters.ResourceAdapterActivatorService.unregisterAll(ResourceAdapterActivatorService.java:129)
	at org.jboss.as.connector.services.resourceadapters.deployment.AbstractResourceAdapterDeploymentService$2.run(AbstractResourceAdapterDeploymentService.java:299)
	- locked <0x0000000770f5ce88> (a org.jboss.as.connector.services.resourceadapters.deployment.AbstractResourceAdapterDeploymentService$2)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
	at org.jboss.threads.JBossThread.run(JBossThread.java:122)

Resolution

  • Address any fatal exception thrown in the Periodic Recovery thread (see Diagnostic Steps), so that the thread can complete its first work pass.

Root Cause

  • The hang is caused by a wait for the scan state to transition from "first pass" to "between passes". The "Periodic Recovery" thread should set the "between passes" state after it completes its first work pass. If a fatal exception was previously thrown in the Periodic Recovery thread's periodicWorkFirstPass, the scan state is left in "first pass" and the shutdown thread waits indefinitely.
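
The mechanism described above can be illustrated with a small standalone sketch. This is not the Narayana/XARecoveryModule source: the class ScanStateDemo, its enum, and its method names are hypothetical, and the waiter uses a timeout (which the real waitForScanState frame in the thread dump does not appear to use) so that the demo terminates. It assumes Java 8 or later.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of JBoss or Narayana.
class ScanStateDemo {
    enum ScanState { IDLE, FIRST_PASS, BETWEEN_PASSES }

    private final Object lock = new Object();
    private ScanState state = ScanState.IDLE;

    // Simulates the first work pass. When fail is true, an Error escapes
    // before the state is advanced, so the state stays FIRST_PASS.
    void firstPass(boolean fail) {
        synchronized (lock) {
            state = ScanState.FIRST_PASS;
            lock.notifyAll();
        }
        if (fail) {
            throw new NoClassDefFoundError("hypothetical/MissingClass");
        }
        synchronized (lock) {
            state = ScanState.BETWEEN_PASSES;
            lock.notifyAll();
        }
    }

    // Waits until the state leaves FIRST_PASS or the timeout expires.
    // Returns true if the waiter was released, false if it timed out.
    boolean waitWhileFirstPass(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (state == ScanState.FIRST_PASS) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    return false;
                }
                lock.wait(remainingMs);
            }
            return true;
        }
    }

    // Runs one simulated first pass on a worker thread, then checks
    // whether a waiter (standing in for the shutdown thread) is released.
    static boolean run(boolean fail) throws InterruptedException {
        ScanStateDemo demo = new ScanStateDemo();
        Thread worker = new Thread(() -> demo.firstPass(fail), "Periodic Recovery (demo)");
        // Swallow the simulated Error so the demo output stays clean.
        worker.setUncaughtExceptionHandler((t, e) -> { });
        worker.start();
        worker.join();
        return demo.waitWhileFirstPass(200);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("healthy first pass, waiter released: " + run(false));
        System.out.println("failed first pass, waiter released: " + run(true));
    }
}
```

In the failing case the worker thread has already died, so nothing remains that could advance the state; without the timeout the waiter would block forever, which matches the stalled shutdown thread shown under Issue.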

Diagnostic Steps

  • Capture thread dumps and check for a dead or absent Periodic Recovery thread.
  • Reproduce with the following debug logging to better follow the Periodic Recovery thread's activity and the scan states:
            <logger category="com.arjuna.ats.jta">
                <level name="DEBUG"/>
            </logger>
            <logger category="com.arjuna.ats.arjuna">
                <level name="DEBUG"/>
            </logger>
  • Check the server log for fatal exceptions logged by the Periodic Recovery thread during its periodicWorkFirstPass, for instance a NoClassDefFoundError (caused by a ClassNotFoundException) when loading IBM MQ resource adapter classes:
13:33:55,743 ERROR [stderr] (Periodic Recovery) Exception in thread "Periodic Recovery" java.lang.NoClassDefFoundError: com/ibm/mq/connector/RecoveryXAResource
13:33:55,743 ERROR [stderr] (Periodic Recovery) 	at com.ibm.mq.connector.ResourceAdapterImpl.getXAResources(ResourceAdapterImpl.java:572)
13:33:55,743 ERROR [stderr] (Periodic Recovery) 	at org.jboss.jca.core.tx.jbossts.XAResourceRecoveryInflowImpl.getXAResources(XAResourceRecoveryInflowImpl.java:96)
13:33:55,743 ERROR [stderr] (Periodic Recovery) 	at com.arjuna.ats.internal.jbossatx.jta.XAResourceRecoveryHelperWrapper.getXAResources(XAResourceRecoveryHelperWrapper.java:51)
13:33:55,744 ERROR [stderr] (Periodic Recovery) 	at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:510)
13:33:55,744 ERROR [stderr] (Periodic Recovery) 	at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:176)
13:33:55,744 ERROR [stderr] (Periodic Recovery) 	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:747)
13:33:55,744 ERROR [stderr] (Periodic Recovery) 	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:375)
13:33:55,744 ERROR [stderr] (Periodic Recovery) Caused by: java.lang.ClassNotFoundException: com.ibm.mq.connector.RecoveryXAResource from [Module "deployment.wmq.jmsra.rar:main" from Service Module Loader]
13:33:55,744 ERROR [stderr] (Periodic Recovery) 	at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:213)
13:33:55,745 ERROR [stderr] (Periodic Recovery) 	at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:459)
13:33:55,745 ERROR [stderr] (Periodic Recovery) 	at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:408)
13:33:55,745 ERROR [stderr] (Periodic Recovery) 	at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:389)
13:33:55,745 ERROR [stderr] (Periodic Recovery) 	at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:134)
13:33:55,745 ERROR [stderr] (Periodic Recovery) 	... 7 more
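
The thread dump and log checks above can be automated with a small helper. The following is a rough sketch only: the class RecoveryHangCheck and its method names are hypothetical, the file paths are supplied by the user, and the matching relies on the thread name "Periodic Recovery" and the frame name waitForScanState exactly as they appear in the excerpts in this article.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper; not a JBoss tool.
class RecoveryHangCheck {
    // True if any line contains the given marker text.
    static boolean anyLineContains(List<String> lines, String marker) {
        for (String line : lines) {
            if (line.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // In a HotSpot thread dump, each thread entry starts with its quoted name.
    static boolean hasPeriodicRecoveryThread(List<String> dump) {
        for (String line : dump) {
            if (line.startsWith("\"Periodic Recovery\"")) {
                return true;
            }
        }
        return false;
    }

    // True if some thread is parked in XARecoveryModule.waitForScanState.
    static boolean hasScanStateWaiter(List<String> dump) {
        return anyLineContains(dump, "XARecoveryModule.waitForScanState");
    }

    // A waiter with no Periodic Recovery thread left to release it.
    static boolean matchesHangPattern(List<String> dump) {
        return hasScanStateWaiter(dump) && !hasPeriodicRecoveryThread(dump);
    }

    // True if the log records an uncaught exception in the Periodic Recovery thread.
    static boolean logShowsDeadRecoveryThread(List<String> log) {
        return anyLineContains(log, "Exception in thread \"Periodic Recovery\"");
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java RecoveryHangCheck <thread-dump-file> [server-log-file]");
            return;
        }
        List<String> dump = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println("waiter in waitForScanState: " + hasScanStateWaiter(dump));
        System.out.println("Periodic Recovery thread present: " + hasPeriodicRecoveryThread(dump));
        System.out.println("matches hang pattern: " + matchesHangPattern(dump));
        if (args.length > 1) {
            List<String> log = Files.readAllLines(Paths.get(args[1]), StandardCharsets.UTF_8);
            System.out.println("uncaught exception logged: " + logShowsDeadRecoveryThread(log));
        }
    }
}
```

A match is only a hint: a Periodic Recovery thread that is still alive but blocked elsewhere would not be flagged by this check, so the stack traces should still be read by hand.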

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.