JMS becomes unresponsive with threads waiting for MessagingPostOffice's ReaderLock

Solution Verified - Updated

Environment

  • Red Hat JBoss Enterprise Application Platform
    • JBoss Messaging
    • JBoss Remoting

Issue

  • Our JBoss messaging server becomes unresponsive.  The thread dumps show threads waiting to acquire MessagingPostOffice's ReaderLock, for example:

        "WorkerThread#18[127.0.0.1:1059]" prio=10 tid=0x000000005d766000 nid=0x1000 in Object.wait() [0x0000000048aea000]
           java.lang.Thread.State: WAITING (on object monitor)
             at java.lang.Object.wait(Native Method)
             - waiting on <0x00000000a4665e50> (a EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock)
             at java.lang.Object.wait(Object.java:485)
             at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)
             - locked <0x00000000a4665e50> (a EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock)
             at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:574)
    

    Other threads can get backed up behind these as well leading to widespread unresponsiveness.

Resolution

  • Identify the ReaderLock owner thread and determine why it is not progressing and releasing the lock.

  • The ReaderLock owner thread could likely be handling a client message delivery through MessagingPostOffice.routeInternal() and hanging on the subsequent socketWrite:

        "WorkerThread#14[127.0.0.1:50175]" prio=10 tid=0x000000005c708000 nid=0x590 runnable [0x0000000047dd9000]
           java.lang.Thread.State: RUNNABLE
             at java.net.SocketOutputStream.socketWrite0(Native Method)
             at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
             at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
             at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
             at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
             - locked <0x00000000c02a38a0> (a java.io.BufferedOutputStream)
             at java.io.DataOutputStream.flush(DataOutputStream.java:106)
             at org.jboss.jms.wireformat.ClientDelivery.write(ClientDelivery.java:93)
             at org.jboss.jms.wireformat.JMSWireFormat.write(JMSWireFormat.java:237)
             at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedWrite(MicroSocketClientInvoker.java:940)
             at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:575)
             at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:422)
             at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:141)
             at org.jboss.remoting.Client.invoke(Client.java:1858)
             at org.jboss.remoting.Client.invoke(Client.java:718)
             at org.jboss.remoting.Client.invokeOneway(Client.java:769)
             at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(ServerInvokerCallbackHandler.java:829)
             at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallbackOneway(ServerInvokerCallbackHandler.java:700)
             at org.jboss.jms.server.endpoint.ServerSessionEndpoint.performDelivery(ServerSessionEndpoint.java:1467)
             at org.jboss.jms.server.endpoint.ServerSessionEndpoint.handleDelivery(ServerSessionEndpoint.java:1379)
             - locked <0x00000000c02fcd98> (a org.jboss.jms.server.endpoint.ServerSessionEndpoint)
             at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.handle(ServerConsumerEndpoint.java:327)
             - locked <0x00000000c034c078> (a java.lang.Object)
             at org.jboss.messaging.core.impl.RoundRobinDistributor.handle(RoundRobinDistributor.java:119)
             at org.jboss.messaging.core.impl.MessagingQueue$DistributorWrapper.handle(MessagingQueue.java:583)
             at org.jboss.messaging.core.impl.ClusterRoundRobinDistributor.handle(ClusterRoundRobinDistributor.java:79)
             at org.jboss.messaging.core.impl.ChannelSupport.deliverInternal(ChannelSupport.java:670)
             at org.jboss.messaging.core.impl.MessagingQueue.deliverInternal(MessagingQueue.java:506)
             at org.jboss.messaging.core.impl.ChannelSupport.handle(ChannelSupport.java:251)
             - locked <0x00000000c034c370> (a java.lang.Object)
             at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:2228)
             at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.route(MessagingPostOffice.java:515)
             at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.sendMessage(ServerConnectionEndpoint.java:777)
    

    If this is the case, it's likely an issue with the socket write timeout.  See JMS thread hangs on socketWrite during client message delivery for more information.

Root Cause

  • Some thread has acquired the MessagingPostOffice's ReaderLock and is now stalled up, preventing the progression of other threads.

Diagnostic Steps

  • Troubleshoot using thread dumps and additional steps mentioned in Java application unresponsive

  • Identify the thread holding the ReaderLock.  It's not exposed directly in the thread dumps but it can be determined.  MessagingPostOffice's lock is private, so it must be acquired within the MessagingPostOffice, meaning we can direct our investigation here.  The following MessagingPostOffice methods acquire this lock:

    • getQueuesForCondition
    • getBindingForQueueName
    • getBindingForChannelID
    • getState
    • handleAckAllReplicatedDeliveries
    • handleAddAllReplicatedDeliveries
    • waitForBindUnbind
    • getBindings
    • routeInternal

    Search the thread dump for threads in the middle of these methods.  This would indicate that they likely hold the ReaderLock until they complete that method invocation.

  • Once the ReaderLock owner thread has been identified, investigate further to determine why it is not progressing and releasing the lock.

  • If you need to determine what the problem hanging thread is writting to, turn trace logging on for org.jboss.messaging.core.impl.postoffice.MessagingPostOffice and ensure %t is in your log4j appender's ConversionPattern.  The %t option is critical as that logs the thread of the message, allowing you to correlate messages with the hanging thread from the thread dumps.  When the thread hangs, you should see a log message saying "Routing ref to queue" indicating which queue it was trying to write to, but no corresponding "Queue returned" message after it that indicates that it completed.

Components

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.