The JGroups Flow Control (FC) Protocol and Deadlocks with Web Session Replication in JBoss
Environment
- JBoss Enterprise Application Platform (EAP)
- 4.2
- 4.3
Issue
- Most or all JBoss http, https, or ajp threads appear to be stuck or blocked in the JGroups FC protocol:
"http-localhost-8080-19" daemon prio=1 tid=0x0000000043f460f0 nid=0x7d76 in Object.wait() [0x000000004aabd000..0x000000004aabec10]
    at java.lang.Object.wait(Native Method)
    at EDU.oswego.cs.dl.util.concurrent.CondVar.timedwait(CondVar.java:222)
    - locked <0x00002b5feef4a068> (a EDU.oswego.cs.dl.util.concurrent.CondVar)
    at org.jgroups.protocols.FC.handleDownMessage(FC.java:454)
    at org.jgroups.protocols.FC.down(FC.java:374)
    at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:499)
    at org.jgroups.protocols.FC.receiveDownEvent(FC.java:368)
    at org.jgroups.stack.Protocol.passDown(Protocol.java:533)
    at org.jgroups.protocols.FRAG2.down(FRAG2.java:167)
    at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:499)
    at org.jgroups.stack.Protocol.passDown(Protocol.java:533)
    at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:294)
    at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:499)
    at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:390)
    at org.jgroups.JChannel.down(JChannel.java:1230)
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:792)
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passDown(MessageDispatcher.java:769)
    at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:304)
    at org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:444)
    at org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:193)
    at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:432)
    at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:192)
    at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.jboss.cache.TreeCache.callRemoteMethodsViaReflection(TreeCache.java:4484)
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4433)
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4386)
    at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4504)
    at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:110)
    at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:88)
    at org.jboss.cache.interceptors.ReplicationInterceptor.handleReplicatedMethod(ReplicationInterceptor.java:119)
    at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:88)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
    at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:379)
    at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:174)
    at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
    at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:167)
    at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5934)
    at org.jboss.cache.TreeCache.put(TreeCache.java:3788)
    at sun.reflect.GeneratedMethodAccessor167.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
    at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
    at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
    at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
    at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
    at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
    at $Proxy175.put(Unknown Source)
    at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:138)
    at org.jboss.web.tomcat.service.session.JBossCacheService.putSession(JBossCacheService.java:325)
    at org.jboss.web.tomcat.service.session.JBossCacheClusteredSession.processSessionRepl(JBossCacheClusteredSession.java:125)
    - locked <0x00002b60227853b8> (a org.jboss.web.tomcat.service.session.SessionBasedClusteredSession)
    at org.jboss.web.tomcat.service.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1153)
    at org.jboss.web.tomcat.service.session.JBossCacheManager.storeSession(JBossCacheManager.java:702)
    - locked <0x00002b60227853b8> (a org.jboss.web.tomcat.service.session.SessionBasedClusteredSession)
    ...
- The "IncomingPacketHandler (channel=Tomcat-CLUSTER-PROD)" thread is waiting to lock an org.jboss.web.tomcat.service.session.SessionBasedClusteredSession monitor held by an http, https, or ajp connector thread:
"IncomingPacketHandler (channel=Tomcat-CLUSTER-PROD)" daemon prio=1 tid=0x00002aaaad5b6e60 nid=0x73e6 waiting for monitor entry [0x000000004697b000..0x000000004697dc90]
    at org.jboss.web.tomcat.service.session.ClusteredSession.expire(ClusteredSession.java:818)
    - waiting to lock <0x00002b60227853b8> (a org.jboss.web.tomcat.service.session.SessionBasedClusteredSession)
    ...
- JBoss is unresponsive and "java.lang.OutOfMemoryError: Java heap space" appears in the JBoss server.log.
- The heap object histogram shows retention in org.jgroups.protocols.TP$IncomingQueueEntry.
- AJP thread pool lockup. In a cluster of two servers, one server has locked up; its thread dump shows the lock around the replication of HTTP sessions, in the FC protocol.
One server is logging repeatedly:
WARN [org.jgroups.protocols.FC] (IncomingPacketHandler (channel=Tomcat-PresentationPartition):) Received two credit requests from 10.252.1.10:34333 without any intervening messages; sending 1999936 credits
The other is logging repeatedly:
WARN [org.jboss.cache.TreeCache](ajp-sgz100-sap8%2F192.168.1.10-8009-6:) node /JSESSION/localhost/test?online/82ninyZauN+3eJieuTfVHQ** not found
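To judge how often these warnings repeat on each node, the two messages above can be counted in each server.log. The following is a minimal sketch, not a supported tool: the function name `summarize` is hypothetical, and the patterns assume the log lines have exactly the wording quoted above.

```python
import re
from collections import Counter

# Patterns derived from the two warning lines quoted above (assumed wording).
FC_CREDIT = re.compile(
    r"Received two credit requests from (\S+) without any intervening messages"
)
NODE_MISSING = re.compile(r"\[org\.jboss\.cache\.TreeCache\].*node (\S+) not found")


def summarize(lines):
    """Count FC credit-request warnings per sender and TreeCache 'not found' warnings.

    `lines` is any iterable of log lines, for example an open server.log file.
    Returns (Counter keyed by sender address, number of 'not found' warnings).
    """
    credit_by_sender = Counter()
    missing_nodes = 0
    for line in lines:
        match = FC_CREDIT.search(line)
        if match:
            credit_by_sender[match.group(1)] += 1
        elif NODE_MISSING.search(line):
            missing_nodes += 1
    return credit_by_sender, missing_nodes
```

A typical use would be `with open("log/server.log") as f: print(summarize(f))`, run once per node so the counts can be compared across the cluster.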
Resolution
See Repeated JGroups Flow Control (FC) Warning Messages for potential causes and solutions of the FC blocking situation.
JBoss EAP 4.3.x
Upgrade to the latest cumulative patch release JBoss EAP 4.3 CP10 [1] or later, which contains the fix (JBPAPP-4094).
JBoss EAP 4.2.x
Upgrade to the latest cumulative patch release JBoss EAP 4.2 CP010 or later if available, which contains the fix (JBPAPP-4094). Otherwise, request a one-off patch for the issue.
Note:
Even with this fix preventing the deadlock, poor performance can still result from the root issue that causes FC to block.
Both full release zips and patch zips are available. A patch zip is applied by replacing jars and updating XML files.
Diagnostic Steps
- Obtain the following:
  - a series of thread dumps from all nodes in the cluster
    - How do I generate a Java thread dump on Linux/Unix?
    - How do I generate a Java thread dump on Windows?
    - If any node shows the Tomcat-Cluster channel's IncomingPacketHandler thread (the channel name can vary) waiting to lock the same session object that is held by one of the threads blocked in FC, then a deadlock has occurred.
  - log/server.log from all nodes in the cluster
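The thread dump check described above can be sketched as a small script. This is a minimal illustration under stated assumptions, not an official diagnostic: it assumes a HotSpot-style dump with one frame per line and each thread header starting with the quoted thread name, and the function names `split_threads` and `find_fc_session_deadlocks` are hypothetical.

```python
import re

LOCKED = re.compile(r"- locked <(0x[0-9a-fA-F]+)>")
WAITING = re.compile(r"- waiting to lock <(0x[0-9a-fA-F]+)>")
THREAD_HEADER = re.compile(r'^"([^"]+)"')


def split_threads(dump_text):
    """Yield (thread_name, stack_text) for each thread in the dump."""
    name, buf = None, []
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            if name is not None:
                yield name, "\n".join(buf)
            name, buf = match.group(1), [line]
        elif name is not None:
            buf.append(line)
    if name is not None:
        yield name, "\n".join(buf)


def find_fc_session_deadlocks(dump_text):
    """Return (fc_thread, handler_thread, monitor) triples.

    A triple is reported when an IncomingPacketHandler thread is waiting to
    lock a monitor that is held by a thread blocked in FC.handleDownMessage.
    """
    held_by_fc = {}  # monitor address -> name of the thread blocked in FC
    wanted = []      # (IncomingPacketHandler thread name, monitor address)
    for name, stack in split_threads(dump_text):
        if "org.jgroups.protocols.FC.handleDownMessage" in stack:
            for addr in LOCKED.findall(stack):
                held_by_fc[addr] = name
        if name.startswith("IncomingPacketHandler"):
            for addr in WAITING.findall(stack):
                wanted.append((name, addr))
    return [(held_by_fc[a], handler, a) for handler, a in wanted if a in held_by_fc]
```

Running `find_fc_session_deadlocks(open("dump.txt").read())` on each dump in the series (the file name is only an example) returns an empty list when the pattern is absent. A non-empty result on several consecutive dumps suggests the threads are stuck rather than briefly contended.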