JBossCache eviction queue fills up
Environment
- JBoss Enterprise Application Platform (EAP)
- JBoss Enterprise Portal Platform (EPP)
- JBoss Cache
Issue
I am using JBoss Cache and many threads on my server have become stuck waiting on JBoss Cache to retrieve data from or put data into the cache. Thread dumps show stack traces like:
EDU.oswego.cs.dl.util.concurrent.BoundedLinkedQueue.put(BoundedLinkedQueue.java:296)
org.jboss.cache.eviction.Region.putNodeEvent(Region.java:141)
org.jboss.cache.interceptors.EvictionInterceptor.doEventUpdatesOnRegionManager(EvictionInterceptor.java:149)
org.jboss.cache.interceptors.EvictionInterceptor.updateNode(EvictionInterceptor.java:122)
org.jboss.cache.interceptors.EvictionInterceptor.invoke(EvictionInterceptor.java:97)
org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
org.jboss.cache.interceptors.PessimisticLockInterceptor.invoke(PessimisticLockInterceptor.java:206)
org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
org.jboss.cache.interceptors.UnlockInterceptor.invoke(UnlockInterceptor.java:32)
org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:379)
org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:174)
org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:138)
org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5934)
org.jboss.cache.TreeCache.get(TreeCache.java:3656)
...
There may also be warnings in the log like:
eviction node event queue size is at 98% threshold value of capacity: 200000 You will need to reduce the wakeUpIntervalSeconds parameter
The data should not need to be evicted, as the cache should be large enough to hold everything. Why is the eviction queue filling up?
Resolution
- If the eviction thread is dying, correct the exception that's causing it.
- Lowering the wake-up interval (wakeUpIntervalSeconds) ensures the eviction queue is cleared more frequently, which avoids such conditions. Because it is difficult to determine which cache requires a lower wakeUpIntervalSeconds, one option is to lower wakeUpIntervalSeconds for all caches used within the JBoss environment.
- Increase the eviction eventQueueSize. EPP depends on caching for several components (JCR, PicketLink, Lucene), and it is not uncommon to require larger eviction queue sizes to avoid such blocking states.
- If the cache does not need eviction at all, for example because it holds only read-only data that is not very large, it should be possible to stop events being sent to the eviction queue by writing a custom policy (or by using a later version with the 'resident' flag):
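package example.jboss.cache;

import org.jboss.cache.Fqn;
import org.jboss.cache.eviction.FIFOPolicy;

// Eviction policy that ignores every node event, so nothing is queued for eviction.
// Declared public so the cache can instantiate it by class name from its configuration.
public class MyEvictionPolicy extends FIFOPolicy {
    // Returning true tells the cache not to add an event for this node to the eviction queue.
    public boolean canIgnoreEvent(Fqn fqn) {
        return true;
    }
}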
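As a rough way to judge how far the wake-up interval or queue size needs to change, the sketch below works through the arithmetic. It assumes a steady event rate and that each wake-up drains the queue completely; the class and method names are hypothetical, and the rate of 50,000 events per second is an illustrative figure, not a measurement. Only the capacity of 200000 comes from the warning message above.

```java
/** Back-of-the-envelope sizing helper; not part of JBoss Cache. */
final class EvictionQueueSizing {

    /** Events that accumulate between two wake-ups at a steady access rate. */
    static long eventsPerInterval(long eventsPerSecond, long wakeUpIntervalSeconds) {
        return eventsPerSecond * wakeUpIntervalSeconds;
    }

    /** Longest interval (in whole seconds) whose backlog still fits in the queue. */
    static long maxSafeIntervalSeconds(long queueCapacity, long eventsPerSecond) {
        if (eventsPerSecond <= 0) {
            throw new IllegalArgumentException("eventsPerSecond must be positive");
        }
        return queueCapacity / eventsPerSecond;
    }

    public static void main(String[] args) {
        long capacity = 200000L;      // capacity reported in the warning
        long assumedRate = 50000L;    // hypothetical get()/put() events per second

        // With a 5-second interval the backlog (250000) exceeds the capacity.
        System.out.println(eventsPerInterval(assumedRate, 5));
        // The queue only keeps up if the interval is at most 4 seconds.
        System.out.println(maxSafeIntervalSeconds(capacity, assumedRate));
    }
}
```

If the computed interval is below one second, lowering wakeUpIntervalSeconds alone is unlikely to be enough, and a larger eventQueueSize or a policy that ignores events is the more promising route.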
Root Cause
- The "eviction queue" contains events which affect eviction, including a record of every get() and put() operation. This is because some eviction policies, such as Least Recently Used, need to know when a get() operation occurs so they can update the eviction order. As a result, the eviction queue can fill up even when the cache is large enough and holds only read-only data.
- The eviction queue is processed by a thread which wakes up periodically and processes all outstanding events in the queue. The period is controlled by the "wakeUpIntervalSeconds" parameter in your *-cache-service.xml configuration file. If you are seeing problems caused by the queue filling up, try lowering the wake-up interval. There should be no noticeable performance loss from lowering it to 5 seconds or even 1 second.
- If the eviction queue thread dies, it will no longer process events off the queue and the queue will eventually fill up. In this case, there will be a log similar to:
ERROR [STDERR] Exception in thread "Timer-15"
ERROR [STDERR] com.example.SomeException: xxx
...
    at org.jboss.cache.eviction.EvictionTimerTask.run(EvictionTimerTask.java:80)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
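The mechanism described above can be illustrated with a toy model. This is a minimal sketch, not JBoss Cache code: it uses java.util.concurrent.ArrayBlockingQueue in place of the EDU.oswego BoundedLinkedQueue seen in the stack trace, and the class and method names are hypothetical. It uses offer() so the demo reports a full queue instead of blocking the way put() does.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Toy model of a bounded eviction event queue; not JBoss Cache code. */
final class EvictionQueueSketch {
    private final BlockingQueue<String> events;

    EvictionQueueSketch(int capacity) {
        this.events = new ArrayBlockingQueue<String>(capacity);
    }

    /**
     * Records one cache access (get or put). Returns false when the queue is
     * full, which is the point where a blocking put() would park the caller.
     */
    boolean recordAccess(String fqn) {
        return events.offer(fqn);
    }

    /** Stands in for one run of the eviction timer: processes all queued events. */
    int processQueue() {
        int processed = 0;
        while (events.poll() != null) {
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        EvictionQueueSketch queue = new EvictionQueueSketch(3);
        // Read-only workload: every access still produces an event.
        for (int i = 0; i < 3; i++) {
            System.out.println("accepted: " + queue.recordAccess("/data/" + i));
        }
        // No timer run has happened yet, so the fourth access finds the queue full.
        System.out.println("accepted: " + queue.recordAccess("/data/3"));
        // Once the timer task runs, callers can proceed again.
        System.out.println("processed: " + queue.processQueue());
        System.out.println("accepted: " + queue.recordAccess("/data/3"));
    }
}
```

If processQueue() is never called, which corresponds to the timer thread having died, every access after the queue fills is rejected in this model; in the real cache those callers stay blocked in put().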
Diagnostic Steps
- Troubleshoot using thread dumps and the additional steps described in "Java application unresponsive".
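Where it is possible to run code inside the affected JVM, the check for the stack pattern shown in the Issue section can be automated. The following is a hedged sketch using the standard java.lang.management API (Java 6 or later); the class and method names are hypothetical, and matching on the BoundedLinkedQueue and org.jboss.cache.eviction.Region frames is an assumption based on the stack trace above. An ordinary thread dump inspected by hand gives the same information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Finds threads whose stack matches the blocked-eviction-queue pattern. */
final class EvictionQueueBlockFinder {

    /** True if the stack contains BoundedLinkedQueue.put called from the eviction Region. */
    static boolean isBlockedOnEvictionQueue(StackTraceElement[] stack) {
        boolean inQueuePut = false;
        boolean inRegion = false;
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().endsWith(".BoundedLinkedQueue")
                    && frame.getMethodName().equals("put")) {
                inQueuePut = true;
            }
            if (frame.getClassName().equals("org.jboss.cache.eviction.Region")) {
                inRegion = true;
            }
        }
        return inQueuePut && inRegion;
    }

    /** Names of all live threads in this JVM that match the pattern. */
    static List<String> findBlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<String>();
        // Monitor and synchronizer details are not needed, only the stack frames.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (isBlockedOnEvictionQueue(info.getStackTrace())) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Threads blocked on the eviction queue: " + findBlockedThreads());
    }
}
```

A large and growing list across successive checks points to the queue being full rather than a brief spike.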
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.