Increased memory usage during JBoss EAP buddy replication state transfer

Solution Verified - Updated 7 Aug 2024

Environment

Red Hat JBoss Enterprise Application Platform (EAP)
- 4.x, 5.x
Clustered configuration with buddy replication

Issue

The following exception is found in the JBoss server log:

ERROR [org.jboss.cache.buddyreplication.BuddyManager] Caught exception handling view change org.jboss.cache.CacheException: java.lang.OutOfMemoryError: GC overhead limit exceeded

Killing one node in a cluster results in "OutOfMemoryError: GC overhead limit exceeded" or "OutOfMemoryError: Java heap space" on the remaining nodes.
A heap dump shows that most of the retained heap is held by a single AsyncViewChangeHandlerThread, in a byte[] and/or an ExposedByteArrayOutputStream.

Resolution

Any of the following can resolve the issue:

Increase Java heap (-Xmx).
Decrease the number or size of sessions (for example, lower the session timeout if sessions are short-lived and load is evenly distributed).
Disable session replication (typically not a good long-term solution in production).
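
To judge how much there is to gain from reducing session size, it can help to measure roughly how large a representative session is when serialized. The following is a minimal sketch, not part of JBoss EAP or JBoss Cache: the class name SessionSizeEstimate and the sample attributes are hypothetical, and it uses standard Java serialization, which only approximates what the cache marshaller writes during replication.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical helper for illustration; not a JBoss EAP or JBoss Cache API.
final class SessionSizeEstimate {

    // Returns the number of bytes produced by standard Java serialization.
    // Assumption: this is only a rough proxy for the replicated size.
    static long serializedSize(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // Stand-in for the attributes an application keeps in one HTTP session.
        HashMap<String, Serializable> attributes = new HashMap<>();
        attributes.put("userId", "jdoe");
        attributes.put("cart", new int[1000]);
        System.out.println("Approximate serialized size: "
                + serializedSize(attributes) + " bytes");
    }
}
```

Multiplying the result by the expected number of concurrent sessions gives a rough lower bound on the session data the heap has to hold.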

Root Cause

When a node goes down, each node that had the failed node as a buddy has to find a new buddy. Memory usage on the affected nodes increases temporarily because of the serialization, transfer, and deserialization needed to transfer state. After the redistribution, memory usage stays permanently higher, because there are fewer nodes in the cluster among which to distribute state, so each node has to store more.
A node joining the cluster can cause the same issue, because it shifts buddy assignments and triggers state transfer.
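
The effect described above can be illustrated with some rough arithmetic. This is a back-of-the-envelope sketch under stated assumptions, not a model of the actual JBoss Cache implementation: sessions are assumed to be spread evenly, each node is assumed to back up as many nodes as it has buddies, and the serialized copy created during state transfer is assumed to be about as large as the node's own share. The class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope model; not a JBoss API.
final class BuddyMemoryEstimate {

    // Steady state: each node holds its own share of the session data
    // plus one backup share per buddy (assumes an even distribution).
    static long steadyStateMbPerNode(long totalSessionMb, int nodes, int numBuddies) {
        return totalSessionMb * (1 + numBuddies) / nodes;
    }

    // During state transfer a node is assumed to also hold one serialized
    // copy of its own share (the byte[] seen in the heap dump).
    static long transferPeakMbPerNode(long totalSessionMb, int nodesAfterFailure, int numBuddies) {
        long ownShareMb = totalSessionMb / nodesAfterFailure;
        return steadyStateMbPerNode(totalSessionMb, nodesAfterFailure, numBuddies) + ownShareMb;
    }

    public static void main(String[] args) {
        long totalSessionMb = 6000; // assumed total live session data in the cluster
        int numBuddies = 1;         // assumed number of buddies per node

        System.out.println("4 nodes, steady state: "
                + steadyStateMbPerNode(totalSessionMb, 4, numBuddies) + " MB per node");
        System.out.println("3 nodes, steady state: "
                + steadyStateMbPerNode(totalSessionMb, 3, numBuddies) + " MB per node");
        System.out.println("3 nodes, transfer peak: "
                + transferPeakMbPerNode(totalSessionMb, 3, numBuddies) + " MB per node");
    }
}
```

With these example figures, losing one of four nodes raises the steady-state session data per node from 3000 MB to 4000 MB, with a temporary peak of about 6000 MB while state is being transferred. A heap sized for the four-node steady state would therefore be too small in this scenario.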

Diagnostic Steps

Review server.log from all nodes.
Review the JVM options to determine whether the heap is undersized relative to the session cache for the expected load.
Troubleshoot with a heap dump and additional steps as described in Java application "java.lang.OutOfMemoryError: Java heap space"
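
When reviewing server.log from several nodes, a small filter can make the relevant entries easier to spot. The following is a minimal sketch, assuming the log is UTF-8 text and small enough to read into memory; the class name ServerLogScan is hypothetical, and the match is a plain substring search for the error text shown in the Issue section.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not shipped with JBoss EAP.
final class ServerLogScan {

    // Returns every line that mentions an OutOfMemoryError, which includes
    // the BuddyManager view change error quoted in the Issue section.
    static List<String> findOomLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (line.contains("OutOfMemoryError")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ServerLogScan <path to server.log>");
            return;
        }
        // Assumption: the log is UTF-8 and fits in memory.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (String hit : findOomLines(lines)) {
            System.out.println(hit);
        }
    }
}
```

Running it against the log of each node and comparing timestamps with the time a node left or joined the cluster helps confirm that the errors coincide with a view change.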

Product(s)

Red Hat JBoss Enterprise Application Platform

Category

Troubleshoot
