NAKACK "message <ip::messageID> not found in retransmission table" where messageID is too low

Solution Unverified - Updated

Environment

  • Red Hat JBoss Enterprise Application Platform (EAP)
    • 4.x.x
    • 5.x.x

Issue

  • A message similar to the following appears in the logs:

      WARN  [org.jgroups.protocols.pbcast.NAKACK] (OOB-831,<local_ip>:<port>) (requester=<requester_ip>:<port>, local_addr=<local_ip>:<port>) message <local_ip>:<port>::<message_id> not found in retransmission table of <local_ip>:<port>:        
      [10 : 13 (4) (size=4, missing=0, highest stability=10)]
    

Resolution

  • Identify and correct the root issue that caused the JVM to pause for an extended period of time.

Root Cause

A member of the cluster experienced a long JVM pause, was kicked out of the cluster, and was incorrectly merged back in.

The most common cause of a member being kicked out is a long garbage collection pause.
Less common causes include the JVM creating a heap dump, a virtual machine starving the process of resources, and extensive OS memory swapping.

Diagnostic Steps

If the requested messageIDs are slightly lower than the current range of messages, and never higher, then this is the issue.
For example, with the following log, requests for messageIDs of 9 or lower, and never for messageIDs greater than 13, indicate this issue.

[10 : 13 (4) (size=4, missing=0, highest stability=10)]

If any log entries request a higher messageID, or the requested messages begin at 1, see NAKACK "message <ip::messageID> not found in retransmission table" where messageID is too high.
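
The comparison described above can be sketched as a small script. This is only an illustration, not a supported tool: it assumes the log entry has the same layout as the sample in the Issue section, the function name classify is hypothetical, and the address 192.0.2.10:7600 is a made-up placeholder. It does not check whether the requested messages begin at 1.

```python
import re

# Requested messageID, e.g. "message 192.0.2.10:7600::9 not found in retransmission table"
REQUEST_RE = re.compile(r"message \S+::(\d+) not found in retransmission table")
# Current range of the retransmission table, e.g. "[10 : 13 (4) (size=4, ...)]"
RANGE_RE = re.compile(r"\[(\d+) : (\d+) \(")


def classify(entry):
    """Compare the requested messageID with the table range in one log entry.

    Returns "too low", "too high", "in range", or None when the entry
    does not look like the NAKACK warning.
    """
    request = REQUEST_RE.search(entry)
    table = RANGE_RE.search(entry)
    if request is None or table is None:
        return None
    message_id = int(request.group(1))
    low, high = int(table.group(1)), int(table.group(2))
    if message_id < low:
        return "too low"
    if message_id > high:
        return "too high"
    return "in range"


# Hypothetical entry built from the sample in the Issue section
sample = (
    "WARN  [org.jgroups.protocols.pbcast.NAKACK] (OOB-831,192.0.2.10:7600) "
    "(requester=192.0.2.11:7600, local_addr=192.0.2.10:7600) "
    "message 192.0.2.10:7600::9 not found in retransmission table of 192.0.2.10:7600:\n"
    "[10 : 13 (4) (size=4, missing=0, highest stability=10)]"
)
print(classify(sample))  # prints "too low"
```

If every matching entry in the logs is classified as "too low" and none as "too high", the logs fit the pattern described in this article.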

