New eviction policy TinyLFU since RHDG 7.3

Solution Verified - Updated

Environment

  • Red Hat Data Grid (RHDG)
    • 7.3
    • 8

Issue

  • The eviction strategy can no longer be configured in DG 7.3 as it could in former versions. Which strategy is used and how does it work?
  • What are the benefits of the TinyLFU eviction strategy?
  • Unexpected evictions are seen for a cache. What is the reason?
  • A cache configured without persistence but with expiration and eviction enabled shows a size that counts down. Is this a bug?

Resolution

LFU is a popular and efficient strategy for evicting entries from cache memory.
RHDG 7.3 uses Caffeine with the TinyLFU algorithm to provide efficient eviction. More details and a comparison with other policies can be found in the Caffeine Efficiency page on github.com.

However, this algorithm is tuned for typical cache access, with few writes and many reads per entry. Each entry earns bonus points when it is accessed, and its age is also taken into account.
If the cache is used this way, TinyLFU is efficient and evicts the entries that are not used.
There are some edge cases, however, where this can have unwanted side effects:

  • High usage after an entry is added, but abandoned after a while
    The access bonus becomes very high and can protect the entry from eviction, because entries that are used later might never reach a better bonus.
  • High number of writes, a similar number of reads for every entry, or write once and read rarely or never
    If many entries share the same access pattern, they all reach the same bonus and age becomes the deciding factor, so the oldest entries are kept and the newest entries are evicted.
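
The second edge case can be illustrated with a minimal sketch of a frequency-based admission rule. This is a toy model and not Caffeine's implementation: the names `should_admit` and `insert` are hypothetical, and the sketch assumes that a newcomer replaces an old resident only when its access count is strictly higher.

```python
from collections import Counter

def should_admit(candidate_freq: int, victim_freq: int) -> bool:
    """Toy admission rule: the newcomer wins only with a strictly higher count."""
    return candidate_freq > victim_freq

def insert(cache: dict, freq: Counter, capacity: int, key: str) -> bool:
    """Try to add `key`; return True if it ends up in the cache."""
    freq[key] += 1                      # the write counts as one access
    if key in cache or len(cache) < capacity:
        cache[key] = True
        return True
    victim = next(iter(cache))          # oldest resident (dicts keep insertion order)
    if should_admit(freq[key], freq[victim]):
        del cache[victim]
        cache[key] = True
        return True
    return False                        # tie or lower count: the newcomer is dropped

cache, freq = {}, Counter()
results = [insert(cache, freq, capacity=3, key=f"k{i}") for i in range(5)]
print(results)        # [True, True, True, False, False]
print(list(cache))    # ['k0', 'k1', 'k2']
```

Because every key has been accessed exactly once, each newcomer ties with the oldest resident and loses, which matches the behaviour described above: the oldest entries stay and the newest are dropped.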

Because using eviction without persistence is not recommended (entries can be lost, see this article), this is considered 'only' a performance issue.

A much worse side effect occurs when eviction is combined with expiration and an access pattern with few reads.
Consider a configuration with eviction and expiration where the expiration is set with a lifespan and the interval of the expiration reaper is long compared to the expected expirations.
In this case the expired entries are old and might have the same read count as the active entries. As a result, the active entries are evicted instead of the already expired ones, because the expired entries have a better bonus due to their age.
An extreme example illustrates this.
The configuration is as follows:

  • eviction count=120
  • expiration lifespan 60sec, but reaper disabled with interval=-1
  • no persistence to store evicted entries
  • entries are added but rarely or never read
  • one entry is added every second

What happens:

  • 1st minute
    • size increases to 60, no eviction or expiration
  • 2nd minute
    • size stays at ~60 with no eviction; expired entries are no longer counted in the size, BUT they are NOT removed
  • 3rd minute
    • size stays at ~60, the eviction count starts to rise; at this point eviction hits expired entries
  • 4th minute
    • because the old entries are OLD, Caffeine does not prioritize them for eviction
    • because no entry is read, the usage counters do not differ, so no entry gets a higher vote in Caffeine
    • the youngest, not yet expired entries are evicted, so the size counts down to 1...3 because ~120 expired entries are still in memory but not counted
  • further
    • size might go up and down a bit over time but does not return to the expected 60; the exact value depends on how eviction proceeds
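
The effect in the extreme example above can be reproduced qualitatively with a small simulation. This is a toy model under stated assumptions, not Caffeine or RHDG code: the function `simulate` and its parameters are hypothetical, the newest entries are assumed to sit in a small admission window (here one entry), and a newcomer is assumed to lose against an old resident when both have the same access count. The point at which the countdown starts in the real product depends on Caffeine's internals and will differ from this sketch.

```python
from collections import OrderedDict

def simulate(seconds, capacity=120, lifespan=60, window=1):
    """Toy model: one write per second, no reads, no expiration reaper.

    Returns the visible size (non-expired entries) after each second.
    """
    main = OrderedDict()     # key -> insert time; holds live AND expired entries
    recent = OrderedDict()   # small admission window for the newest entries
    sizes = []
    for now in range(seconds):
        recent[now] = now                       # the key is the insert time
        if len(recent) > window:
            candidate, born = recent.popitem(last=False)
            if len(main) < capacity - window:
                main[candidate] = born          # free space: always admitted
            # Otherwise the cache is full. All entries have the same access
            # count, so the candidate loses the tie and is dropped, while
            # the expired residents keep their slots.
        born_times = list(main.values()) + list(recent.values())
        sizes.append(sum(1 for born in born_times if now - born < lifespan))
    return sizes

sizes = simulate(300)
for second in (60, 120, 180, 240, 300):
    print(f"after {second}s: visible size = {sizes[second - 1]}")
```

In this model the visible size reaches 60, stays there while the cache fills up with expired entries, and then collapses to the size of the admission window because only the newest entry is still alive.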

This extreme case shows how eviction works with that pattern. If persistence is configured, the entries are not lost and the size count is correct, but performance drops because the active entries need to be reloaded from the store.
Keep in mind that, in the worst case, entries that expire within the reaper interval still burden the memory. In addition, a huge number of evictions and/or a busy or blocked reaper might make the eviction even worse.

To prevent this scenario, the expiration interval should be shorter.
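
As a rough estimate of why a shorter interval helps, the worst-case share of the eviction capacity held by expired entries can be approximated as write rate times reaper interval, divided by the capacity. The helper `stale_share` below is hypothetical and assumes a steady write rate, a uniform lifespan, and no reads that would remove expired entries earlier.

```python
def stale_share(writes_per_sec: float, reaper_interval_sec: float, capacity: int) -> float:
    """Worst-case fraction of the eviction capacity held by expired entries."""
    # Entries expire at the same rate they are written, so between two
    # reaper runs up to rate * interval expired entries can pile up.
    stale = min(writes_per_sec * reaper_interval_sec, capacity)
    return stale / capacity

for interval in (60, 10, 1):
    print(interval, round(stale_share(1, interval, 120), 3))
# 60 0.5
# 10 0.083
# 1 0.008
```

With the numbers from the example (one write per second, count=120), a 60 second interval lets expired entries occupy up to half of the capacity, while a 1 second interval keeps that share below one percent.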

Root Cause

With RHDG 7.3 eviction relies on Caffeine, which uses the TinyLFU (Least Frequently Used) approach. The strategy assigns bonuses for usage and age to choose the best entries to evict.
This is beneficial in most use cases but can cause unexpected behaviour in edge cases, as explained in the resolution.


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.