Deprecated service type Cache in DG 8 in OCP 4

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (OCP)
    • 4.x
  • Red Hat Data Grid (RHDG)
    • 8.x
    • Operator

Issue

  • What is the default garbage collector for type Cache?
  • Why is type Cache being deprecated?
  • How to migrate?
  • What are the limitations for type Cache?

Resolution

Do not use service type Cache in production environments, for the reasons described below.

Type Cache is set via spec.service.type: Cache (example below) and differs from type DataGrid, which replaces it. The DataGrid service type continues to benefit from new features and improved tooling
to automate complex operations such as cluster upgrades and data migration; see the Data Grid Operator Guide 8.3.

Service type Cache is deprecated and should be avoided; use type DataGrid instead. Cache is nevertheless still the default value of spec.service.type, in order to maintain backwards compatibility on upgrades from other versions.
Also note that the Cache type has the following characteristics:

  • MaxRAM is set to 420 MB.
  • The garbage collector is Serial GC, same as the Gossip Router. Regardless of the number of CPUs, type Cache uses the Serial GC collector; this was chosen for footprint reasons.
  • A default cache is created, whereas with type DataGrid the user has to define all of their caches.
  • Cache entries are stored off-heap by default.
  • The Cache service type comes with neither encryption nor authentication.

Example:

- apiVersion: infinispan.org/v1
  kind: Infinispan
  metadata:
    name: ${CLUSTER_NAME}
    namespace: ${CLUSTER_NAMESPACE}
  spec:
    service:
      type: Cache

The Cache service type comes with neither encryption nor authentication:

spec:
  security:
    endpointAuthentication: false
    endpointEncryption:
      clientCert: None
      type: None

Migration to service type DataGrid

Creating an Infinispan CR without specifying a service type results in service type Cache rather than DataGrid.
If a Cache type cluster was created by mistake, the user can set the replicas to zero and edit the Custom Resource directly, changing spec.service.type: Cache to spec.service.type: DataGrid. In other words, the service type can be modified after the cluster was created. The new pods are then expected to start without the Cache-specific properties described above (MaxRAM 420 MB, Serial GC, and so on). In most cases, however, this only happens after deleting the statefulset that spawns the DG pods.
After the change, any section that is only relevant for type Cache is ignored, for instance service.replicationFactor.
As an alternative, a new custom resource can be created with service.type: DataGrid, and then the data must be migrated.

objects:
- apiVersion: infinispan.org/v1
  kind: Infinispan
  metadata:
    name: ${CLUSTER_NAME}
    namespace: ${CLUSTER_NAMESPACE}
    annotations:
      infinispan.org/monitoring: 'false'
    labels:
      type: middleware
      prometheus_domain: ${CLUSTER_NAME}
  spec:
...
    service:
      type: DataGrid

Migration steps:

Given the above context, below is the streamlined migration process:

  1. Set replicas to zero on the Infinispan Custom Resource
  2. Edit service.type from Cache to DataGrid (a misspelled value is not a concern, because only those two types are accepted)
  3. Delete the statefulset: oc delete sts $sts_name (you can get the statefulset name via oc get sts)
  4. Set the replicas on the Infinispan custom resource back to 1 or more
  5. Verify: the new pods should not have -Xmx200M, -Xms200M, -XX:MaxRAM=420M and should instead show Xmx as below (Xmx is calculated as half of the container size):
21:58:31,341 INFO  (main) [BOOT] JVM OpenJDK 64-Bit Server VM Red Hat, Inc. 11.0.17+8-LTS
21:58:31,347 INFO  (main) [BOOT] JVM arguments = [-server, -Xmx512m, -XX:+ExitOnOutOfMemoryError, -XX:MetaspaceSize=32m.. <--- shows -Xmx512m and no MaxRAM

WARNING:

If the procedure above is not followed, the statefulset (and therefore the pods) can get stuck in a state that diverges from the Infinispan CR, as described in Data Grid 8 Operator pod/statefulset is stuck. See that solution for more details.
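
The migration steps above can be sketched as a small dry-run helper that only prints the oc commands, so they can be reviewed and then run by hand one at a time. This is a minimal sketch, not an official procedure: the function name print_migration_commands is hypothetical, the CR name, namespace and replica count are placeholders, and it assumes the statefulset has the same name as the Infinispan CR (as in the example-infinispan output shown in Diagnostic Steps).

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not run) the oc commands
# matching migration steps 1 to 5.
# Assumption: the statefulset is named after the Infinispan CR.
print_migration_commands() {
  cr="$1"        # name of the Infinispan CR (placeholder)
  ns="$2"        # namespace of the CR (placeholder)
  replicas="$3"  # replica count to restore in step 4

  # Step 1: scale the cluster down to zero
  printf '%s\n' "oc -n $ns patch infinispan $cr --type merge -p '{\"spec\":{\"replicas\":0}}'"
  # Step 2: switch the service type from Cache to DataGrid
  printf '%s\n' "oc -n $ns patch infinispan $cr --type merge -p '{\"spec\":{\"service\":{\"type\":\"DataGrid\"}}}'"
  # Step 3: delete the statefulset so it is recreated with the new settings
  printf '%s\n' "oc -n $ns delete sts $cr"
  # Step 4: scale the cluster back up
  printf '%s\n' "oc -n $ns patch infinispan $cr --type merge -p '{\"spec\":{\"replicas\":$replicas}}'"
  # Step 5: inspect the JVM arguments of the first new pod
  printf '%s\n' "oc -n $ns logs ${cr}-0 | grep 'JVM arguments'"
}

print_migration_commands example-infinispan my-namespace 1
```

Printing the commands instead of executing them is a deliberate choice here: each step should be confirmed (for example, that all pods are gone after step 1) before moving on to the next one.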

Root Cause

Avoid deprecated service type Cache

First, MaxRAM is set for service type Cache and its value is hard-coded. This does not happen with service type DataGrid.
Secondly, the Cache service uses Serial GC. The reason is that the JVM is restricted to the smallest footprint possible, with cache entries stored in off-heap memory. In fact, Serial GC is the default for any pod that has spec.container.cpu: 1, as described in Default JVM GC collector on runtime container.
This is true for any pod deployed by the operator: cluster pod, config-listener, controller, and Gossip Router (GR) pod. Any pod that has cpu: 1 will use Serial GC unless overridden. Also, for OpenJDK 17 images, Serial GC might be chosen if the memory of the container is 1Gi or less, regardless of the number of CPUs (i.e. even with 2 or more CPUs the collector will be Serial GC).

Example below with the default JVM arguments of a Cache type pod:

20:32:05,878 INFO  (main) [BOOT] JVM OpenJDK 64-Bit Server VM Red Hat, Inc. 11.0.16+8-LTS
20:32:05,885 INFO  (main) [BOOT] JVM arguments = [-server, -Dinfinispan.zero-capacity-node=false, -Xmx200M, -Xms200M, -XX:MaxRAM=420M, -Dsun.zip.disableMemoryMapping=true, -XX:+UseSerialGC, -XX:MinHeapFreeRatio=5, -XX:MaxHeapFreeRatio=10, -Xlog:gc*=info:file=/tmp/bananagc.log:time,level,tags,uptimemillis:filecount=10,filesize=1m, -Dcom.redhat.fips=false, -Dio.netty.allocator.type=unpooled, -XX:+ExitOnOutOfMemoryError, -XX:MetaspaceSize=32m, -XX:MaxMetaspaceSize=96m, -Djava.net.preferIPv4Stack=true, -Djava.awt.headless=true, -Dvisualvm.display.name=redhat-datagrid-server, -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager, -Dinfinispan.server.home.path=/opt/infinispan, -classpath, :/opt/infinispan/boot/infinispan-server-runtime-13.0.10.Final-redhat-00001-loader.jar, org.infinispan.server.loader.Loader, org.infinispan.server.Bootstrap, --bind-address=0.0.0.0, -l, /opt/infinispan/server/conf/operator/log4j.xml, -c, operator/infinispan-base.xml, -c, user/infinispan-config.yaml, -c, operator/infinispan-admin.xml]
20:32:05,885 INFO  (main) [BOOT] PID = 173

Diagnostic Steps

  1. Inspect the Infinispan custom resource and check the service type:
  service:
    type: DataGrid   <--- or type: Cache
  2. replicationFactor is ignored on service.type DataGrid
  3. On migrations from Cache to DataGrid, the statefulset might need to be deleted in order for the new pods to come up with the correct settings:
...
$ oc delete sts example-infinispan
statefulset.apps "example-infinispan" deleted
...
$ oc get sts
NAME                 READY   AGE
example-infinispan   1/1     34s
...
$ oc get pod
NAME                                                      READY   STATUS    RESTARTS   AGE
example-infinispan-0                                      1/1     Running   0          35s
...
$ oc logs example-infinispan-0
03:17:34,418 INFO  (main) [BOOT] JVM OpenJDK 64-Bit Server VM Red Hat, Inc. 11.0.17+8-LTS
03:17:34,426 INFO  (main) [BOOT] JVM arguments = [-server, -Dinfinispan.zero-capacity-node=false, -Xmx512m, -XX:+ExitOnOutOfMemoryError, -XX:MetaspaceSize=32m, -XX:MaxMetaspaceSize=96m, -Djava.net.preferIPv4Stack=true, -Djava.awt.headless=true, -Dvisualvm.display.name=redhat-datagrid-server <---- this means the type is DataGrid now
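
The check on the JVM arguments line can also be scripted. The following is a minimal sketch: the function name guess_service_type is hypothetical, and it assumes that -XX:MaxRAM=420M appears only on Cache type pods, which is what the log samples in this solution show. It is a heuristic, not a replacement for reading spec.service.type on the CR.

```shell
#!/bin/sh
# Hypothetical helper: guesses the service type from a "JVM arguments"
# log line. Assumption: only Cache type pods carry -XX:MaxRAM=420M,
# as seen in the log samples of this solution.
guess_service_type() {
  case "$1" in
    *-XX:MaxRAM=420M*) echo "Cache" ;;
    *)                 echo "DataGrid" ;;
  esac
}

# Shortened argument lists taken from the log samples in this solution
guess_service_type "[-server, -Xmx200M, -Xms200M, -XX:MaxRAM=420M, -XX:+UseSerialGC]"
guess_service_type "[-server, -Xmx512m, -XX:+ExitOnOutOfMemoryError, -XX:MetaspaceSize=32m]"

# Against a live pod (pod name taken from the example output above):
#   guess_service_type "$(oc logs example-infinispan-0 | grep 'JVM arguments')"
```

The match is on MaxRAM rather than on -XX:+UseSerialGC because, as noted under Root Cause, Serial GC can also be selected on pods of other types with a single CPU.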
