Migrating OpenShift Logging Operator log store from Elasticsearch to Loki in Red Hat OpenShift Container Platform 4
The following document describes how to migrate the OpenShift Logging storage service from Elasticsearch to LokiStack. The guide covers only how to switch log forwarding from Elasticsearch to LokiStack; it does not include any steps for migrating data between the two stores. It aims to ensure both log storage stacks run in parallel until the informed user can confidently shut down Elasticsearch.
In summary, after applying the following steps:
- The old logs will still be served by Elasticsearch and be visible only through Kibana.
- The new logs will be served by LokiStack and be visible through the OpenShift Console logs pages (e.g. Admin -> Observe -> Logs).
Assumptions
- The Red Hat OpenShift Logging Operator is already installed and upgraded to the latest available version for your current OpenShift cluster version that still supports Elasticsearch for log storage. The last version that supports Elasticsearch is in the OpenShift Logging 5.8.z stream.
- The OpenShift Elasticsearch Operator is installed and in use by the current ClusterLogging instance
Loki Prerequisites
Storage
Loki uses different types of storage for its long-term and temporary or short-term storage needs.
Long-term storage
Long-term storage requires that Loki has access to a supported object store. This includes:
- AWS S3
- Google Cloud Storage
- Azure
- Swift
- S3-compatible (such as MinIO)
- OpenShift Data Foundation
An appropriate object store must be chosen and prepared before attempting to install Loki. As part of the Loki installation process, the secrets containing the credentials for accessing the object store must be created before creating the LokiStack instance, along with other resources such as ObjectBucketClaims and ConfigMaps, depending on the chosen object store provider.
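For example, when using OpenShift Data Foundation, an ObjectBucketClaim can provide the bucket for Loki. The following is a minimal sketch; the claim name is illustrative, and the storage class name should match your ODF installation:

```yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: loki-bucket-odf          # illustrative name
  namespace: openshift-logging
spec:
  generateBucketName: loki-bucket-odf
  storageClassName: openshift-storage.noobaa.io
```

The resulting ConfigMap and Secret created by ODF contain the bucket name, endpoint, and credentials used later when creating the object store secret for LokiStack.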
Short-term storage
Short-term storage uses standard PersistentVolumeClaims and is configured by specifying an appropriate StorageClass. Block storage is preferred for performance reasons; a StorageClass that provides object storage cannot be used.
CPU and Memory Requirements
See the "Loki sizing" table in the LokiStack deployment sizing section of the official Red Hat OpenShift Logging documentation.
NOTE: By default, Loki deploys its workloads on worker nodes unless appropriate nodeSelectors and tolerations are specified in the LokiStack instance when it is created. If your cluster has dedicated infrastructure nodes, see the "Scheduling Loki components on Infrastructure nodes" section below.
Current Stack
Note: if Fluentd is the collector type, consider reading the Red Hat Knowledge Base article "Migrating the log collector from Fluentd to Vector reducing the number of logs duplicated in RHOCP 4".
Assume the current stack looks like the example below, which represents a fully managed OpenShift Logging stack with Elasticsearch as the log store and Kibana for visualization, including collection, forwarding, storage, and visualization.
Disclaimer: your stack might vary regarding resources, nodes, tolerations, selectors, collector type, and backend storage used.
```yaml
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    elasticsearch:
      nodeCount: 3
      storage:
        storageClassName: gp2
        size: 80Gi
      resources:
        requests:
          memory: 16Gi
        limits:
          memory: 16Gi
      redundancyPolicy: "SingleRedundancy"
    retentionPolicy:
      application:
        maxAge: 24h
      audit:
        maxAge: 24h
      infra:
        maxAge: 24h
  visualization:
    type: "kibana"
    kibana:
      replicas: 1
  collection:
    [...]
```
If using ClusterLogForwarder to forward audit logs
When using the "Forwarding audit logs to the log store" guide to forward audit logs to the default log store, there is no need to change anything on the ClusterLogForwarder resource. The collector pods will be reconfigured to forward new audit logs to LokiStack, too.
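For reference, a ClusterLogForwarder that sends audit logs to the default log store looks roughly like the following minimal sketch (the pipeline name is illustrative):

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  pipelines:
  - name: audit-logs    # illustrative name
    inputRefs:
    - audit
    outputRefs:
    - default           # the log store managed by ClusterLogging
```

Because the `default` output always points at the log store managed by ClusterLogging, it follows the switch from Elasticsearch to LokiStack automatically.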
Installing and configuring Loki
Step 1: Install the Loki Operator
The Loki Operator can be installed using one of two different methods:
- via the OpenShift Container Platform web console
- from the command line (CLI)
Step 2: Create the Object Store secret
Create the secret containing the details for accessing your chosen object store provider by following the appropriate section under Loki object storage in the official Red Hat OpenShift Logging documentation.
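For example, for AWS S3 the secret typically looks like the following sketch; replace the placeholder values, and check the documentation for the exact keys your provider requires:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: logging-loki-aws
  namespace: openshift-logging
stringData:
  access_key_id: <AWS_ACCESS_KEY_ID>
  access_key_secret: <AWS_SECRET_ACCESS_KEY>
  bucketnames: <BUCKET_NAME>
  endpoint: https://s3.<AWS_REGION>.amazonaws.com
  region: <AWS_REGION>
```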
Step 3: Create the LokiStack instance
Define the following variables based on your environment, requirements, and the configurations from the previous steps:
| Variable | Description | Examples |
|---|---|---|
| LOKI_SIZE | See the "Loki sizing" table | 1x.extra-small, 1x.small, 1x.medium |
| LOKI_LONGTERM_STORAGE_TYPE | See the "Secret type quick reference" table | s3, azure, gcs, swift |
| LOKI_LONGTERM_STORAGE_SECRET | From the previous "Create the Object Store secret" step | logging-loki-aws, logging-loki-azure, logging-loki-odf, etc |
| LOKI_SHORTTERM_STORAGECLASS | StorageClass for a block or file storage provisioner in the cluster | thin, gp2, gp3, managed-premium, etc |
Substitute the variables into the basic LokiStack YAML template below:
```yaml
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  size: ${LOKI_SIZE}
  storage:
    schemas:
    - version: v13
      effectiveDate: "2022-06-01"
    secret:
      name: ${LOKI_LONGTERM_STORAGE_SECRET}
      type: ${LOKI_LONGTERM_STORAGE_TYPE}
  storageClassName: ${LOKI_SHORTTERM_STORAGECLASS}
  tenants:
    mode: openshift-logging
```
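The substitution can be done with plain shell variables and a heredoc, for example as follows. The values below are illustrative assumptions; pick the ones matching your environment, and review the rendered file before creating it with `oc apply -f lokistack.yaml`:

```shell
# Example values; adjust these assumptions to your environment.
export LOKI_SIZE="1x.small"
export LOKI_LONGTERM_STORAGE_TYPE="s3"
export LOKI_LONGTERM_STORAGE_SECRET="logging-loki-aws"
export LOKI_SHORTTERM_STORAGECLASS="gp3"

# Render the template with the variables expanded by the shell.
cat << EOF > lokistack.yaml
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  size: ${LOKI_SIZE}
  storage:
    schemas:
    - version: v13
      effectiveDate: "2022-06-01"
    secret:
      name: ${LOKI_LONGTERM_STORAGE_SECRET}
      type: ${LOKI_LONGTERM_STORAGE_TYPE}
  storageClassName: ${LOKI_SHORTTERM_STORAGECLASS}
  tenants:
    mode: openshift-logging
EOF
```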
Advanced LokiStack configurations
Scheduling Loki components on Infrastructure nodes
To schedule all Loki workloads on Infrastructure nodes, the appropriate nodeSelector and tolerations must be applied. For standard "infra" nodes that have a taint node-role.kubernetes.io/infra, the following should be appended to the end of the basic LokiStack YAML above. Note that the template section should start at the same indentation level as tenants in that YAML:
```yaml
  template:
    compactor:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    distributor:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    gateway:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    indexGateway:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    ingester:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    querier:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    queryFrontend:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
    ruler:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/infra
        operator: Exists
      - effect: NoExecute
        key: node-role.kubernetes.io/infra
        operator: Exists
```
See the Red Hat OpenShift Logging documentation section on Loki pod placement for more details.
Disconnect Elasticsearch and Kibana CRs from ClusterLogging
To ensure Elasticsearch and Kibana continue to run on the Cluster while we switch ClusterLogging from them to LokiStack/OpenShift Console, we need to disconnect the custom resources from being owned by ClusterLogging.
Step 1: Temporarily set ClusterLogging to State Unmanaged
$ oc -n openshift-logging patch clusterlogging/instance -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
Step 2: Remove ClusterLogging OwnerReferences from Elasticsearch resource
The following command ensures that the ClusterLogging resource no longer owns the Elasticsearch resource. As a result, updates to the ClusterLogging resource's logStore field will no longer be applied to the Elasticsearch resource.
$ oc -n openshift-logging patch elasticsearch/elasticsearch -p '{"metadata":{"ownerReferences": []}}' --type=merge
Step 3: Remove ClusterLogging OwnerReferences from Kibana resource
The following command ensures that the ClusterLogging resource no longer owns the Kibana resource. As a result, updates to the ClusterLogging resource's visualization field will no longer be applied to the Kibana resource.
$ oc -n openshift-logging patch kibana/kibana -p '{"metadata":{"ownerReferences": []}}' --type=merge
Step 4: Backup Elasticsearch and Kibana resources
To ensure that no accidental deletion destroys the previous storage and visualization components, namely Elasticsearch and Kibana, the following steps describe how to back up the resources (this requires the small yq utility, available on github.com):
Elasticsearch:
$ oc -n openshift-logging get elasticsearch elasticsearch -o yaml \
  | yq -y 'del(.status, .metadata.resourceVersion, .metadata.uid, .metadata.generation, .metadata.creationTimestamp, .metadata.selfLink)' > /tmp/cr-elasticsearch.yaml
Kibana:
$ oc -n openshift-logging get kibana kibana -o yaml \
  | yq -y 'del(.status, .metadata.resourceVersion, .metadata.uid, .metadata.generation, .metadata.creationTimestamp, .metadata.selfLink)' > /tmp/cr-kibana.yaml
Switch ClusterLogging to LokiStack
Step 1: Switch log storage to LokiStack
The following manifest applies several changes to the ClusterLogging resource:

- It sets the management state back to `Managed`.
- It switches the `logStore` spec from `elasticsearch` to `lokistack`. In turn, this restarts the collector pods, which start forwarding logs to `lokistack` from now on.
- It removes the `visualization` spec. In turn, the cluster-logging-operator installs the `logging-view-plugin` that enables observing `lokistack` logs in the OpenShift Console.
- It replaces the current `spec.collection` section with the one available in the running cluster.

```
$ cat << EOF | oc replace -f -
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "lokistack"
    lokistack:
      name: logging-loki
  collection: # <-- replace with the current collection configuration
    [...]
  visualization: # Keep this section as long as you need to keep Kibana.
    kibana:
      replicas: 1
    type: kibana
EOF
```
Step 2: Re-instantiate Kibana resource
Because the previous step removed the visualization field entirely so that the operator installs the OpenShift Console integration, the same operator will remove the Kibana resource, too. This is a non-critical issue as long as we have a backup of the Kibana resource. The reason is that the operator removes the Kibana resource named kibana from openshift-logging automatically without checking any owner references. This behavior was correct as long as Kibana was the only supported visualization component in OpenShift Logging.
$ oc -n openshift-logging apply -f /tmp/cr-kibana.yaml
Step 3: Enable the console view plugin
If the console view plugin is not yet enabled, enable it to view the logs integrated in the RHOCP Console -> Observe -> Logs pages. Note that the following merge patch replaces any existing spec.plugins list; if other console plugins are already enabled, include them in the patched list as well.
$ oc patch consoles.operator.openshift.io cluster --type=merge --patch '{ "spec": { "plugins": ["logging-view-plugin"] } }'
Delete the Elasticsearch stack
When the retention period for the logs stored in the Elasticsearch log store has expired and no more logs are visible in the Kibana instance, it is possible to remove the old stack to release resources.
Step 1: Delete Elasticsearch and Kibana resources
$ oc -n openshift-logging delete kibana/kibana elasticsearch/elasticsearch
Step 2: Delete the PVCs used by the Elasticsearch instances
$ oc delete -n openshift-logging pvc -l logging-cluster=elasticsearch