How to migrate Vector checkpoints in RHOCP 4
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4
- Red Hat OpenShift Logging (RHOL)
  - 5
  - 6
- Vector
Issue
- After upgrading Logging from v5.8 to v6, duplicated logs exist in the log storage
- After upgrading Logging from v5.9 to v6, duplicated logs exist in the log storage
- After upgrading Logging, 429 Too Many Requests errors are observed and the storage used in the log store increases significantly
- After upgrading Logging, a large spike in CPU and memory usage is observed
Resolution
Scenario 1. When migrating from Fluentd to Vector
Review the Red Hat Knowledge Article "Migrating the log collector from Fluentd to Vector reducing the number of duplicated logs in RHOCP 4".
Scenario 2. When migrating using Vector from RHOL v5.8 to v6
Follow the steps in the Red Hat Knowledge Article "How to transition the collectors and the default log store from Red Hat OpenShift Logging 5 to 6". In "Step 5: Delete the ClusterLogging instance and deploy the ClusterLogForwarder observability Custom Resource", at the point "Move the Vector checkpoints for the clusterLogging CR instance", execute:
- Download the file migrate_checkpoints_v58tov6_0.txt
- Open the downloaded file and adjust the variables ns and cr. By default, the script assumes that the namespace is openshift-logging (ns variable) and that the clusterLogForwarder Custom Resource (CR) is named collector (cr variable)
- Rename it to migrate_checkpoints_v58tov6_0.sh:
$ mv migrate_checkpoints_v58tov6_0.txt migrate_checkpoints_v58tov6_0.sh
- Give execution permissions:
$ chmod 755 migrate_checkpoints_v58tov6_0.sh
- Execute it:
$ ./migrate_checkpoints_v58tov6_0.sh
- The script above doesn't copy checkpoints for the input_infrastructure_container directory. Use the following commands to copy from the input_application_container directory to the input_infrastructure_container directory:
$ ns="openshift-logging"
$ cr="collector"
$ for node in $(oc get nodes -o name); do echo "### $node ###"; oc debug $node -- chroot /host /bin/bash -c "cp -Ra /var/lib/vector/$ns/$cr/input_application_container/ /var/lib/vector/$ns/$cr/input_infrastructure_container/"; done
- (Optional) The script automatically migrates checkpoints for the default application, infrastructure, and audit inputs. If the ClusterLogForwarder defines custom inputs (e.g., filtering specific namespaces into a named input), the script does not populate the checkpoint directories for them. Users with custom inputs must populate those directories manually before continuing with the next steps. The following example commands copy from the default input_application_container directory to a new input_<custom_input_name>_container directory:
$ ns="openshift-logging"
$ cr="collector"
$ for node in $(oc get nodes -o name); do echo "### $node ###"; oc debug $node -- chroot /host /bin/bash -c "cp -Ra /var/lib/vector/$ns/$cr/input_application_container/ /var/lib/vector/$ns/$cr/input_<custom_input_name>_container/"; done
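When several custom inputs are defined, the copy above can be repeated per input in a single pass over the nodes. The following is a minimal sketch, not part of the provided script: the function names build_copy_cmd and copy_custom_checkpoints are hypothetical, and it assumes that the checkpoint directory uses the input name exactly as written in the CR. Check the actual directory names the collector uses before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not part of the migration script.

# Build the node-side copy command for one custom input.
# Assumption: the directory is named input_<custom_input_name>_container.
build_copy_cmd() {
  local ns="$1" cr="$2" input="$3"
  local base="/var/lib/vector/${ns}/${cr}"
  printf 'cp -Ra %s/input_application_container/ %s/input_%s_container/' \
    "$base" "$base" "$input"
}

# Run the copy on every node for each custom input name passed as an argument.
copy_custom_checkpoints() {
  local ns="$1" cr="$2"
  shift 2
  local node input
  for node in $(oc get nodes -o name); do
    echo "### ${node} ###"
    for input in "$@"; do
      oc debug "$node" -- chroot /host /bin/bash -c \
        "$(build_copy_cmd "$ns" "$cr" "$input")"
    done
  done
}

# Example call (the input names are placeholders):
# copy_custom_checkpoints openshift-logging collector myapps otherapps
```

The command executed on each node is the same cp -Ra command shown above; only the loop over input names is added.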
Once the script has finished, continue with the next steps from the Red Hat Knowledge Article "How to transition the collectors and the default log store from Red Hat OpenShift Logging 5 to 6" to finish the migration.
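Before continuing, it may be worth confirming that the checkpoint directories exist on each node. The following is a minimal sketch under the same ns and cr conventions used above; the function name list_checkpoint_dirs is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: list the migrated checkpoint directories on every node.
# list_checkpoint_dirs is a hypothetical helper name.
list_checkpoint_dirs() {
  local ns="$1" cr="$2" node
  for node in $(oc get nodes -o name); do
    echo "### ${node} ###"
    # Expect to see the input_* directories created by the migration.
    oc debug "$node" -- chroot /host /bin/bash -c \
      "ls -1 /var/lib/vector/${ns}/${cr}/"
  done
}

# Example call:
# list_checkpoint_dirs openshift-logging collector
```

A node that lists no input_* directories was probably not migrated, and its collector would start reading logs from the beginning.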
Scenario 3. When migrating using Vector from RHOL v5.9 to v6
Follow the steps in the Red Hat Knowledge Article "How to transition the collectors and the default log store from Red Hat OpenShift Logging 5 to 6". In "Step 5: Delete the ClusterLogging instance and deploy the ClusterLogForwarder observability Custom Resource", at the point "Move the Vector checkpoints for the clusterLogging CR instance", execute:
$ ns="openshift-logging"
$ cr="collector"
$ for node in $(oc get nodes -o name); do oc debug $node -- chroot /host /bin/bash -c "mkdir -p /var/lib/vector/$ns/$cr" ; done
$ for node in $(oc get nodes -o name); do oc debug $node -- chroot /host /bin/bash -c "chmod -R 755 /var/lib/vector/$ns" ; done
$ for node in $(oc get nodes -o name); do echo "### $node ###"; oc debug $node -- chroot /host /bin/bash -c "cp -Ra /var/lib/vector/input* /var/lib/vector/$ns/$cr/"; done
Once the commands have finished, continue with the next steps from the Red Hat Knowledge Article "How to transition the collectors and the default log store from Red Hat OpenShift Logging 5 to 6" to finish the migration.
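The three loops above start three debug pods per node. As an alternative, the same commands can be chained so that each node is visited once and the copy is skipped if an earlier step fails. This is a minimal sketch using the same commands; the function name migrate_v59_checkpoints is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: run the mkdir, chmod, and cp steps in one debug session per node.
# migrate_v59_checkpoints is a hypothetical helper name.
migrate_v59_checkpoints() {
  local ns="$1" cr="$2"
  local dest="/var/lib/vector/${ns}/${cr}"
  local node
  for node in $(oc get nodes -o name); do
    echo "### ${node} ###"
    # The input* glob is expanded on the node, not locally, because the
    # whole command string is passed quoted to bash -c.
    oc debug "$node" -- chroot /host /bin/bash -c \
      "mkdir -p ${dest} && chmod -R 755 /var/lib/vector/${ns} && cp -Ra /var/lib/vector/input* ${dest}/"
  done
}

# Example call:
# migrate_v59_checkpoints openshift-logging collector
```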
Root Cause
Scenario 1. When migrating from Fluentd to Vector
The Fluentd position files, which record the log files that were open and the last position read in each one, are in a format that Vector cannot read.
Scenario 2. When migrating using Vector from RHOL v5.8 to v6
The Vector checkpoints path changed from the path /var/lib/vector/raw_* to /var/lib/vector/$ns/$cr/input_*.
Scenario 3. When migrating using Vector from RHOL v5.9 to v6
The Vector checkpoints path changed from the path /var/lib/vector/input_* to /var/lib/vector/$ns/$cr/input_*.
Note: ns=<namespace> and cr=<clusterLogForwarder Custom Resource>.
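To make the Scenario 3 path change concrete, the following sketch maps an old checkpoint directory to its new location by keeping the directory name and inserting the namespace and CR name. The function name new_checkpoint_path is hypothetical and the directory name in the example is illustrative. For Scenario 2 the directory names also change (raw_* to input_*), which the downloadable script handles, so this mapping does not apply there.

```shell
#!/usr/bin/env bash
# Sketch only: compute the v6 checkpoint path for a v5.9 checkpoint directory.
# new_checkpoint_path is a hypothetical helper name.
new_checkpoint_path() {
  local ns="$1" cr="$2" old="$3"
  # Keep the directory name, place it under /var/lib/vector/<ns>/<cr>/.
  printf '/var/lib/vector/%s/%s/%s\n' "$ns" "$cr" "$(basename "$old")"
}

new_checkpoint_path openshift-logging collector /var/lib/vector/input_application_container
# prints /var/lib/vector/openshift-logging/collector/input_application_container
```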