Use OpenShift Data Foundation Disaster Recovery to Protect Virtual Machines
Overview
The choice between OpenShift Data Foundation Metro-DR and OpenShift Data Foundation Regional-DR depends on several factors, including the network latency between data centers and your Recovery Point Objective (RPO). Protected applications may be defined using GitOps methods or by using labels on the included Kubernetes resources. Regardless of these choices, there are some best practices to follow. Additionally, some recommendations apply specifically to GitOps applications, and some current limitations apply to Regional-DR.
About ACM Applications
Red Hat Advanced Cluster Management enables management of multiple OpenShift clusters including deployed applications and is required for both OpenShift Data Foundation Disaster Recovery solutions. There are two main categories of ACM applications: managed applications and discovered applications.
ACM Managed Applications
Managed Applications are defined using GitOps techniques. The ACM documentation describes two types of managed applications: Subscription and ApplicationSet. Both of these work by using an external Git repository containing YAML files which fully define the application.
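As an illustration of the ApplicationSet type, the following is a minimal sketch rather than a definitive configuration. It assumes the Argo CD based GitOps operator runs in the openshift-gitops namespace and that ACM exposes placement decisions through the acm-placement ConfigMap. The ApplicationSet name, the Placement name, the repository URL, and the path are all hypothetical placeholders.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: dr-vm-appset                  # hypothetical name
  namespace: openshift-gitops         # assumed GitOps operator namespace
spec:
  generators:
    - clusterDecisionResource:
        configMapRef: acm-placement   # assumed ConfigMap linking Argo CD to ACM placement decisions
        labelSelector:
          matchLabels:
            cluster.open-cluster-management.io/placement: dr-vm-placement   # hypothetical Placement
        requeueAfterSeconds: 180
  template:
    metadata:
      name: dr-vm-{{name}}            # one Argo CD Application per selected cluster
    spec:
      project: default
      source:
        repoURL: https://git.example.com/org/dr-vm.git   # placeholder Git repository
        targetRevision: main
        path: vm                      # hypothetical directory holding the YAML files
      destination:
        namespace: default
        server: "{{server}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

The Placement referenced by the label selector decides which managed cluster receives the application, which is what allows a DR policy to move it between clusters.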
When including a Virtual Machine in an ACM managed application, use a PersistentVolumeClaim paired with a VolumeImportSource to prepare a virtual machine disk. Volume Populators were recently introduced in OpenShift Virtualization and more information is available in the Containerized Data Importer (CDI) documentation.
Red Hat provides RHEL containerdisks that can be imported directly into OpenShift Virtualization. Search the software catalog to see which versions are available. It is recommended to use a specific version of the image rather than the latest tag in order to have consistent results. The KubeVirt community maintains containerdisks for other operating systems in a Quay repository.
Note that pullMethod: node is specified in the VolumeImportSource CR in order to take advantage of the OpenShift pull secret which is required to pull container images from the Red Hat registry.
In order to apply a DR policy to an application, its resources should share a common label. In the following example, the VirtualMachine, PersistentVolumeClaim, and VolumeImportSource all carry the label drapp: dr-vm. If the Virtual Machine depends on other resources (such as DataVolumes, Services, Routes, ConfigMaps, or Secrets), make sure that those resources are also labeled.
The following example shows how to define a virtual machine that follows these recommendations:
vm.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: dr-vm
  namespace: default
  labels:
    drapp: dr-vm
spec:
  instancetype:
    name: u1.medium
  preference:
    name: rhel.9
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
            - disk:
                bus: virtio
              name: dr-vm-disk
            - disk:
                bus: virtio
              name: cloudinitdisk
      volumes:
        - persistentVolumeClaim:
            claimName: dr-vm-disk
          name: dr-vm-disk
        - cloudInitConfigDrive:
            userData: |
              #cloud-config
              user: cloud-user
              password: 0qmg-iy6s-w1dw
              chpasswd:
                expire: false
          name: cloudinitdisk
pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dr-vm-disk
  namespace: default
  labels:
    drapp: dr-vm
spec:
  dataSourceRef:
    apiGroup: cdi.kubevirt.io
    kind: VolumeImportSource
    name: rhel9-source
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  storageClassName: ocs-external-storagecluster-ceph-rbd
  volumeMode: Block
source.yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: VolumeImportSource
metadata:
  name: rhel9-source
  labels:
    drapp: dr-vm
spec:
  source:
    registry:
      url: "docker://registry.redhat.io/rhel9/rhel-guest-image:9.4-1175"
      pullMethod: node
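One way to keep the common label consistent across a GitOps application is to let Kustomize apply it instead of repeating it in every file. This is a suggested alternative, not something the procedure above requires. The sketch below assumes the three files shown above sit in the same directory as a kustomization.yaml and that your GitOps tooling renders Kustomize.

```yaml
# kustomization.yaml (hypothetical file placed next to vm.yaml, pvc.yaml, and source.yaml)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - vm.yaml
  - pvc.yaml
  - source.yaml
labels:
  - pairs:
      drapp: dr-vm            # common label used when applying the DR policy
    includeSelectors: false   # add the label to metadata only, leave selectors untouched
```

Any additional resource listed under resources, such as a Service or ConfigMap, would receive the same label when the manifests are rendered.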
ACM Discovered Applications
You can also protect Virtual Machines that were not created using GitOps (e.g., VMs imported using Red Hat Migration Toolkit for Virtualization or VMs created using the OpenShift Console). In this case, labels are used to identify the resources belonging to the application. After the Virtual Machine has been created, apply a common label to the following resources associated with the VM: VirtualMachine, DataVolume, PersistentVolumeClaim, Service, Route, Secret, ConfigMap. Do not label VirtualMachineInstances or Pods, since these are created and managed automatically by OpenShift Virtualization.
The following example shows a Virtual Machine that is discoverable as an application using the label drapp: dr-vm:
vm.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-rhel-datavolume-block
  namespace: acm-discovered-vm
  labels:
    kubevirt.io/vm: vm-rhel-datavolume
    drapp: dr-vm
spec:
  dataVolumeTemplates:
    - metadata:
        labels:
          drapp: dr-vm
        name: rhel-dv
      spec:
        source:
          registry:
            url: 'docker://registry.redhat.io/rhel9/rhel-guest-image:9.4-1175'
        storage:
          resources:
            requests:
              storage: 20Gi
  running: false
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-rhel-datavolume
    spec:
      architecture: amd64
      domain:
        devices:
          disks:
            - disk:
                bus: virtio
              name: datavolumedisk1
        machine:
          type: pc-q35-rhel9.4.0
        resources:
          requests:
            memory: 4Gi
      terminationGracePeriodSeconds: 0
      volumes:
        - dataVolume:
            name: rhel-dv
          name: datavolumedisk1
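Resources that the Virtual Machine depends on need the same label to be discovered with it. As a minimal sketch, a Service exposing SSH on the VM above could be labeled as follows. The Service name and the choice of port 22 are assumptions made for illustration only.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: vm-rhel-ssh                      # hypothetical name
  namespace: acm-discovered-vm
  labels:
    drapp: dr-vm                         # same label that identifies the discovered application
spec:
  selector:
    kubevirt.io/vm: vm-rhel-datavolume   # matches spec.template.metadata.labels of the VM above
  ports:
    - name: ssh
      protocol: TCP
      port: 22
      targetPort: 22
```

The label under metadata marks the Service as part of the application, while the selector is unrelated to DR and only routes traffic to the VM's pod.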
Since ACM does not provision discovered applications, an extra step is required to complete failover and relocate actions. As described in more detail in the OpenShift Data Foundation documentation, the VirtualMachine and its related resources must be manually cleaned up from the evacuated cluster. The DRPlacementControl (DRPC) object will have a status of WaitOnUserToCleanUp when this step is required. Remove the VirtualMachine and all of its accompanying objects that carry the common label used to discover the application, including PersistentVolumeClaims. When the DRPC is waiting for cleanup, it is safe to delete these resources because the application has already been replicated to the target cluster.
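For orientation, the excerpt below sketches how a DRPC waiting for cleanup might look when viewed as YAML. It is an illustrative fragment, not output captured from a cluster. The object name and namespace are hypothetical, and reporting the value under status.progression is an assumption to verify against your ODF version.

```yaml
# Illustrative excerpt of a DRPlacementControl that is waiting for manual cleanup
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPlacementControl
metadata:
  name: dr-vm-drpc                    # hypothetical name
  namespace: openshift-dr-ops         # assumed namespace for discovered applications
status:
  progression: WaitOnUserToCleanUp    # assumed field; signals that labeled resources must be removed
```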
Regional DR Limitations
Due to differences in how volume replication works, Regional DR currently has one important limitation: volumes created from a CSI Snapshot or a CSI Clone cannot be part of a protected application. These volumes depend on a parent volume, which is not replicated. To work around this limitation, follow the recommendation above and create disks by importing data rather than cloning. This limitation will be removed in a future release.
Alternatively, if you need to clone from a PV, you can use a DRPolicy that flattens the PVs before replication.
Regional DR uses RBD snapshots to replicate data. VolumeGroupReplication for RBD-backed VMs has been implemented starting with ODF 4.19. With this functionality it is now possible to consistently DR-protect multi-volume VMs.
The snapshot and clone limitation does not apply to Metro DR.
Conclusion
Enabling disaster recovery for your OpenShift application takes some advance planning. With just a few additional considerations, you can also protect virtual machines when using OpenShift Data Foundation Disaster Recovery.
* Multi-disk VMs are not supported
† Cloned disks are not supported