Resizing disks or changing the instance type on Azure IPI control plane nodes in RHOCP 4
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4
- Control Plane
- Azure IPI
Issue
- How to increase or reduce the disk size of control plane nodes on Azure IPI.
- Can the instance type of the control plane nodes in Azure IPI be changed?
Resolution
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
The following example walks through the process on a cluster with three control plane nodes and some worker nodes. It focuses on updating the control plane nodes.
Note: The recommended disk size for OpenShift control plane nodes in Azure is 1024 GB, as explained in Why is the minimum recommended size of disk for control plane nodes 1024 GB when installing OpenShift 4 on Azure? Smaller disks do not provide enough performance for etcd, especially in production environments.
IMPORTANT NOTE: This procedure relies on the controlplanemachineset Custom Resource, which was introduced in OpenShift 4.12. As explained in Missing controlplanemachineset resource in IPI RHOCP 4 cluster, the controlplanemachineset resource is not created by default in some combinations of OpenShift version and cloud provider (like OpenShift 4.12 and Azure). If there is no controlplanemachineset present in an OpenShift 4.12 cluster installed in Azure with IPI, refer to Creating controlplanemachineset in OpenShift 4.12 clusters in Azure for additional information about creating it.
Prerequisites
- This procedure is specific to Azure IPI installations.
- The oc client matches the OpenShift version currently running. It can be checked by running oc version; refer to the oc CLI documentation.
- Before proceeding, take a backup of etcd.
- Ensure there is an Active controlplanemachineset resource in the cluster, or create it as explained in Creating controlplanemachineset in OpenShift 4.12 clusters in Azure.
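The last prerequisite can be checked from a script. The following is a minimal sketch, assuming the controlplanemachineset is named cluster (as in the commands used in this article) and that its state is exposed in the .spec.state field; the function names cpms_state and require_active_cpms are hypothetical helpers, not part of oc.

```shell
#!/bin/sh
# Sketch only: cpms_state and require_active_cpms are hypothetical helper names.

# Print the state of the controlplanemachineset (assumed to be in .spec.state).
cpms_state() {
  oc get controlplanemachineset cluster -n openshift-machine-api \
    -o jsonpath='{.spec.state}'
}

# Return 0 only when the controlplanemachineset exists and is Active.
require_active_cpms() {
  state="$(cpms_state 2>/dev/null)" || state=""
  if [ "$state" = "Active" ]; then
    echo "controlplanemachineset is Active"
    return 0
  fi
  echo "controlplanemachineset state is '${state:-not found}', expected 'Active'" >&2
  return 1
}
```

Calling require_active_cpms before patching stops the procedure early when the resource is missing or Inactive. It does not replace the etcd backup or the oc version check.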
Modify the control plane machines
In IPI installations, changes to the control plane nodes are applied through the controlplanemachineset resource.
-
List the control plane nodes and machines:
$ oc get nodes -l node-role.kubernetes.io/master=
NAME       STATUS   ROLES                  AGE   VERSION
master-0   Ready    control-plane,master   14d   v1.27.6+f67aeb3
master-1   Ready    control-plane,master   14d   v1.27.6+f67aeb3
master-2   Ready    control-plane,master   14d   v1.27.6+f67aeb3
[...]
$ oc get machines -n openshift-machine-api -l machine.openshift.io/cluster-api-machine-role=master
NAME       PHASE     TYPE              REGION   ZONE   AGE
master-0   Running   Standard_D8s_v3   eastus   1      2d
master-1   Running   Standard_D8s_v3   eastus   2      2d
master-2   Running   Standard_D8s_v3   eastus   3      2d
-
In this example, the existing control plane machines have 256 GB disks:
$ oc get machines -l "machine.openshift.io/cluster-api-machine-role=master" -n openshift-machine-api -o yaml | grep "diskSize"
diskSizeGB: 256
diskSizeGB: 256
diskSizeGB: 256
-
Check the value currently set in the controlplanemachineset before patching it with the intended size:
$ oc get controlplanemachineset cluster -n openshift-machine-api -o yaml | grep "diskSizeGB:"
diskSizeGB: 1024
-
In this example, the disk size is set to 1024 GB:
$ oc patch controlplanemachineset cluster -n openshift-machine-api --type merge \
    -p '{"spec":{"template":{"machines_v1beta1_machine_openshift_io":{"spec":{"providerSpec":{"value":{"osDisk":{"diskSizeGB":1024}}}}}}}}'
controlplanemachineset.machine.openshift.io/cluster patched
It is also possible to change the vmSize to a different instance type (usually a more performant one for the control plane).
Note: Some instance types may require additional features; for example, the Dsv5 family requires "Accelerated Networking". Before using a new instance type, check its requirements and whether anything else needs to be configured in the controlplanemachineset. For the specific case of "Accelerated Networking", refer to Accelerated Networking for Microsoft Azure VMs.
-
After patching, the control plane nodes are updated through a rolling replacement, which takes some time:
$ watch -n 10 "oc get nodes -l node-role.kubernetes.io/master= && oc get co control-plane-machine-set"
Every 10s: oc get nodes -l node-role.kubernetes.io/master= && oc get co control-plane-machine-set

NAME             STATUS                     ROLES                  AGE    VERSION
master-2hpwl-2   Ready                      control-plane,master   11m    v1.27.6+f67aeb3
master-fxmsn-2   Ready,SchedulingDisabled   control-plane,master   103m   v1.27.6+f67aeb3
master-ph4s7-0   Ready                      control-plane,master   58m    v1.27.6+f67aeb3
master-rlch2-1   Ready                      control-plane,master   34m    v1.27.6+f67aeb3
NAME                        VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
control-plane-machine-set   4.12.31   True        True          False      14d     Waiting for 1 old replica(s) to be removed
NAME      DESIRED   CURRENT   READY   UPDATED   UNAVAILABLE   STATE    AGE
cluster   3         3         3       2                       Active   17d
-
If the vmSize field is changed to a different instance type, the change is visible with the following command:
$ oc get machines -n openshift-machine-api -l machine.openshift.io/cluster-api-machine-role=master
NAME             PHASE      TYPE               REGION   ZONE   AGE
master-2hpwl-2   Running    Standard_D16s_v3   eastus   3      19m
master-fxmsn-2   Deleting   Standard_D8s_v3    eastus   3      111m
master-ph4s7-0   Running    Standard_D16s_v3   eastus   1      66m
master-rlch2-1   Running    Standard_D16s_v3   eastus   2      43m
-
The following command can be used to verify the changes:
$ oc get machines -l "machine.openshift.io/cluster-api-machine-role=master" -n openshift-machine-api -o yaml | grep "diskSize\|vmSize"
diskSizeGB: 1024
vmSize: Standard_D16s_v3
diskSizeGB: 1024
vmSize: Standard_D16s_v3
diskSizeGB: 1024
vmSize: Standard_D16s_v3
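The steps above show the patch for diskSizeGB only and mention that vmSize can be changed as well. The following sketch builds a merge patch covering both fields and adds a small check for the replica counters printed by oc get controlplanemachineset. It assumes that vmSize sits next to osDisk under providerSpec.value, which matches the field names shown in the verification output; build_cpms_patch and rollout_done are hypothetical helper names, not oc features.

```shell
#!/bin/sh
# Sketch only: build_cpms_patch and rollout_done are hypothetical helper names.

# Print a merge patch for the controlplanemachineset.
# $1: new diskSizeGB (required), $2: new vmSize (optional).
build_cpms_patch() {
  disk_gb="$1"
  vm_size="${2:-}"
  case "$disk_gb" in
    ''|*[!0-9]*) echo "diskSizeGB must be a whole number" >&2; return 1 ;;
  esac
  value="\"osDisk\":{\"diskSizeGB\":${disk_gb}}"
  if [ -n "$vm_size" ]; then
    # Assumption: vmSize is a sibling of osDisk under providerSpec.value.
    value="${value},\"vmSize\":\"${vm_size}\""
  fi
  printf '{"spec":{"template":{"machines_v1beta1_machine_openshift_io":{"spec":{"providerSpec":{"value":{%s}}}}}}}\n' "$value"
}

# Return 0 when the DESIRED, CURRENT, READY and UPDATED counts all match.
# $1: desired, $2: current, $3: ready, $4: updated.
rollout_done() {
  [ "$1" -eq "$2" ] && [ "$1" -eq "$3" ] && [ "$1" -eq "$4" ]
}
```

The patch could then be applied with oc patch controlplanemachineset cluster -n openshift-machine-api --type merge -p "$(build_cpms_patch 1024 Standard_D16s_v3)". With the counters from the sample roll-out output (3 3 3 2), rollout_done returns a non-zero status because one replica is not yet updated.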
Important: This procedure provides guidance on altering disk sizes or instance type to increase or decrease capacity. Exercise caution and refer to Managing control plane machines with control plane machine sets for further information and responsible implementation.
Root Cause
Using the controlplanemachineset, which is available starting with OpenShift 4.12, it is possible to resize the disks of the control plane nodes in Azure without manually replacing each of them.
Note: The recommended disk size for OpenShift control plane nodes in Azure is 1024 GB, as explained in Why is the minimum recommended size of disk for control plane nodes 1024 GB when installing OpenShift 4 on Azure? Smaller disks do not provide enough performance for etcd, especially in production environments.
Diagnostic Steps
The following command shows the current disk size and instance type:
$ oc get machines -l "machine.openshift.io/cluster-api-machine-role=master" -n openshift-machine-api -o yaml | grep "diskSize\|vmSize"
diskSizeGB: 1024
vmSize: Standard_D16s_v3
diskSizeGB: 1024
vmSize: Standard_D16s_v3
diskSizeGB: 1024
vmSize: Standard_D16s_v3
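The grep output above does not show which machine each value belongs to. As a possible alternative, the sketch below asks oc for one row per machine with custom columns. The field paths are assumptions inferred from the patch used in this article (providerSpec.value.osDisk.diskSizeGB and providerSpec.value.vmSize), and list_control_plane_sizes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list_control_plane_sizes is a hypothetical helper name.

# Print name, instance type and OS disk size for each control plane machine.
list_control_plane_sizes() {
  oc get machines -n openshift-machine-api \
    -l machine.openshift.io/cluster-api-machine-role=master \
    -o custom-columns=NAME:.metadata.name,VMSIZE:.spec.providerSpec.value.vmSize,DISK_GB:.spec.providerSpec.value.osDisk.diskSizeGB
}
```

Running list_control_plane_sizes during a roll-out makes it easier to see which machines already carry the new values.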
To check the disk size directly on the nodes, use the following loop (note that the disk device name can differ between nodes):
$ for i in $(oc get nodes -l node-role.kubernetes.io/master= --no-headers | awk '{ print $1 }'); do
echo -e "\n\n\t\tNode: $i\t\t\n"
oc debug node/$i -- chroot /host lsblk -b
echo "---------------------------------------------"
done
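Because lsblk -b reports sizes in bytes, the output of the loop above can be hard to compare with diskSizeGB. The following sketch converts the byte counts to GiB. It assumes lsblk is invoked with -d -n -o NAME,SIZE,TYPE so that each line has exactly three columns; disk_sizes_gib is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: disk_sizes_gib is a hypothetical helper name.

# Read "NAME SIZE TYPE" lines (SIZE in bytes) on stdin and print the
# size in GiB for whole disks, skipping partitions and other device types.
disk_sizes_gib() {
  awk '$3 == "disk" { printf "%s %d GiB\n", $1, $2 / 1073741824 }'
}
```

For example, oc debug node/$i -- chroot /host lsblk -b -d -n -o NAME,SIZE,TYPE | disk_sizes_gib could replace the lsblk line inside the loop.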
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.