OpenShift Data Foundation - Disabling Multus "holder" pods
Due to the recurring maintenance impact of holder pods during upgrades (holder pods are present when Multus is enabled), holder pods are deprecated in the ODF v4.16 release and targeted for removal in ODF v4.18. This deprecation requires additional network configuration before the holder pods can be removed. Clusters running ODF v4.15 with Multus enabled are upgraded to v4.16 following the standard upgrade procedure. After the ODF cluster (with Multus enabled) is successfully upgraded to v4.16, administrators must complete the procedure documented here to disable and remove the holder pods. Be aware that this procedure is time consuming; however, it does not have to be completed immediately after upgrading to v4.16. Clusters may upgrade from v4.16 to v4.17 with holder pods still present if the migration needs more time, but the process must be completed before ODF is upgraded to v4.18.
How to determine if migration is needed
If there is any doubt about whether the steps in this guide need to be followed, list all ODF daemonsets with this command:
oc --namespace openshift-storage get daemonsets
If the returned list of daemonsets includes names that contain "plugin-holder", and the ODF version is 4.16 or 4.17, this guide must be followed. The steps in this guide must be completed before upgrading to ODF 4.18 to avoid an unnecessary storage service outage.
Below are some examples of "holder" daemonset names. Some ODF installation environments could result in slightly different names, so be sure to check carefully.
csi-rbdplugin-holder-ocs-storagecluster-cephcluster
csi-cephfsplugin-holder-ocs-storagecluster-cephcluster
csi-nfsplugin-holder-ocs-storagecluster-cephcluster
How to disable holder pods
Step 1
Read through the official ODF documentation Multus Prerequisites section. Use the prerequisites section to develop a plan for modifying host configurations as well as the public NetworkAttachmentDefinition.
Once the plan is developed, execute the plan by following the steps below.
Before proceeding, ensure that the upgrade from ODF v4.15 to v4.16 (or from v4.16 to v4.17) has fully completed and that the cluster is healthy. This is covered in the official ODF documentation.
Step 2
First, modify the public NetworkAttachmentDefinition as needed. In order for ODF pods to route to nodes, the Whereabouts IPAM configuration must include a routes section that includes the node public network address range. The routes configuration section must be added if it doesn't yet exist.
Be careful not to make modifications to the NetworkAttachmentDefinition that change IPAM’s CIDR range given to pods. The routes directive can be safely added.
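To illustrate the shape of this change: it amounts to adding a routes list to the Whereabouts ipam block while leaving the existing range untouched. A minimal sketch, assuming the example network values used later in this document (192.168.200.0/24 for pods, 192.168.220.0/24 for nodes):

```python
import json

# Existing Whereabouts IPAM config from the NetworkAttachmentDefinition
# (values are this document's example ranges).
ipam = {
    "type": "whereabouts",
    "range": "192.168.200.0/24",  # pod CIDR: must NOT be changed
}

# Add a route to the node public network range (assumed 192.168.220.0/24).
node_range = "192.168.220.0/24"
ipam.setdefault("routes", []).append({"dst": node_range})

print(json.dumps(ipam, indent=2))
```

The key point the sketch demonstrates: the range value is untouched; only a routes key is appended.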
For examples, see the ODF Documentation Multus Examples section and the Example at the end of this document.
Step 3
For OpenShift Container Platform, the recommended way to modify network configurations after OpenShift is installed is the NMState operator.
Create (or update) NodeNetworkConfigurationPolicy resources to connect hosts to the Multus public network and route connections destined for Multus public network pods through the network.
For examples, see the ODF Documentation Multus Examples section and the Example at the end of this document.
Step 4
Before moving forward, verify the network connections as described in the official ODF documentation: Verifying requirements have been met.
Step 5
After the NetworkAttachmentDefinition is modified, OSD pods and MDS pods (if present) must be restarted.
The safest strategy is to restart each OSD one-by-one.
- List all OSDs and MDSes. For example:

$ oc --namespace openshift-storage get pod --selector 'app in (rook-ceph-osd,rook-ceph-mds)'
NAME                                              READY   STATUS    RESTARTS   AGE
rook-ceph-mds-cephfilesystem-a-5947dfc5bf-kdnmc   1/1     Running   0          32h
rook-ceph-mds-cephfilesystem-b-fbdd47d6d-wk9zc    1/1     Running   0          32h
rook-ceph-osd-0-645bf764c-bsgrc                   2/2     Running   0          32h
rook-ceph-osd-1-779649994-whcxn                   2/2     Running   0          32h
rook-ceph-osd-2-68fc4d7b86-7flp9                  2/2     Running   0          32h

- For each pod in the list, follow the steps below to restart the pod.

a. Delete the pod. For example:

$ oc --namespace openshift-storage delete pod rook-ceph-osd-0-645bf764c-bsgrc
pod "rook-ceph-osd-0-645bf764c-bsgrc" deleted

b. Wait for the deleted pod's replacement to return to Running state. For example:

$ oc --namespace openshift-storage get pod --selector 'app in (rook-ceph-osd,rook-ceph-mds)'
NAME                                              READY   STATUS    RESTARTS   AGE
rook-ceph-mds-cephfilesystem-a-5947dfc5bf-kdnmc   1/1     Running   0          32h
rook-ceph-mds-cephfilesystem-b-fbdd47d6d-wk9zc    1/1     Running   0          32h
rook-ceph-osd-0-645bf764c-bsgrc                   2/2     Running   0          18s
rook-ceph-osd-1-779649994-whcxn                   2/2     Running   0          32h
rook-ceph-osd-2-68fc4d7b86-7flp9                  2/2     Running   0          32h

c. Repeat the above steps for each pod in the list.
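When scripting this check, a small helper can confirm that every listed pod is back to Running and fully ready before moving on to the next restart. A sketch that parses plain `oc get pod` output; the NAME/READY/STATUS column layout assumed here matches the listings above:

```python
def all_pods_ready(oc_get_pod_output: str) -> bool:
    """Return True if every listed pod is Running and fully ready.

    Expects plain `oc get pod` output with NAME, READY, and STATUS
    columns, as in the listings above (the header line is skipped).
    """
    lines = oc_get_pod_output.strip().splitlines()[1:]  # skip header
    for line in lines:
        name, ready, status = line.split()[:3]
        ready_now, ready_total = ready.split("/")
        if status != "Running" or ready_now != ready_total:
            return False
    return True

sample = """\
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-osd-0-645bf764c-bsgrc    2/2     Running   0          18s
rook-ceph-osd-1-779649994-whcxn    2/2     Running   0          32h
"""
print(all_pods_ready(sample))  # True
```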
Step 6
Once all OSDs and MDSes have been restarted, check that the new node and NetworkAttachmentDefinition configurations are compatible. To do so, verify that each node can ping OSD pods via the public network.
Use the toolbox to list OSD IPs.
In the example below, the OSD public network is 192.168.20.0/24, so only 192.168.20.x IPs are checked.
$ ceph osd dump | grep 'osd\.'
osd.0 up in weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval [0,0) [v2:192.168.20.19:6800/213587265,v1:192.168.20.19:6801/213587265] [v2:192.168.30.1:6800/213587265,v1:192.168.30.1:6801/213587265] exists,up 7ebbc19a-d45a-4b12-8fef-0f9423a59e78
osd.1 up in weight 1 up_from 24 up_thru 24 down_at 20 last_clean_interval [8,23) [v2:192.168.20.20:6800/3144257456,v1:192.168.20.20:6801/3144257456] [v2:192.168.30.2:6804/3145257456,v1:192.168.30.2:6805/3145257456] exists,up 146b27da-d605-4138-9748-65603ed0dfa5
osd.2 up in weight 1 up_from 21 up_thru 0 down_at 20 last_clean_interval [18,20) [v2:192.168.20.21:6800/1809748134,v1:192.168.20.21:6801/1809748134] [v2:192.168.30.3:6804/1810748134,v1:192.168.30.3:6805/1810748134] exists,up ff3d6592-634e-46fd-a0e4-4fe9fafc0386
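The OSD listing can be filtered down to just the public-network addresses programmatically. A sketch, assuming the public network is 192.168.20.0/24 as above; the regex parsing of `ceph osd dump` output is an illustration, not an official API:

```python
import ipaddress
import re

PUBLIC_NET = ipaddress.ip_network("192.168.20.0/24")

def public_osd_ips(osd_dump_output: str) -> set[str]:
    """Extract unique OSD IPs that fall inside the public network CIDR."""
    ips = set()
    # Match the host part of v1:/v2: addresses, e.g. "v2:192.168.20.19:6800/..."
    for match in re.finditer(r"v[12]:([0-9.]+):\d+", osd_dump_output):
        ip = match.group(1)
        if ipaddress.ip_address(ip) in PUBLIC_NET:
            ips.add(ip)
    return ips

sample = (
    "osd.0 up in weight 1 "
    "[v2:192.168.20.19:6800/213587265,v1:192.168.20.19:6801/213587265] "
    "[v2:192.168.30.1:6800/213587265,v1:192.168.30.1:6801/213587265] exists,up"
)
print(sorted(public_osd_ips(sample)))  # ['192.168.20.19']
```

Cluster-network addresses (192.168.30.x here) are excluded, leaving only the IPs that the ping checks below should target.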
Now check that each node (NODE) can reach OSDs over the public network:
$ oc debug node/NODE
# [truncated CLI output]
$ chroot /host
$ ping -c3 192.168.20.19
# [truncated, successful output]
$ ping -c3 192.168.20.20
# [truncated, successful output]
$ ping -c3 192.168.20.21
# [truncated, successful output]
Do this test for all non-master nodes. If any node does not get a successful ping to a running OSD, it is not safe to proceed. Diagnose and fix the issue, then return to Step 1. A problem here may arise for many reasons, including:
- the host may not be properly attached to the Multus public network (e.g., via macvlan)
- the host may not be properly configured to route to the pod IP range
- the public NetworkAttachmentDefinition may not be properly configured to route back to the host IP range
- the host may have a firewall rule blocking the connection in either direction
- the network switch may have a firewall or security rule blocking the connection
If all nodes get successful ping results to all OSDs, it is safe to proceed with holder pod removal.
Step 7
This begins the process of removing holder pods from the ODF cluster. Once this step is completed, it is recommended to complete the remainder of the migration process without unnecessary delays.
Use the following command to configure ODF to stop managing holder pods:
$ oc --namespace openshift-storage patch configmap rook-ceph-operator-config --type merge --patch '{"data": {"CSI_DISABLE_HOLDER_PODS": "true"}}'
Afterwards, csi-*plugin-* pods will restart, and csi-*plugin-holder-* pods will remain running.
Step 8
When CSI pods all return to Running state, check that CSI pods are using the correct host networking configuration using the example below as guidance.
$ oc -n openshift-storage get -o yaml daemonsets.apps csi-rbdplugin | grep -i hostnetwork
hostNetwork: true
$ oc -n openshift-storage get -o yaml daemonsets.apps csi-cephfsplugin | grep -i hostnetwork
hostNetwork: true
# perform below only if NFS is installed
$ oc -n openshift-storage get -o yaml daemonsets.apps csi-nfsplugin | grep -i hostnetwork
hostNetwork: true
Step 9
At this stage, PVCs for running applications are still using the holder pods. These PVCs must be migrated from the holder to the new network. Follow the below process to do so.
For each worker node in the Kubernetes cluster:
- Cordon and drain the node
- Wait for all pods to drain
- Delete all csi-*plugin-holder* pods on the node (a new holder pod will take its place)
- Uncordon the node
- Wait for the node to be rehydrated and stable
- Proceed to the next node
This procedure is the same as the one documented in this support article, which goes in greater detail.
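The per-node sequence above can be sketched as a command generator. This is a non-authoritative illustration: the holder pod names must be discovered per node beforehand, and the drain flags shown are the commonly used ones for this kind of maintenance:

```python
def node_migration_commands(node: str, holder_pods: list[str]) -> list[str]:
    """Generate the oc commands for migrating one node, in order.

    holder_pods: names of the csi-*plugin-holder* pods running on this
    node (discover them first, e.g. with `oc get pod -o wide`).
    """
    cmds = [
        f"oc adm cordon {node}",
        f"oc adm drain {node} --ignore-daemonsets --delete-emptydir-data",
    ]
    cmds += [
        f"oc --namespace openshift-storage delete pod {pod}"
        for pod in holder_pods
    ]
    cmds.append(f"oc adm uncordon {node}")
    return cmds

# Hypothetical node and pod name, for illustration only.
for cmd in node_migration_commands("compute-0", ["csi-rbdplugin-holder-xyz12"]):
    print(cmd)
```

Between the uncordon and the next node, still wait for the node to stabilize as described above; that waiting step is intentionally not automated here.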
Step 10
After this process is done for all Kubernetes nodes, it is safe to delete the csi-*plugin-holder* daemonsets.
List and delete all the holder daemonsets using the example below as guidance.
$ oc -n openshift-storage get daemonset -o name | grep plugin-holder
daemonset.apps/csi-cephfsplugin-holder-ocs-storagecluster-cephcluster
daemonset.apps/csi-rbdplugin-holder-ocs-storagecluster-cephcluster
$ oc -n openshift-storage delete daemonset.apps/csi-cephfsplugin-holder-ocs-storagecluster-cephcluster
daemonset.apps "csi-cephfsplugin-holder-ocs-storagecluster-cephcluster" deleted
$ oc -n openshift-storage delete daemonset.apps/csi-rbdplugin-holder-ocs-storagecluster-cephcluster
daemonset.apps "csi-rbdplugin-holder-ocs-storagecluster-cephcluster" deleted
Step 11
The migration is now complete! Congratulations, and well done!
Post-migration troubleshooting
After the migration is complete, if any user applications experience hangs or errors saving data, this may be an indication of one of two things:
- MDS pod(s) may not have successfully picked up the new routing configuration.
  a. To resolve this, begin by restarting the MDS pods.
  b. If the issue is still not resolved, run Step 9 again for the node on which the application pod is running.
- The node on which the application pod is running may not have been drained correctly.
  a. To resolve this, run Step 9 again for that node.
Example
For users who have installed ODF with a Multus public network using ODF’s suggested configurations, this example will show what resource updates to make during steps 1-3 of the migration process above.
Keep in mind that the interface and IP ranges need to be tailored for each organization’s deployment.
Interpreting the current configurations
A recommended Multus public NetworkAttachmentDefinition that was deployed in ODF 4.15 or earlier will look something like the following, which we will use as a basis for our example.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: odf-public-net
namespace: openshift-storage
spec:
config: '{
"cniVersion": "0.3.1",
"type": "macvlan",
"master": "eth0",
"mode": "bridge",
"ipam": {
"type": "whereabouts",
"range": "192.168.200.0/24"
}
}'
Analyzing this example NetworkAttachmentDefinition, these are the important details:
- The underlying network supporting the public network is attached to hosts at eth0
- macvlan is used to attach pods to eth0
- Pods get the IP range 192.168.200.0/24
- whereabouts is used to assign IPs to the Multus public network
For this example, we will assume that nodes are not attached to the Multus public network already.
For this example, we will assume that there are 3 compute nodes in the OpenShift cluster on which OpenShift Data Foundation also runs: compute-0, compute-1, and compute-2.
Planning modifications
In order to meet the new Multus Prerequisites, the NetworkAttachmentDefinition must be modified, and new NodeNetworkConfigurationPolicy configs must be added.
Before making those changes, first plan the execution. Nodes must have IP addresses on the Multus public network, and those addresses must not overlap with the IP addresses available to pods on the Multus public network. In this example, that means node IPs cannot fall within the range 192.168.200.0/24.
For this example, let’s assume that the IP range 192.168.220.0/24 is free and will be set aside for nodes to use on the Multus public network.
Whereabouts cannot be used to allocate IPs to nodes. For this example, we will also assume that a DHCP server is not available. We can use static IP management to allocate IPs to nodes on the Multus public network.
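A quick way to sanity-check the plan is to confirm that the node range and the pod range are disjoint, and to enumerate the static node IPs. A small sketch using Python's standard ipaddress module, with this example's values:

```python
import ipaddress

pod_range = ipaddress.ip_network("192.168.200.0/24")   # Whereabouts pod CIDR
node_range = ipaddress.ip_network("192.168.220.0/24")  # planned static node IPs

# The plan is only valid if the two ranges do not overlap.
print(pod_range.overlaps(node_range))  # False

# Static node IPs come from the node range, one per compute node.
node_ips = [str(ip) for ip in list(node_range.hosts())[:3]]
print(node_ips)  # ['192.168.220.1', '192.168.220.2', '192.168.220.3']
```

These three addresses are the ones assigned to compute-0, compute-1, and compute-2 in the NodeNetworkConfigurationPolicy examples below.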
Modifying the NetworkAttachmentDefinition
This will be used for Step 2 in the instruction document above.
Based on the network plan developed above, we can modify the NetworkAttachmentDefinition to route to nodes by adding the routes field with the config shown.
It is important that the existing configuration is otherwise unmodified, to ensure ODF does not experience any cluster disruption or downtime during the migration. Note that the "# added" annotations below are explanatory only and must not be copied into the actual resource, since JSON does not support comments.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: odf-public-net
namespace: openshift-storage
spec:
config: '{
"cniVersion": "0.3.1",
"type": "macvlan",
"master": "eth0",
"mode": "bridge",
"ipam": {
"type": "whereabouts",
"range": "192.168.200.0/24", # added comma to end of line
"routes": [ # added
{"dst": "192.168.220.0/24"} # added
] # added
}
}'
Creating NodeNetworkConfigurationPolicies
This will be used for Step 3 in the instruction document above.
This work largely follows the explanation and plan described in the ODF documentation example “Macvlan, Whereabouts, Node Static IPs”. The address and routes configurations must be modified to match this example, as shown below.
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
name: ceph-public-net-shim-compute-0
namespace: openshift-storage
spec:
nodeSelector:
node-role.kubernetes.io/worker: ""
kubernetes.io/hostname: compute-0
desiredState:
interfaces:
- name: odf-pub-shim
description: Shim interface used to connect host to OpenShift Data Foundation public Multus network
type: mac-vlan
state: up
mac-vlan:
base-iface: eth0
mode: bridge
promiscuous: true
ipv4:
enabled: true
dhcp: false
address: # must match plan (192.168.220.0/24 in this example)
- ip: 192.168.220.1 # STATIC IP FOR compute-0 in the planned range
prefix-length: 24 # planned mask
routes:
config:
- destination: 192.168.200.0/24
next-hop-interface: odf-pub-shim
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
name: ceph-public-net-shim-compute-1
namespace: openshift-storage
spec:
nodeSelector:
node-role.kubernetes.io/worker: ""
kubernetes.io/hostname: compute-1
desiredState:
interfaces:
- name: odf-pub-shim
description: Shim interface used to connect host to OpenShift Data Foundation public Multus network
type: mac-vlan
state: up
mac-vlan:
base-iface: eth0
mode: bridge
promiscuous: true
ipv4:
enabled: true
dhcp: false
address:
- ip: 192.168.220.2 # STATIC IP FOR compute-1 in the planned range
prefix-length: 24 # planned mask
routes:
config:
- destination: 192.168.200.0/24
next-hop-interface: odf-pub-shim
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
name: ceph-public-net-shim-compute-2
namespace: openshift-storage
spec:
nodeSelector:
node-role.kubernetes.io/worker: ""
kubernetes.io/hostname: compute-2
desiredState:
interfaces:
- name: odf-pub-shim
description: Shim interface used to connect host to OpenShift Data Foundation public Multus network
type: mac-vlan
state: up
mac-vlan:
base-iface: eth0
mode: bridge
promiscuous: true
ipv4:
enabled: true
dhcp: false
address:
- ip: 192.168.220.3 # STATIC IP FOR compute-2 in the planned range
prefix-length: 24 # planned mask
routes:
config:
- destination: 192.168.200.0/24
next-hop-interface: odf-pub-shim