Why are OpenShift Container Storage nodes not discoverable in the wizard for a Local Storage deployment?
Environment
- OpenShift Container Storage (OCS) v4.6
- OpenShift Container Platform (OCP) v4.6
Issue
- OpenShift Container Storage nodes are not discoverable in the wizard for a Local Storage deployment.
- The diskmaker-discovery and diskmaker-manager daemonset pods from the Local Storage operator are not able to start on the storage nodes.
Resolution
- A fix for this issue is available in OCP 4.6.12 and above. If an older OCP version is in use and the issue is observed, follow the workaround below.
- The workaround is to remove the taint, complete the OCS cluster creation, and then add the toleration and the taint back afterwards. Follow the steps given below:
Step 1: Untaint OCS nodes
# oc adm taint nodes --all node.ocs.openshift.io/storage-
Step 2: Rediscover the nodes and complete the OCS cluster creation
* Follow the relevant deployment documentation to complete your cluster creation.
Step 3: Add required toleration
- Get the localvolumediscoveries
# oc get localvolumediscoveries.local.storage.openshift.io -n openshift-local-storage
Example output:
NAME                    AGE
auto-discover-devices   175m
- Edit the localvolumediscoveries spec to add the toleration
# oc edit localvolumediscoveries.local.storage.openshift.io auto-discover-devices -n openshift-local-storage
Add/update the toleration under the spec section:
-----------------------Snippet--------------------
  tolerations:
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
- Get the localvolumesets
# oc get localvolumesets.local.storage.openshift.io -n openshift-local-storage
Example output:
NAME         STORAGECLASS   PROVISIONED   AGE
localblock   localblock     8             175m
- Edit the localvolumesets spec to add the toleration
# oc edit localvolumesets.local.storage.openshift.io localblock -n openshift-local-storage
Add/update the toleration under the spec section:
-----------------------Snippet--------------------
  tolerations:
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
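The two interactive edits in Step 3 could also be applied non-interactively. The following is only a sketch: `add_ocs_toleration` is a hypothetical helper name, and it assumes the resource names from the example outputs (`auto-discover-devices`, `localblock`). Note that a JSON merge patch replaces the whole `spec.tolerations` list, so any tolerations already present on the resource would be overwritten.

```shell
# Sketch only: add the OCS storage toleration to a Local Storage custom
# resource with "oc patch" instead of "oc edit".
# add_ocs_toleration is a hypothetical helper, not part of any product.

NAMESPACE="openshift-local-storage"

# JSON merge patch carrying the same toleration as the snippet in Step 3.
# Caution: a merge patch REPLACES the entire spec.tolerations list.
TOLERATION_PATCH='{"spec":{"tolerations":[{"effect":"NoSchedule","key":"node.ocs.openshift.io/storage","operator":"Equal","value":"true"}]}}'

# $1 = fully qualified resource type, $2 = resource name
add_ocs_toleration() {
  oc patch "$1" "$2" -n "$NAMESPACE" --type merge -p "$TOLERATION_PATCH"
}
```

With the function defined in the current shell, it could be called once per resource, for example `add_ocs_toleration localvolumediscoveries.local.storage.openshift.io auto-discover-devices` and `add_ocs_toleration localvolumesets.local.storage.openshift.io localblock`.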
Step 4: Add back the taint to all the relevant OCS nodes
# oc adm taint node <node name> node.ocs.openshift.io/storage="true":NoSchedule
Note: All the OCS nodes should be tainted.
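Tainting each node by name in Step 4 can be error-prone on larger clusters. As a hedged alternative, the taint could be applied through a label selector. This sketch assumes the OCS nodes carry the label `cluster.ocs.openshift.io/openshift-storage`; verify that on your cluster before relying on it. `taint_ocs_nodes` is a hypothetical helper name.

```shell
# Sketch only: re-apply the OCS taint to every node selected by label.
# Assumption: OCS nodes are labelled cluster.ocs.openshift.io/openshift-storage.
# taint_ocs_nodes is a hypothetical helper, not part of any product.

OCS_NODE_LABEL="cluster.ocs.openshift.io/openshift-storage"

taint_ocs_nodes() {
  # -l selects nodes that have the label key, regardless of its value.
  # --overwrite keeps the command from failing on nodes already tainted.
  oc adm taint nodes -l "$OCS_NODE_LABEL" \
    node.ocs.openshift.io/storage=true:NoSchedule --overwrite
}
```

Running `oc get nodes -l cluster.ocs.openshift.io/openshift-storage` first shows which nodes the selector would match.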
Root Cause
- This is a known issue, listed in the known issues for OCS 4.6.
- Red Hat OpenShift Container Storage v4.6 for Local Storage based deployments can now be deployed using the user interface on OpenShift Container Platform v4.6. During the storage cluster creation, nodes are not discoverable if the Red Hat OpenShift Container Storage nodes have the taint node.ocs.openshift.io/storage="true":NoSchedule, because the localvolumeset and localvolumediscovery custom resources do not have the required toleration.
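To check whether a cluster is in the state described here, the node taints and the tolerations on the two custom resources can be listed side by side. This is a minimal sketch; the helper names `show_node_taints` and `show_cr_tolerations` are hypothetical, and the column layout is just one possible choice.

```shell
# Sketch only: compare taints on nodes with tolerations on the
# Local Storage custom resources. Helper names are hypothetical.

# List each node together with the keys of any taints it carries.
show_node_taints() {
  oc get nodes \
    -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# List the toleration keys set on localvolumediscoveries and localvolumesets.
show_cr_tolerations() {
  oc get localvolumediscoveries.local.storage.openshift.io,localvolumesets.local.storage.openshift.io \
    -n openshift-local-storage \
    -o 'custom-columns=KIND:.kind,NAME:.metadata.name,TOLERATIONS:.spec.tolerations[*].key'
}
```

If a node shows `node.ocs.openshift.io/storage` under TAINTS while the custom resources show no matching key under TOLERATIONS, the diskmaker pods would not be expected to schedule on that node.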