vSphere Problem Detector Operator fails with permissions checks in OpenShift 4

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform
    • 4
  • VMware vSphere
  • vSphere Problem Detector Operator

Issue

  • Health check warnings are frequently raised for CheckAccountPermissions, CheckComputeClusterPermissions, CheckFolderPermissions and CheckDefaultDatastore.

  • The logs of the vsphere-problem-detector-operator pod in the openshift-cluster-storage-operator namespace show messages like the following:

    CheckAccountPermissions failed: missing privileges for vcenter: Cns.Searchable, Sessions.ValidateSession, StorageProfile.Update, StorageProfile.View
    failed to run checks: missing privileges for vcenter: Cns.Searchable, Sessions.ValidateSession, StorageProfile.Update, StorageProfile.View
    
    error getting datastore DXX: failed to access datastore DXX: datastore 'DXX' not found
    CheckDefaultDatastore failed: defaultDatastore "DXX" in vSphere configuration: failed to access datastore DXX: datastore 'DXX' not found
    CheckFolderPermissions failed: failed to access DXXXX: failed to access DXXXX: datastore 'DXX' not found
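
To turn log lines like the ones above into a plain list of privileges, a small filter can help. The following is a minimal sketch, not part of the operator or of oc; the function name missing_privileges is hypothetical, and it assumes the log lines keep the "missing privileges for vcenter:" wording shown above.

```shell
# Print each privilege named in "missing privileges for vcenter:" log lines,
# one per line, de-duplicated. Reads operator log text on standard input.
missing_privileges() {
  grep -o 'missing privileges for vcenter: .*' \
    | sed 's/^missing privileges for vcenter: //' \
    | tr ',' '\n' \
    | sed 's/^ *//; s/ *$//' \
    | LC_ALL=C sort -u
}
```

It could be fed from the operator logs, for example: oc logs -n openshift-cluster-storage-operator [vsphere-problem-detector-operator-pod_name] | missing_privileges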
    

Resolution

Review the required permissions for the vCenter account, and grant any that are missing.

If the health check warnings persist after the proper permissions have been added to the vCenter account, restart the vSphere Problem Detector Operator.
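
One possible way to restart the operator is to delete its pod so that it is recreated and the checks run again. This is a sketch under assumptions: the helper name restart_problem_detector is hypothetical, and it assumes the pod is managed by a controller that recreates it after deletion. If a documented restart procedure is available for your version, prefer that.

```shell
# Delete the operator pod so that its controller recreates it and the
# health checks run again (assumes the pod is controller-managed).
# $1: pod name as shown by "oc get pods -n openshift-cluster-storage-operator"
restart_problem_detector() {
  pod_name="${1:?usage: restart_problem_detector <pod-name>}"
  oc delete pod -n openshift-cluster-storage-operator "$pod_name"
}
```

After the new pod is running, check its logs again to confirm that the permission checks pass.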

Root Cause

The alerts and warnings are triggered when the vCenter account lacks the required permissions on the datastore, subfolders or other resources.

Diagnostic Steps

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

  • Check the logs of the vsphere-problem-detector-operator pod in the openshift-cluster-storage-operator namespace:

    $ oc get pods -n openshift-cluster-storage-operator
    [...]
    $ oc logs -n openshift-cluster-storage-operator [vsphere-problem-detector-operator-pod_name]
    
  • Check the events in the openshift-cluster-storage-operator namespace (for OpenShift 4.13 and older, use the oc get events -n openshift-cluster-storage-operator --sort-by="lastTimestamp" command instead):

    $ oc events -n openshift-cluster-storage-operator
    
  • Verify that the datastore and subfolder mentioned in the logs exist in VMware vSphere.

  • Confirm the credentials currently used to connect to vCenter. To get the user:

        $ oc get cm cloud-provider-config -n openshift-config -o yaml  | grep -v ^$
        apiVersion: v1
        data:
          config: '[Global]
            secret-name      = vsphere-creds
            secret-namespace = kube-system
            insecure-flag    = 1
            [Workspace]
            server            = <IP>
            datacenter        = <DC>
            default-datastore = <Default DataStore>
            folder            = <Folder>
            [VirtualCenter "<Name>"]
            datacenters = <DC>
            '
        kind: ConfigMap
    
        $ oc extract secret/vsphere-creds -n kube-system
    
  • Confirm the user's role by using the govc command line tool (note that govc is provided by VMware; for any issues using govc, contact VMware for support):

    $ govc permissions.ls
    Role                     Entity  Principal                          Propagate
    VirtualMachinePowerUser  /       <DOMAIN>\user1                     Yes
    VirtualMachinePowerUser  /       <DOMAIN>\user2                     Yes
    ReadOnly                 /       VSPHERE.LOCAL\appuser1             Yes
    Admin                    /       VSPHERE.LOCAL\devopsadmin          Yes
    OCPAdmin                 /       VSPHERE.LOCAL\ocpadmin             Yes
    VirtualMachinePowerUser  /       <DOMAIN>\BSSOPER                   Yes
    Admin                    /       VSPHERE.LOCAL\vpxd-extension-UUID  Yes
    Admin                    /       VSPHERE.LOCAL\vpxd-UUID            Yes
    
  • In this example, the cluster uses the ocpadmin user, whose role is OCPAdmin. Confirm the privileges granted to the OCPAdmin role:

    $ govc role.ls OCPAdmin
    Datastore.FileManagement
    System.Anonymous
    System.Read
    System.View
    VirtualMachine.Config.AddExistingDisk
    VirtualMachine.Config.AddNewDisk
    VirtualMachine.Config.AddRemoveDevice
    VirtualMachine.Config.RemoveDisk
    
  • Compare the permissions above with the vCenter requirements and vSphere Problem Detector Operator docs linked in the "Resolution" section.
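
The comparison in the last step can be scripted. The following is a minimal sketch, assuming two plain text files with one privilege name per line: a list of required privileges that you compile yourself from the documentation, and the saved output of govc role.ls for the role. The function name privileges_not_granted and the file names are hypothetical.

```shell
# Print every privilege listed in the "required" file that is absent from
# the "granted" file. Both files hold one privilege name per line.
# $1: required privileges (hypothetical file compiled from the documentation)
# $2: granted privileges (for example, saved output of "govc role.ls OCPAdmin")
# Note: grep exits with status 1 when nothing is missing (no lines printed).
privileges_not_granted() {
  required_file="$1"
  granted_file="$2"
  # -v: invert match, -x: whole-line match, -F: fixed strings, -f: patterns from file
  grep -vxFf "$granted_file" "$required_file"
}
```

For example, after running govc role.ls OCPAdmin > granted.txt, calling privileges_not_granted required.txt granted.txt prints the privileges that still need to be added to the role, in the order they appear in required.txt.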


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.