Troubleshooting Guide: SAP Edge Integration Cell on OpenShift

Issue Documentation Template

When adding new issues to this guide, use this template:

### Issue: [Brief Description]

**Symptom**: What the user sees (error messages, unexpected behavior)
**Environment**: OpenShift version, Service Mesh version, specific conditions
**Root Cause**: Why this happens (if known)
**Solution**:
1. Step-by-step fix
2. Commands to run
3. Expected outputs

**Prevention**: How to avoid this in the future
**Validation**: How to confirm the fix worked
**Related Issues**: Links to similar problems
**Date Added**: YYYY-MM-DD
**Reporter**: [Name/Team]

ELM Deployment Issues

Issue: Image Replication Fails with Manifest Format Error - Quay Registry Compatibility

Symptom:

Action [Create Image Replications (...)] failed: Failed to create image replication - status [ERROR]
error message: cannot copy image <image>: Uploading manifest failed, attempted the following formats:
- application/vnd.oci.image.manifest.v1+json (manifest invalid)
- application/vnd.docker.distribution.manifest.v2+json (Unknown media type during manifest conversion: "application/vnd.docker.image.rootfs.diff.tar.gzip")
- application/vnd.docker.distribution.manifest.v1+prettyjws (Unknown media type during manifest conversion: "application/vnd.docker.image.rootfs.diff.tar.gzip")
- application/vnd.oci.image.index.v1+json (Unsupported conversion type)
- application/vnd.docker.distribution.manifest.list.v2+json (Unsupported conversion type)

Environment:

  • EIC version 8.36.2
  • OpenShift 4.20
  • Local Quay container registry (version < 3.12)
  • Source registry:

Root Cause: Quay registry versions below 3.12 cannot handle certain manifest formats used by the source SAP container images, causing manifest conversion failures during image replication.

Solution:

  1. Check your current Quay version:

       # If using Quay Operator
       oc get quayregistry -n quay-enterprise -o yaml | grep "desiredVersion\|currentVersion"
    
       # Or check Quay pod logs for version info
       oc logs -n quay-enterprise deployment/quay-registry-quay-app | grep -i version
    
  2. Upgrade Quay to version 3.12 or higher:

    For Quay Operator-managed deployments, the Quay version follows the Operator version, so move the Operator subscription to a newer channel:

    # The subscription name may differ; list subscriptions with:
    #   oc get subscription -n openshift-operators
    oc patch subscription quay-operator -n openshift-operators --type merge -p '{"spec":{"channel":"stable-3.12"}}'
    

    For standalone Quay deployments:

    • Follow Red Hat Quay upgrade documentation
    • Ensure backup of registry data before upgrade
    • Plan for potential downtime during upgrade
  3. Verify the upgrade:

       # Check that all Quay pods are running with new version
       oc get pods -n quay-enterprise
    
       # Verify Quay UI shows correct version
       # Access Quay web interface and check version in footer
    
  4. Retry the ELM deployment:

    • Once Quay is upgraded to 3.12+, retry the failed deployment step
    • The image replication should now succeed

Alternative Workarounds (if an immediate upgrade is not possible):

  1. Manual image copy with skopeo:

    # Install skopeo if not available
    # Use skopeo to copy images with format conversion
    skopeo copy --format v2s2 \
      docker://dockersrv.cdn.repositories.cloud.sap/com.sap.it.img/edge-ssb-operator:1.10.0 \
      docker://your-quay-registry.com/namespace/edge-ssb-operator:1.10.0
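
    After the copy completes, you can confirm what manifest format the target registry actually stored. A minimal sketch (the image path is the placeholder from the command above; adjust for your registry, and note the guard so the snippet is a no-op where skopeo is absent):

    ```shell
    #!/bin/bash
    # Hypothetical target image path; substitute your registry, namespace, and tag.
    IMAGE="docker://your-quay-registry.com/namespace/edge-ssb-operator:1.10.0"

    # Fetch the raw manifest and print its mediaType; a Docker schema2 type
    # (application/vnd.docker.distribution.manifest.v2+json) confirms the
    # --format v2s2 conversion took effect.
    if command -v skopeo >/dev/null 2>&1; then
      skopeo inspect --raw "$IMAGE" 2>/dev/null | grep -o '"mediaType": *"[^"]*"' | head -1 || true
    else
      echo "skopeo not installed; skipping manifest check"
    fi
    ```
    
    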
    

Prevention:

  • Always verify container registry compatibility before ELM deployment
  • Maintain Quay registry at version 3.12 or higher
  • Test image replication in non-production environment first

Validation:

# Verify successful image replication
oc get imagereplication -n edgelm

# Check that images are available in target registry
# Through Quay UI or API: GET /api/v1/repository/{namespace}/{repository}/tag/
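
Building on the API endpoint noted above, a sketch of checking tag availability from the command line (host, organization, and repository names are placeholders; an OAuth token is assumed in QUAY_TOKEN for the live call):

```shell
#!/bin/bash
# Hypothetical registry coordinates; substitute your own values.
QUAY_HOST="your-quay-registry.com"
ORG="namespace"
REPO="edge-ssb-operator"

# Tag-listing endpoint: GET /api/v1/repository/{namespace}/{repository}/tag/
TAG_URL="https://${QUAY_HOST}/api/v1/repository/${ORG}/${REPO}/tag/"
echo "$TAG_URL"

# Uncomment to query a live registry with a bearer token:
# curl -s -H "Authorization: Bearer ${QUAY_TOKEN}" "$TAG_URL"
```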

Related Issues:

  • Any OCI/Docker manifest compatibility issues with container registries
  • Registry upgrade planning and procedures

Date Added: 2025-10-29

Resolution Notes: This issue was initially suspected to be a problem with the source image build process, but investigation confirmed it was purely a registry compatibility issue. Upgrading Quay to version 3.12+ completely resolved the problem.

Issue: Solace Pod Fails to Start with NetApp NFS Storage Backend

Symptom:

# Solace pod stuck in pending or error state
oc get pods -n edge-icell-services | grep solace
solace-message-broker-xxx   0/1     Error    0          5m

# Pod events show storage-related errors
oc describe pod solace-message-broker-xxx -n edge-icell-services
Events:
  Warning  FailedMount  persistentvolume-controller  MountVolume.SetUp failed for volume "pvc-xxx" : mount failed: exit status 1
  Warning  FailedMount  kubelet  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data token]: timed out waiting for the condition

Environment:

  • SAP Edge Integration Cell deployment
  • NetApp NFS as storage backend for PVCs
  • OpenShift cluster with NFS-based StorageClass configured as default

Root Cause: The Solace message broker component within Edge Integration Cell does not support NFS storage due to performance and file locking requirements. NFS lacks the I/O characteristics and file system semantics required for Solace's persistent storage operations.

Solution:

  1. Configure alternative storage backend (Recommended):

    Option A: Use iSCSI with NetApp ONTAP:

    # Install the NetApp Trident operator, then define an iSCSI backend.
    # TridentBackendConfig takes credentials from a referenced Secret,
    # not inline username/password fields.
    apiVersion: trident.netapp.io/v1
    kind: TridentBackendConfig
    metadata:
      name: backend-ontap-san
      namespace: trident
    spec:
      version: 1
      storageDriverName: ontap-san
      managementLIF: "192.168.1.100"
      svm: "svm_iscsi"
      sanType: "iscsi"
      credentials:
        name: backend-ontap-san-secret
    

    Option B: Use NVMe with NetApp ONTAP (if supported):

    # Configure an NVMe/TCP backend
    # Ensure OpenShift nodes have NVMe tools (nvme-cli) installed
    apiVersion: trident.netapp.io/v1
    kind: TridentBackendConfig
    metadata:
      name: backend-ontap-nvme
      namespace: trident
    spec:
      version: 1
      storageDriverName: ontap-san
      managementLIF: "192.168.1.100"
      svm: "svm_nvme"
      sanType: "nvme"
      credentials:
        name: backend-ontap-nvme-secret
    
  2. Create appropriate StorageClass:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: netapp-block-storage
      annotations:
        storageclass.kubernetes.io/is-default-class: "false"
    provisioner: csi.trident.netapp.io
    parameters:
      backendType: "ontap-san"
      fsType: "ext4"
    allowVolumeExpansion: true
    volumeBindingMode: Immediate
    
  3. Update Edge Integration Cell configuration:

    # Specify the block storage class in EIC deployment configuration
    # This may require updating the EIC configuration in ELM UI
    # or modifying the deployment manifests to use the correct StorageClass
    
  4. Verify storage configuration:

       # Check available storage classes
       oc get storageclass
    
       # Verify the block storage class is available
       oc describe storageclass netapp-block-storage
    
       # Check Trident backend status
       oc get tridentbackendconfig -n trident
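
    Before retrying the EIC deployment, the new StorageClass can be sanity-checked with a throwaway PVC. A minimal sketch (the claim name and namespace are placeholders):

    ```yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: storage-smoke-test
      namespace: default
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: netapp-block-storage
      resources:
        requests:
          storage: 1Gi
    ```

    Apply it with `oc apply -f pvc.yaml`, confirm it reaches `Bound` via `oc get pvc storage-smoke-test -n default`, then delete it.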
    

Alternative Workarounds:

  1. Use local storage (for testing/development only):

    # Local storage class for non-production environments only.
    # kubernetes.io/no-provisioner does not provision volumes dynamically,
    # so matching local PersistentVolumes must be created by hand.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: local-storage
    provisioner: kubernetes.io/no-provisioner
    volumeBindingMode: WaitForFirstConsumer
    
  2. Use other supported storage providers:

    • Red Hat OpenShift Data Foundation (ODF)
    • VMware vSphere CSI
    • Amazon EBS (for AWS deployments)
    • Azure Disk (for Azure deployments)

Prevention:

  • Always use block storage for Edge Integration Cell deployments
  • Review storage requirements in SAP Edge Integration Cell documentation before deployment
  • Test storage performance with Solace requirements in non-production environment
  • Follow Red Hat's storage recommendations for OpenShift workloads

Validation:

# Verify Solace pods are running successfully
oc get pods -n edge-icell-services | grep solace

# Check PVC is bound to block storage
oc get pvc -n edge-icell-services
oc describe pvc <solace-pvc-name> -n edge-icell-services | grep -A 5 "StorageClass"

# Verify storage backend type
oc get pv <pv-name> -o yaml | grep -A 10 "csi:"

Related Issues:

  • Performance issues with database workloads on NFS storage
  • File locking problems with message broker components
  • Storage class configuration and selection

Date Added: 2025-10-29
Reporter: SAP Edge Integration Team

Resolution Notes: Using NFS storage is NOT recommended for configuring the Message Service Storage Class when deploying SAP Edge Integration Cell. The Solace component within Edge Integration Cell specifically advises against NFS usage due to performance and file system requirements. Block storage backends such as iSCSI or NVMe should be used instead.

Issue: "Permission Denied" on ODF CephFS RWX Shared Volumes

Symptom: SAP EIC pods (edge-api, worker, edc, etc.) sharing a CephFS RWX volume receive "Permission Denied" errors when accessing /mnt/diagnostics/ and /mnt/dumps/. This causes the Diagnostic Task feature to fail.

Environment: OCP 4.x with ODF CephFS, pods using privileged SCC with seLinuxContext: RunAsAny

Root Cause: SELinux MCS label conflicts when multiple pods share a CephFS RWX volume. The CephFS CSI driver relabels the volume with each pod's MCS label during mount, causing access conflicts.

Solution: See Red Hat KB 7137220 for detailed resolution steps including:

  • Creating a StorageClass with kernelMountOptions (new deployments)
  • Patching existing PersistentVolumes (existing deployments)
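
A sketch of the StorageClass approach for new deployments (all parameter values are illustrative; confirm the exact cluster IDs, secret names, and SELinux context string against KB 7137220 and your ODF installation):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ocs-storagecluster-cephfs-selinux
provisioner: openshift-storage.cephfs.csi.ceph.com
parameters:
  clusterID: openshift-storage
  fsName: ocs-storagecluster-cephfilesystem
  # Mount every pod with the same SELinux context so the CSI driver
  # does not relabel the shared RWX volume per pod:
  kernelMountOptions: 'context="system_u:object_r:container_file_t:s0"'
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
reclaimPolicy: Delete
allowVolumeExpansion: true
```

PVCs for the shared diagnostics volumes would then reference this class so all pods mount with a single shared MCS label.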

Date Added: 2026-01-29


General Troubleshooting Procedures

Debug Information Collection

When encountering issues, collect this information:

# Cluster information
oc version
oc get nodes
oc get clusterversion

# Operator status
oc get csv -n openshift-operators

# Service Mesh 3.x status (OSSM3)
# OSSM3 uses Istio custom resources instead of SMCP/SMMR (OSSM2).
oc get istio -n istio-system
oc describe istio/default -n istio-system
oc get pods -n istio-system
oc get pods -n istio-cni 2>/dev/null || true

# Application namespaces
oc get namespaces | grep -E "(edgelm|edge-icell)"
oc get pods --all-namespaces | grep -E "(edgelm|edge-icell)"

# RBAC status
oc get clusterroles | grep edgelm
oc get clusterrolebindings | grep edgelm
oc get serviceaccounts -n edgelm

# Recent events (sorted oldest to newest; tail shows the most recent)
oc get events --sort-by='.lastTimestamp' | tail -20

Log Collection

# Service Mesh operator logs (OSSM3)
# Operator pod/deployment names differ by installation. Find the pod first, then fetch logs:
oc get pods -n openshift-operators | grep -i "servicemesh\\|istio"
oc logs -n openshift-operators <operator-pod-name> --tail=100

# Control plane logs
oc get pods -n istio-system | grep -i istiod
oc logs -n istio-system <istiod-pod-name> --tail=100

# Application logs (if pods exist)
oc logs -n edgelm --selector app=edgelm --tail=50
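
For support cases, a must-gather bundle collects cluster-wide diagnostics in one step. A sketch (the destination directory is a placeholder, and the guard makes the snippet a no-op where oc is unavailable):

```shell
#!/bin/bash
# Destination for the diagnostic bundle; adjust as needed.
DEST="./must-gather-eic"

# oc adm must-gather captures node, operator, and control-plane state
# into the destination directory for attachment to a support case.
if command -v oc >/dev/null 2>&1; then
  oc adm must-gather --dest-dir="$DEST" || true
else
  echo "oc not installed; run this from a workstation with cluster access"
fi
```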

Validation and Testing Procedures

Complete Setup Validation

Run this comprehensive check after completing the setup:

#!/bin/bash
echo "=== Namespace Validation ==="
oc get namespaces | grep -E "(edgelm|edge-icell|istio-)"

echo "=== Service Mesh Validation ==="
oc get istio -n istio-system
oc get pods -n istio-system --no-headers | wc -l

echo "=== RBAC Validation ==="
oc get clusterroles | grep edgelm | wc -l
oc get serviceaccount edgelm -n edgelm

echo "=== Authentication Validation ==="
oc --kubeconfig=edgelm-kubeconfig auth can-i list pods -n edgelm

echo "=== Overall Health Check ==="
oc get pods --all-namespaces | grep -E "(Error|CrashLoop|Pending)"

Expected outputs:

  • 6 namespaces created
  • Istio control plane resource exists and is "Ready" (or reports healthy status)
  • Multiple istio-system pods running
  • Several edgelm clusterroles found
  • Service account exists
  • Authentication test returns "yes"
  • No pods in error states