Pipeline run fails in a disconnected Red Hat OpenShift AI (RHOAI) environment using the Elyra JupyterLab extension

Solution Unverified - Updated

Environment

  • Red Hat OpenShift AI (RHOAI)
    • Version: 2.8 and earlier

Issue

When running a pipeline using the Elyra JupyterLab extension in a disconnected RHOAI environment, the pipeline may fail in the following cases:

  1. If you execute pip install in a notebook cell, the step could fail due to missing SSL certificates for the pip host.
  2. If you access external storage within a notebook, the step could fail due to missing SSL certificates for your storage host.

Note: Ensure the necessary SSL certificates are available for all services. To help you identify potential issues with your SSL certificates, check the pod logs in your cluster.
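
As a rough aid for the log check described in the note above, the following Python sketch scans saved pod log text for messages that commonly indicate certificate validation failures. The function name `find_ssl_errors` and the marker list are illustrative assumptions. The markers are typical of Python and OpenSSL error output and are not an exhaustive or official list.

```python
# Hypothetical helper: flag pod log lines that look like certificate failures.
# The marker list is an assumption based on common Python/OpenSSL messages.
SSL_ERROR_MARKERS = (
    "certificate_verify_failed",
    "certificate verify failed",
    "self signed certificate",
    "self-signed certificate",
    "unable to get local issuer certificate",
    "sslerror",
)


def find_ssl_errors(log_text):
    """Return (line_number, line) pairs for lines that mention an SSL marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()  # match case-insensitively
        if any(marker in lowered for marker in SSL_ERROR_MARKERS):
            hits.append((number, line.strip()))
    return hits
```

For example, you could save the output of `oc logs <pod-name>` to a file and pass the file contents to `find_ssl_errors` to see which lines of the failed step mention a certificate problem.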


Resolution

When you use a data science pipeline in a data science project to create and execute pipelines from a Jupyter notebook, as described in Running a pipeline in JupyterLab, you might encounter SSL certificate validation issues.

Resolving Python Package Installation Failures with pip

Before you execute your pipeline, perform the following steps:

  1. Open JupyterLab, and confirm that the JupyterLab launcher is automatically displayed.
  2. In the Elyra section of the JupyterLab launcher, click the Pipeline Editor tile.
  3. Open your pipeline in the pipeline editor.
  4. Right-click the relevant pipeline node and select Open Properties.
  5. On the Pipeline Properties tab in the right pane, go to the Environment variables section.
  6. Set the PIP_INDEX_URL and PIP_TRUSTED_HOST environment variables to specify the location of the Python dependencies, as shown in the following example:
     PIP_INDEX_URL: https://pypi-server.apps.example.com/simple
     PIP_TRUSTED_HOST: pypi-server.apps.example.com

This applies the environment variables to all nodes in the pipeline.

  7. Save your changes.
  8. Execute the pipeline.
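
The two values in step 6 must refer to the same server. As a minimal sketch, assuming you start from the index URL of your internal mirror, the following Python helper derives PIP_TRUSTED_HOST from PIP_INDEX_URL so that the two cannot drift apart. The function name `pip_env_for_index` is hypothetical, and the URL is the example value used above.

```python
from urllib.parse import urlsplit


def pip_env_for_index(index_url):
    """Build both pip variables from a single index URL.

    PIP_TRUSTED_HOST is the host part of the URL, with the port appended
    when the URL names one (pip accepts a host or a host:port pair).
    """
    parts = urlsplit(index_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable index URL: {index_url!r}")
    host = parts.hostname
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return {"PIP_INDEX_URL": index_url, "PIP_TRUSTED_HOST": host}


if __name__ == "__main__":
    # Print the values in the NAME: value form used in this article.
    env = pip_env_for_index("https://pypi-server.apps.example.com/simple")
    for name, value in env.items():
        print(f"{name}: {value}")
```

You can then copy the printed values into the Environment variables section of the pipeline properties.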

Resolving Connection Failures to Storage

  1. Build a custom runtime image with your CA certificates included.
# Multi-stage build to update CA certificates
FROM <base-image> AS builder

USER root
COPY ca-certificates.crt /etc/pki/ca-trust/source/anchors/
RUN update-ca-trust extract

# Final stage: copy updated CA certificates from the builder stage
FROM <base-image>

USER root
COPY --from=builder /etc/pki/ca-trust/extracted /etc/pki/ca-trust/extracted

# Set non-root user for container runtime
USER 1001
  2. Open JupyterLab, and confirm that the JupyterLab launcher appears.
  3. In the Elyra section of the JupyterLab launcher, click Pipeline Editor.
  4. Open your pipeline in the pipeline editor.
  5. In the JupyterLab left sidebar, select Runtime Images.
  6. Add your custom runtime image to the workbench runtime images.
  7. Select the custom runtime image as the runtime image for the pipeline node.
  8. In Pipeline properties, set the SSL_CERT_FILE environment variable to the certificate path:
     SSL_CERT_FILE: /etc/pki/ca-trust/extracted/ca-certificate.crt
  9. Save your changes.
  10. Execute your pipeline.
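
To confirm that the variable set in the pipeline properties is actually picked up inside the runtime image, you can run a short check in a notebook cell. The following is a minimal sketch that relies on the standard library function `ssl.get_default_verify_paths()`, which honors SSL_CERT_FILE and reports the CA file only when it exists. The helper names `effective_ca_file` and `describe_ca_setup` are hypothetical.

```python
import os
import ssl


def effective_ca_file():
    """Return the CA file that Python's ssl module loads by default, or None.

    The path is taken from SSL_CERT_FILE when that variable is set, and it is
    reported only if the file exists in the running container.
    """
    return ssl.get_default_verify_paths().cafile


def describe_ca_setup():
    """Summarize whether SSL_CERT_FILE points at a file that is present."""
    configured = os.environ.get("SSL_CERT_FILE")
    effective = effective_ca_file()
    if configured and effective != configured:
        return f"SSL_CERT_FILE={configured} is set but the file was not found"
    if effective is None:
        return "no default CA file found"
    return f"default CA file: {effective}"


if __name__ == "__main__":
    print(describe_ca_setup())
```

If the message reports that the file was not found, the path in the pipeline properties does not match a file in the custom runtime image.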

Resolving SSL Connection Failures with the Python requests Module

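Note: The Python requests module uses OpenSSL. When installed from PyPI, it typically validates certificates against the CA bundle from the certifi package rather than the system trust store, so the certificate path must be set explicitly.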

  1. Set the certificate path using the REQUESTS_CA_BUNDLE environment variable.
  2. Ensure the CA certificate is included in the custom runtime images.
  3. Update pipeline properties to specify the certificate path:
     REQUESTS_CA_BUNDLE: /etc/pki/trusted-ca/ca.crt
  4. Save your changes and execute the pipeline.
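
Before calling the storage or API endpoint from a notebook, you can check that the bundle referenced by REQUESTS_CA_BUNDLE is usable. The following sketch uses only the standard library and assumes that the variable points at a single PEM file, as in the example above. The function name `check_requests_ca_bundle` is hypothetical.

```python
import os


def check_requests_ca_bundle(environ=None):
    """Return (path, problem); problem is None when the bundle looks usable."""
    environ = os.environ if environ is None else environ
    path = environ.get("REQUESTS_CA_BUNDLE")
    if not path:
        return None, "REQUESTS_CA_BUNDLE is not set"
    if not os.path.isfile(path):
        return path, "path is not a file in this image"
    # A PEM bundle contains at least one BEGIN CERTIFICATE marker.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        if "-----BEGIN CERTIFICATE-----" not in handle.read():
            return path, "file contains no PEM certificate"
    return path, None


if __name__ == "__main__":
    bundle, problem = check_requests_ca_bundle()
    print(problem or f"requests will verify against {bundle}")
```

When the check passes, a plain `requests.get(url)` call reads the bundle from the environment, so no `verify=` argument is needed in the notebook code.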

Reason

In disconnected clusters, self-signed CA certificates are required and must be mounted onto the pipeline runtime pods. To include these certificates, inject them into the runtime images, then configure environment variables or module parameters to enable SSL requests using these certificates.

Root Cause

When executing a pipeline from your notebook, the pipeline launches as a separate task, creating pods. In a disconnected cluster, these pods will fail because the runtime image attempts an external network connection. Since the service uses a self-hosted CA rather than the standard trusted CA bundle, SSL certificate validation fails, causing the issue.


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.