Upgrading OpenShift AI Self-Managed
Upgrade OpenShift AI on OpenShift
Preface
As a cluster administrator, you can configure either automatic or manual upgrades for the Red Hat OpenShift AI Operator.
Chapter 1. Overview of upgrading OpenShift AI Self-Managed
As a cluster administrator, you can configure either automatic or manual upgrades for the Red Hat OpenShift AI Operator.
For information about upgrading OpenShift AI as self-managed software on your OpenShift cluster in a disconnected environment, see Upgrading OpenShift AI Self-Managed in a disconnected environment.
Previously, data science pipelines in OpenShift AI were based on KubeFlow Pipelines v1. Data science pipelines are now based on KubeFlow Pipelines v2, which uses a different workflow engine. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.
Data science pipelines 1.0 resources are no longer supported or managed by OpenShift AI 2.16. After upgrading to OpenShift AI 2.16, it is no longer possible to deploy, view, or edit the details of pipelines that are based on data science pipelines 1.0 from either the dashboard or the KFP API server. If you are a current data science pipelines user, do not upgrade to OpenShift AI 2.16 until you are ready to migrate to data science pipelines 2.0.
OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. If you are upgrading to OpenShift AI 2.16, you must manually migrate your existing data science pipelines 1.0 instances and update your workbenches. For more information, see Migrating to data science pipelines 2.0.
Data science pipelines 2.0 contains an installation of Argo Workflows. OpenShift AI does not support direct customer usage of this installation of Argo Workflows. To upgrade to OpenShift AI 2.9 or later with data science pipelines, ensure that no separate installation of Argo Workflows exists on your cluster.
- If you configure automatic upgrades, when a new version of the Red Hat OpenShift AI Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator without human intervention.
- If you configure manual upgrades, when a new version of the Red Hat OpenShift AI Operator is available, OLM creates an update request. A cluster administrator must manually approve the update request to update the Operator to the new version. For more information, see Manually approving a pending Operator upgrade.
By default, the Red Hat OpenShift AI Operator follows a sequential update process. This means that if there are several minor versions between the current version and the version that you plan to upgrade to, Operator Lifecycle Manager (OLM) upgrades the Operator to each of the minor versions before it upgrades it to the final, target version. If you configure automatic upgrades, OLM automatically upgrades the Operator to the latest available version, without human intervention. If you configure manual upgrades, a cluster administrator must manually approve each sequential update between the current version and the final, target version.
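With a manual upgrade strategy, each pending sequential update can also be approved from the command line. The following sketch assumes the Operator is installed in the redhat-ods-operator namespace; the InstallPlan name is a placeholder that you replace with the name reported by the first command:

```shell
# List install plans in the Operator's namespace; pending ones show
# APPROVED=false (assumes the redhat-ods-operator namespace).
oc get installplans -n redhat-ods-operator

# Approve a pending install plan by name. "install-abcde" is a
# placeholder; substitute the name from the previous command.
oc patch installplan install-abcde -n redhat-ods-operator \
  --type merge --patch '{"spec":{"approved":true}}'
```

Repeat the approval for each intermediate version until the Operator reaches the target version.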
For information about OpenShift AI Self-Managed release types and supported versions, see the Red Hat OpenShift AI Self-Managed Life Cycle Knowledgebase article.
- Before you upgrade OpenShift AI, complete the tasks described in Requirements for upgrading OpenShift AI.
- Before you can use an accelerator in OpenShift AI, your instance must have the associated accelerator profile. If your OpenShift instance has an accelerator, its accelerator profile is preserved after an upgrade. For more information about accelerators, see Working with accelerators.
Notebook images are integrated into the image stream during the upgrade and subsequently appear in the OpenShift AI dashboard.
Note: Notebook images are constructed externally; they are prebuilt images that are updated quarterly and do not change with every OpenShift AI upgrade.
Chapter 2. Configuring the upgrade strategy for OpenShift AI
As a cluster administrator, you can configure either an automatic or manual upgrade strategy for the Red Hat OpenShift AI Operator.
By default, the Red Hat OpenShift AI Operator follows a sequential update process. This means that if there are several versions between the current version and the version that you intend to upgrade to, Operator Lifecycle Manager (OLM) upgrades the Operator to each of the intermediate versions before it upgrades it to the final, target version. If you configure automatic upgrades, OLM automatically upgrades the Operator to the latest available version, without human intervention. If you configure manual upgrades, a cluster administrator must manually approve each sequential update between the current version and the final, target version.
For information about supported versions, see the Red Hat OpenShift AI Self-Managed Life Cycle Knowledgebase article.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- The Red Hat OpenShift AI Operator is installed.
Procedure
- Log in to the OpenShift cluster web console as a cluster administrator.
- In the Administrator perspective, in the left menu, select Operators → Installed Operators.
- Click the Red Hat OpenShift AI Operator.
- Click the Subscription tab.
- Under Update approval, click the pencil icon and select one of the following update strategies:
  - Automatic: New updates are installed as soon as they become available.
  - Manual: A cluster administrator must approve any new update before installation begins.
- Click Save.
Additional resources
- For more information about the subscription channels that are available in version 2 of the Red Hat OpenShift AI Operator, see Installing the Red Hat OpenShift AI Operator.
- For more information about upgrading Operators that have been installed by using OLM, see Updating installed Operators in the OpenShift documentation.
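As an alternative to the web console steps above, the same update strategy can be set from the command line by patching the Operator's Subscription object, whose spec.installPlanApproval field takes the values Automatic or Manual. This is a sketch; the Subscription name (rhods-operator) and namespace (redhat-ods-operator) are assumptions that you should verify with the first command:

```shell
# List Subscriptions to find the one for the Red Hat OpenShift AI Operator.
oc get subscriptions -n redhat-ods-operator

# Set the update approval strategy to Manual (use "Automatic" to revert).
# "rhods-operator" is an assumed Subscription name; use the name listed above.
oc patch subscription rhods-operator -n redhat-ods-operator \
  --type merge --patch '{"spec":{"installPlanApproval":"Manual"}}'
```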
Chapter 3. Requirements for upgrading OpenShift AI
This section describes the tasks that you should complete when upgrading OpenShift AI.
Check the components in the DataScienceCluster object
When you upgrade Red Hat OpenShift AI, the upgrade process automatically uses the values from the previous DataScienceCluster object.
After the upgrade, you should inspect the DataScienceCluster object and optionally update the status of any components as described in Updating the installation status of Red Hat OpenShift AI components by using the web console.
New components are not automatically added to the DataScienceCluster object during upgrade. If you want to use a new component, you must manually edit the DataScienceCluster object to add the component entry.
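For example, to start using a component that was introduced in a newer version, you add its entry under spec.components and set its managementState to Managed. The fragment below is a sketch; trainingoperator is used only as an illustration of a newly added entry:

```yaml
# Fragment of the DataScienceCluster object: a component entry added
# manually after an upgrade (trainingoperator is an example name).
spec:
  components:
    trainingoperator:
      managementState: Managed
```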
Upgrading to data science pipelines 2.0
Previously, data science pipelines in OpenShift AI were based on KubeFlow Pipelines v1. Starting with OpenShift AI 2.9, data science pipelines are based on KubeFlow Pipelines v2, which uses a different workflow engine. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.
Starting with OpenShift AI 2.16, data science pipelines 1.0 resources are no longer supported or managed by OpenShift AI. It is no longer possible to deploy, view, or edit the details of pipelines that are based on data science pipelines 1.0 from either the dashboard or the KFP API server.
OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. If you are upgrading to OpenShift AI 2.16, you must manually migrate your existing data science pipelines 1.0 instances. For more information, see Migrating to data science pipelines 2.0.
Data science pipelines 2.0 contains an installation of Argo Workflows. OpenShift AI does not support direct usage of this installation of Argo Workflows.
If you upgrade to OpenShift AI with data science pipelines 2.0 enabled and a separate installation of Argo Workflows exists on your cluster, the OpenShift AI components are not upgraded. To complete the component upgrade, either disable data science pipelines or remove the separate installation of Argo Workflows. The component upgrade then completes automatically.
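One way to check for a separate Argo Workflows installation before upgrading is to look for the Argo Workflows CRDs on the cluster. This is a sketch; the exact labels that identify a data science pipelines-owned installation vary by version, so inspect the output rather than relying on a specific label:

```shell
# Check whether the Argo Workflows CRD exists on the cluster.
# A "NotFound" error means no Argo Workflows installation is present.
oc get crd workflows.argoproj.io

# If the CRD exists, inspect its labels to judge whether it belongs to
# data science pipelines or to a separate Argo Workflows installation.
oc get crd workflows.argoproj.io -o jsonpath='{.metadata.labels}'
```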
Address KServe requirements
For the KServe component, which is used by the single-model serving platform to serve large models, you must meet the following requirements:
- To fully install and use KServe, you must also install Operators for Red Hat OpenShift Serverless and Red Hat OpenShift Service Mesh and perform additional configuration. For more information, see Serving large models.
- If you want to add an authorization provider for the single-model serving platform, you must install the Red Hat Authorino Operator. For more information, see Adding an authorization provider for the single-model serving platform.
- If you have not enabled the KServe component (that is, you set the value of the managementState field to Removed in the DataScienceCluster object), you must also disable the dependent Service Mesh component to avoid errors. See Disabling KServe dependencies.
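The last requirement can be sketched as two YAML fragments, assuming (per Disabling KServe dependencies) that the Service Mesh configuration lives in the DSCInitialization object:

```yaml
# Fragment of the DataScienceCluster object: the KServe component disabled.
spec:
  components:
    kserve:
      managementState: Removed
---
# Fragment of the DSCInitialization object: the dependent Service Mesh
# configuration disabled to avoid errors (location assumed; see
# Disabling KServe dependencies for the authoritative steps).
spec:
  serviceMesh:
    managementState: Removed
```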
Chapter 4. Updating the installation status of Red Hat OpenShift AI components by using the web console
You can use the OpenShift web console to update the installation status of components of Red Hat OpenShift AI on your OpenShift cluster.
If you upgraded OpenShift AI, the upgrade process automatically used the values of the previous version’s DataScienceCluster object. New components are not automatically added to the DataScienceCluster object.
After upgrading OpenShift AI:
- Inspect the default DataScienceCluster object to check and optionally update the managementState status of the existing components.
- Add any new components to the DataScienceCluster object.
Prerequisites
- The Red Hat OpenShift AI Operator is installed on your OpenShift cluster.
- You have cluster administrator privileges for your OpenShift cluster.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab.
- On the DataScienceClusters page, click the default object.
- Click the YAML tab.
  An embedded YAML editor opens showing the default custom resource (CR) for the DataScienceCluster object, similar to the following example:

    apiVersion: datasciencecluster.opendatahub.io/v1
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        codeflare:
          managementState: Removed
        dashboard:
          managementState: Removed
        datasciencepipelines:
          managementState: Removed
        kserve:
          managementState: Removed
        kueue:
          managementState: Removed
        modelmeshserving:
          managementState: Removed
        ray:
          managementState: Removed
        trainingoperator:
          managementState: Removed
        trustyai:
          managementState: Removed
        workbenches:
          managementState: Removed

- In the spec.components section of the CR, for each OpenShift AI component shown, set the value of the managementState field to either Managed or Removed. These values are defined as follows:
  - Managed: The Operator actively manages the component, installs it, and tries to keep it active. The Operator will upgrade the component only if it is safe to do so.
  - Removed: The Operator actively manages the component but does not install it. If the component is already installed, the Operator will try to remove it.

  Important:
  - To learn how to install the KServe component, which is used by the single-model serving platform to serve large models, see Installing the single-model serving platform.
  - If you have not enabled the KServe component (that is, you set the value of the managementState field to Removed), you must also disable the dependent Service Mesh component to avoid errors. See Disabling KServe dependencies.
  - To learn how to install the distributed workloads feature, see Installing the distributed workloads components.
- Click Save.
For any components that you updated, OpenShift AI initiates a rollout of the affected pods so that they use the updated images.
Verification
Confirm that there is a running pod for each component:
- In the OpenShift web console, click Workloads → Pods.
- In the Project list at the top of the page, select redhat-ods-applications.
- In the applications namespace, confirm that there are running pods for each of the OpenShift AI components that you installed.
Confirm the status of all installed components:
- In the OpenShift web console, click Operators → Installed Operators.
- Click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab and select the DataScienceCluster object called default-dsc.
- Select the YAML tab.
- In the installedComponents section, confirm that the components you installed have a status value of true.
  Note: If a component shows with the component-name: {} format in the spec.components section of the CR, the component is not installed.
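The same verification can be done from the command line. This sketch assumes the DataScienceCluster object is named default-dsc, as in the example CR:

```shell
# Print the component installation status map from the
# DataScienceCluster object's status (name assumed to be default-dsc).
oc get datasciencecluster default-dsc \
  -o jsonpath='{.status.installedComponents}'
```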
Chapter 5. Adding a CA bundle after upgrading
Red Hat OpenShift AI 2.16 provides support for using self-signed certificates. If you have upgraded from OpenShift AI 2.7 or earlier versions, you can add self-signed certificates to the OpenShift AI deployments and Data Science Projects in your cluster.
There are two ways to add a Certificate Authority (CA) bundle to OpenShift AI. You can use one or both of these methods:
- For OpenShift clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt) and use the CA bundle in Red Hat OpenShift AI.
- You can use self-signed certificates in a custom CA bundle (odh-ca-bundle.crt) that is separate from the cluster-wide bundle.
For more information, see Working with certificates.
Prerequisites
- You have admin access to the DSCInitialization resources in the OpenShift cluster.
- You installed the OpenShift command-line interface (oc) as described in Installing the OpenShift CLI.
- You upgraded Red Hat OpenShift AI from version 2.7 or earlier. If you are working in a new installation of Red Hat OpenShift AI, see Adding a CA bundle.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- Click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the DSC Initialization tab.
- Click the default-dsci object.
- Click the YAML tab.
- Add the following to the spec section, setting the managementState field to Managed:

    spec:
      trustedCABundle:
        managementState: Managed
        customCABundle: ""

- If you want to use self-signed certificates added to a cluster-wide CA bundle, log in to OpenShift as a cluster administrator and follow the steps described in Configuring the cluster-wide proxy during installation.
- If you want to use self-signed certificates in a custom CA bundle that is separate from the cluster-wide bundle, follow these steps:
  - Add the custom certificate to the customCABundle field of the default-dsci object, as shown in the following example:

      spec:
        trustedCABundle:
          managementState: Managed
          customCABundle: |
            -----BEGIN CERTIFICATE-----
            examplebundle123
            -----END CERTIFICATE-----

  - Click Save.
    The Red Hat OpenShift AI Operator creates an odh-trusted-ca-bundle ConfigMap containing the certificates in all new and existing non-reserved namespaces.
Verification
- If you are using a cluster-wide CA bundle, run the following command to verify that all non-reserved namespaces contain the odh-trusted-ca-bundle ConfigMap:

    $ oc get configmaps --all-namespaces -l app.kubernetes.io/part-of=opendatahub-operator | grep odh-trusted-ca-bundle

- If you are using a custom CA bundle, run the following command to verify that a non-reserved namespace contains the odh-trusted-ca-bundle ConfigMap and that the ConfigMap contains your customCABundle value. In the following command, example-namespace is the non-reserved namespace and examplebundle123 is the customCABundle value.

    $ oc get configmap odh-trusted-ca-bundle -n example-namespace -o yaml | grep examplebundle123