Install Red Hat AI Enterprise using Azure Marketplace
Run Red Hat AI Enterprise (RHAIE) through the Microsoft Azure Marketplace to create self-managed RHAIE cluster deployments. Your deployments are billed on a pay-per-use basis with your Azure subscription but are still supported by Red Hat directly.
RHAIE provides a custom environment for developing and deploying AI-driven applications and includes these products:
- Red Hat OpenShift AI (RHOAI)
- Self-managed Red Hat OpenShift Container Platform (OCP)
- AI Accelerator Entitlements
How the installation process works
Installing RHAIE on Azure has two parts:
- Set up an OpenShift Container Platform cluster using Azure Marketplace
- Install the Red Hat OpenShift AI Operator
Deployments of Red Hat OpenShift Container Platform (OCP) on Azure Marketplace are similar to self-managed installations, and you should already have experience with the Azure environment. However, installing RHAIE requires a customized installation to enable billing integration.
Supported and unsupported installations
Both installer-provisioned (IPI) and user-provisioned infrastructure (UPI) scenarios are supported.
Red Hat AI Enterprise on Azure Marketplace does not support the following scenarios:
- Single-node deployments. These are not supported for Azure Marketplace billing and are not supported as an RHOAI production topology.
- Three-node deployments. Compact clusters are not supported for Azure Marketplace billing, and RHOAI requires dedicated worker capacity.
- Disconnected or air-gapped clusters. For a disconnected RHOAI install, see the official disconnected installation guide.
Set up an OpenShift Container Platform cluster using Azure Marketplace
The OpenShift Container Platform cluster provides the infrastructure foundation for your Red Hat AI Enterprise deployment.
Prerequisites
- For the cluster to use Azure Marketplace images, you must have the following utilities available:
  - Azure CLI client (az).
  - Python 3.13 or earlier, required to run the Azure CLI. If you have a later version of Python, run the Azure CLI client in a virtual environment with Python 3.13.
  - OpenShift installation program (openshift-install), same version as the Azure Marketplace image that you are installing.
  - OpenShift CLI (oc).
- You should have experience in the Azure environment.
- You must be logged into the Azure CLI.
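The prerequisites above can be checked from a terminal before you start. The following is a minimal sketch, not an official Red Hat script: missing_tools is a hypothetical helper name, and it only reports whether each command is on your PATH.

```shell
# Sketch: report which required command-line tools are missing from PATH.
# missing_tools is a hypothetical helper name, not part of any Red Hat tooling.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Print the missing tools (if any) and fail when the list is not empty.
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all tools found"
}

# Typical use before starting the installation:
#   missing_tools az openshift-install oc python3 && az account show --output table
```

The az account show command fails if you are not logged in to the Azure CLI, so chaining it after the tool check covers the login prerequisite as well.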
Select the OpenShift Marketplace image in Azure
Use the Azure CLI to find and select the OpenShift Marketplace image for your deployment. Selecting the correct base image is critical for performance and compatibility with your hardware.
Prerequisites
- You have identified the correct base image for the OCP minor release that you are targeting (see Red Hat OpenShift AI: Supported Configurations for 3.x).
Procedure
- In the Azure CLI, display all available OpenShift images.
$ az vm image list --all --offer rh-rhaie --publisher redhat --output table
- Select the image version for the OCP minor release that you are targeting and use it consistently throughout the installation.
Example Red Hat Enterprise Linux CoreOS (RHCOS) image
$ az vm image show --urn redhat-rhel:rh-rhaie:rh-rhaie-3-gen2:latest
Note:
The SKUs used in this example are for Generation 2 VM images. The default instance types used in OpenShift are Gen2-compatible. To optimize performance and compatibility, use Gen2 images with GPU-capable instance types. Do not use Gen1 images.
- Review and accept the usage terms of the image using the Azure CLI.
- Review the terms for the image offering:
$ az vm image terms show --urn redhat-rhel:rh-rhaie:rh-rhaie-3-gen2:latest
- Accept the terms for this image offering:
$ az vm image terms accept --urn redhat-rhel:rh-rhaie:rh-rhaie-3-gen2:latest
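If you want to pin an explicit version instead of latest, the selection can be scripted. The following is a rough sketch that assumes the newest published version of the rh-rhaie-3-gen2 SKU is the one you want; confirm that against the supported configurations before relying on it. The helper names latest_version and image_urn are hypothetical.

```shell
# Sketch: choose the newest image version and build the URN used by
# "az vm image" commands. latest_version and image_urn are hypothetical helpers.

# Read one version per line on stdin and print the highest one.
# Relies on "sort -V" (version sort), which is available in GNU coreutils.
latest_version() {
  sort -V | tail -n 1
}

# Join publisher, offer, SKU, and version into publisher:offer:sku:version.
image_urn() {
  printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Typical use (requires a logged-in Azure CLI):
#   version=$(az vm image list --all --offer rh-rhaie --publisher redhat \
#               --query "[?sku=='rh-rhaie-3-gen2'].version" --output tsv | latest_version)
#   urn=$(image_urn redhat-rhel rh-rhaie rh-rhaie-3-gen2 "$version")
#   az vm image terms show --urn "$urn"
```

Using the same resolved version in every later step keeps the image consistent throughout the installation.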
Specify Marketplace Images in the installation configuration
Deploy the cluster in stages to simplify capacity planning and troubleshooting. If a network or cloud setup error occurs, you can fix it in isolation without having to separate it from GPU-specific issues:
- Build the base cluster first, with only standard, non-GPU worker nodes.
- After the core environment is running without problems, add GPU-capable compute pools.
Create a configuration file for the base cluster with non-GPU worker nodes
Specify the Azure Marketplace image details directly in the install-config.yaml file to set up pay-per-use billing automatically. Doing this removes the need to change machine settings manually when the cluster starts.
Procedure
- To specify Marketplace images in the install-config.yaml file, edit the platform.azure.osImage settings under compute and controlPlane to look like this sample, and save your changes.
Example install-config.yaml.template file
This example assumes you do not have unconditional User Access Administrator rights, so platform.azure.defaultMachinePlatform.identity.type: None is part of the install-config.
---
apiVersion: v1
baseDomain: <your_base_domain>
compute:
- hyperthreading: Enabled
  name: worker
  platform:
    azure:
      type: Standard_D8s_v3
      osImage:
        publisher: redhat-rhel
        offer: rh-rhaie
        sku: rh-rhaie-3-gen2
        version: 9.6.2026030314
  replicas: 4
controlPlane:
  hyperthreading: Enabled
  name: master
  platform:
    azure:
      type: Standard_D8s_v3
      osImage:
        publisher: redhat-rhel
        offer: rh-rhaie
        sku: rh-rhaie-3-gen2
        version: 9.6.2026030314
  replicas: 3
metadata:
  name: <your_metadata_name>
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  azure:
    baseDomainResourceGroupName: <your_base_Domain_Resource_Group_Name>
    region: <your_azure_region>
    resourceGroupName: <your_azure_resource_group_name>
    cloudName: AzurePublicCloud
    defaultMachinePlatform:
      identity:
        type: None
publish: External
pullSecret: <'YOUR_PULL_SECRET_HERE'>
sshKey: |
  <your_ssh_key>
where:
<your_base_domain>
Main domain name you own for routing traffic to the cluster, such as example.com.
<your_metadata_name>
Unique name you choose for your cluster, for example, rhaie-prod.
<your_base_Domain_Resource_Group_Name>
Name of the Azure resource group that holds the DNS settings for your domain, for example, dns-zones-rg.
<your_azure_region>
Short code for the Azure region where your cluster is deployed, for example, eastus.
<your_azure_resource_group_name>
Name of a new, empty Azure resource group where your cluster resources are created, for example, rhaie-cluster-rg.
<'YOUR_PULL_SECRET_HERE'>
Pull secret from Red Hat that authorizes your cluster to download software. Use single quotes around the text.
<your_ssh_key>
Public SSH key that lets you log in to the cluster nodes to fix problems, for example, ssh-rsa AAAAB3….
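The simple placeholders in the template can be filled in with a small script. The following is a sketch under stated assumptions: render_install_config and the environment variable names are hypothetical, and the pull secret and SSH key are left for you to edit by hand because they contain characters that are awkward to substitute safely. The installation program reads a file named install-config.yaml from the installation directory and consumes it during the run, so the sketch also keeps a backup copy.

```shell
# Sketch: fill the simple placeholders of install-config.yaml.template and
# stage the result as install-config.yaml in the installation directory.
# render_install_config and the variable names are hypothetical.
# Usage: render_install_config <template> <installation_dir>
# Expects BASE_DOMAIN, CLUSTER_NAME, DNS_RG, AZURE_REGION, and CLUSTER_RG to be set.
render_install_config() {
  local template="$1" dir="$2"
  mkdir -p "$dir" || return 1
  sed -e "s|<your_base_domain>|${BASE_DOMAIN}|" \
      -e "s|<your_metadata_name>|${CLUSTER_NAME}|" \
      -e "s|<your_base_Domain_Resource_Group_Name>|${DNS_RG}|" \
      -e "s|<your_azure_region>|${AZURE_REGION}|" \
      -e "s|<your_azure_resource_group_name>|${CLUSTER_RG}|" \
      "$template" > "$dir/install-config.yaml" || return 1
  # Keep a copy: the installer consumes install-config.yaml when it runs.
  cp "$dir/install-config.yaml" "$dir/install-config.yaml.bak"
}

# Typical use:
#   export BASE_DOMAIN=example.com CLUSTER_NAME=rhaie-prod DNS_RG=dns-zones-rg
#   export AZURE_REGION=eastus CLUSTER_RG=rhaie-cluster-rg
#   render_install_config install-config.yaml.template ./install-dir
```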
Create an OpenShift cluster by using this configuration file
To begin the automated deployment that creates the cluster, save the completed install-config.yaml.template file as install-config.yaml in your installation directory. OpenShift automatically manages your machines by using MachineSets, which are resources that define how OpenShift provisions and manages virtual machines in Azure with specific hardware.
Always use a new resource group name for each installation attempt to avoid tag pollution from failed installations.
Procedure
- Create your cluster with the following command:
$ openshift-install create cluster --dir <installation_dir>
The installation might take 45 minutes or longer.
Verify that your cluster is running and stable
Before adding GPU-capable compute pools, verify that your cluster is running with the non-GPU compute nodes that use the Azure Marketplace images. These nodes are billed through Azure Marketplace.
Procedure
- Log in to the cluster:
- On the OpenShift console, navigate to your login ID, for example, user:admin, and click it.
- On the dropdown menu, click Copy login command.
- Click Display token.
- Log in with the token that is displayed.
Example login
$ oc login --token=sha256~AbCdEf123456XyZ789012VwXyZ345678lqJofhxgvV4 --server=https://api.cluster-xyz.example.com:6443
- Verify that the cluster is stable and healthy:
$ oc get clusterversion
$ oc get nodes
$ oc get co
If the cluster is stable, all ClusterOperators report AVAILABLE=True, PROGRESSING=False, and DEGRADED=False.
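The health condition above can be checked mechanically instead of by eye. The following is a minimal sketch: unhealthy_operators is a hypothetical helper that reads the output of oc get co --no-headers, and it assumes the VERSION column is populated so that AVAILABLE, PROGRESSING, and DEGRADED are columns 3 to 5.

```shell
# Sketch: list ClusterOperators that are not yet healthy.
# unhealthy_operators is a hypothetical helper. It reads the output of
# "oc get co --no-headers" on stdin, where columns 3-5 are
# AVAILABLE, PROGRESSING, and DEGRADED.
unhealthy_operators() {
  awk '$3 != "True" || $4 != "False" || $5 != "False" { print $1 }'
}

# Typical use (requires a logged-in oc session):
#   oc get co --no-headers | unhealthy_operators
```

An empty result means that every ClusterOperator matches the stable state described above.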
Troubleshoot cluster problems
If your installation fails or the cluster does not report a healthy status, see the following resources:
- Troubleshooting installation issues
- Azure Resource Provider Health
Create a GPU-capable compute pool for your cluster
To support RHOAI accelerated workloads, you must create a GPU-capable compute pool. To add this pool, create a new MachineSet configuration file that targets a GPU-enabled SKU, such as Standard_NC24ads_A100_v4.
Prerequisites
- You have installed a Red Hat OpenShift base cluster.
- You have verified that your Azure subscription has enough vCPU quota for the NCSv4-series family. Because Standard_NC24ads_A100_v4 is a newer VM type, a Generation 2 Marketplace SKU is required.
Procedure
- Identify an existing MachineSet to use as a template for your new MachineSet.
$ oc get machinesets -n openshift-machine-api
- Export an existing MachineSet to a YAML file to use as a template. This ensures your networking and cluster metadata are correct.
$ oc get machineset <existing_machineset_name> -n openshift-machine-api -o yaml > gpu-machineset.yaml
where <existing_machineset_name> specifies the MachineSet to use as a base template.
- Modify the YAML by updating the name, the vmSize to a GPU SKU, for example, Standard_NC24ads_A100_v4, and the Marketplace image details to ensure billing integration.
Example MachineSet YAML for Azure GPU Nodes
This template uses the Standard_NC24ads_A100_v4 SKU and includes the required Azure Marketplace image information for Red Hat AI Enterprise billing.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <cluster_id>
  name: <cluster_id>-gpu-worker-<azure_region>
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <cluster_id>
      machine.openshift.io/cluster-api-machineset: <cluster_id>-gpu-worker-<azure_region>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <cluster_id>
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: <cluster_id>-gpu-worker-<azure_region>
    spec:
      providerSpec:
        value:
          apiVersion: azureproviderconfig.openshift.io/v1beta1
          kind: AzureMachineProviderSpec
          location: <azure_region>
          vmSize: Standard_NC24ads_A100_v4 # GPU-capable SKU
          image:
            publisher: redhat-rhel # Marketplace publisher
            offer: rh-rhaie # Marketplace offer
            sku: rh-rhaie-3-gen2 # Marketplace SKU
            version: 9.6.2026030314 # Marketplace version
          # Ensure other fields like networkResourceGroup and vnet match your cluster
where:
<cluster_id>
Specifies the infrastructure ID of your existing cluster.
<azure_region>
Specifies your cluster’s geographical region, for example, eastus (East US / Virginia).
- Create the new MachineSet by using the OpenShift CLI:
$ oc apply -f gpu-machineset.yaml
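The placeholders in the example MachineSet can be filled in with a short script before you apply the file. The following is a sketch only: gpu_machineset_name and fill_machineset_placeholders are hypothetical helpers, and the sketch assumes that your YAML file still contains the literal <cluster_id> and <azure_region> placeholders shown in the example.

```shell
# Sketch: build the MachineSet name and fill the placeholders used in the
# example YAML. Both helper names are hypothetical.

# Print the MachineSet name for a given cluster ID and Azure region.
gpu_machineset_name() {
  printf '%s-gpu-worker-%s\n' "$1" "$2"
}

# Read YAML on stdin and replace every <cluster_id> and <azure_region>.
# Usage: fill_machineset_placeholders <cluster_id> <azure_region> < in.yaml > out.yaml
fill_machineset_placeholders() {
  sed -e "s|<cluster_id>|$1|g" -e "s|<azure_region>|$2|g"
}

# Typical use (requires a logged-in oc session):
#   cluster_id=$(oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
#   fill_machineset_placeholders "$cluster_id" eastus < gpu-machineset.yaml > gpu-machineset-filled.yaml
```

Writing to a second file keeps the exported template intact in case you need to repeat the substitution with different values.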
Verify that the cluster is running with the GPU compute nodes
Before installing RHOAI, verify that your cluster is running with the GPU compute nodes that use the Azure Marketplace images.
- Verify that the new machines are provisioning correctly:
$ oc get machines -n openshift-machine-api
- Confirm the new node has successfully joined the cluster:
$ oc get nodes
Your GPU-capable machine should be listed as Ready. The next step is to install RHOAI and the Operators that support it.
Install RHOAI and its dependencies
When your OpenShift cluster is running and stable, you are ready to prepare the cluster to run Red Hat OpenShift AI (RHOAI).
Prerequisites
- You have a GPU compute node installed on your cluster.
- You have the resources for RHOAI components. See the RHOAI release notes for the version that you are installing.
Install required Operators by using the OpenShift console
Before configuring model servers or data science workbenches, you must prepare your OpenShift Container Platform cluster by installing foundational Operators from OperatorHub. These Operators enable the essential service mesh, serverless, and hardware-detection frameworks, and they are included at no additional cost beyond your standard Azure compute fees.
⚠️ IMPORTANT:
To resolve all dependencies, you must install the Operators in a specific sequence, and non-GPU Operators must be fully deployed before you install GPU Operators.
Procedure
- In the OpenShift web console, navigate to Ecosystem → Software Catalog to add the non-GPU Operators. For each Operator, click Install, use the default installation settings, and click Install again.
  ⚠️ IMPORTANT: To avoid configuration errors, install these exact Operators in the following order:
  - Red Hat OpenShift Service Mesh 3
  - Red Hat OpenShift Serverless
  - cert-manager Operator for Red Hat OpenShift
  - Red Hat Connectivity Link
  - Red Hat build of Leader Worker Set
  - Red Hat build of Kueue
  - Job Set Operator
- After the non-GPU Operators are ready, install the following GPU Operators by navigating to Ecosystem → Software Catalog and clicking Install. Use the default settings, and click Install again:
  - Node Feature Discovery Operator (NFD)
  - NVIDIA GPU Operator
Verify that the Operators are successfully installed
Before you install RHOAI, ensure that your Operators are installed correctly. You can check in the OpenShift Console or the OpenShift CLI.
GUI procedure
- In the OpenShift console, go to OperatorHub, and click the Project menu.
- Toggle on the Show default projects switch, and select All Projects.
- Click Ecosystem → Installed Operators.
- Check the Operator Status in the table, or search for each Operator by name.
If the installation was successful, the Operators are displayed in the list of Operators, and their status is Succeeded.
CLI procedure
- Verify that each Operator has been installed successfully with the following command:
$ oc get csv -A | grep -E 'servicemesh|serverless|cert-manager|connectivity|leader-worker-set|kueue|jobset|nfd|gpu-operator'
Check that all Operators have status Succeeded.
⚠️ IMPORTANT: Do not start installing RHOAI until all Operators show Succeeded. If any Operator remains in a Pending state, check the events in that Operator's namespace to verify that cluster quotas have not been exceeded.
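To spot Operators that are still installing, you can filter the command output instead of reading every row. The following is a minimal sketch: pending_csvs is a hypothetical helper that reads oc get csv -A --no-headers output and treats the last column as the PHASE. Because the -A option can list the same ClusterServiceVersion in many namespaces, the helper prints each name only once.

```shell
# Sketch: print the unique names of ClusterServiceVersions that have not
# reached the Succeeded phase. pending_csvs is a hypothetical helper.
# Input: "oc get csv -A --no-headers" output on stdin, where column 2 is
# the CSV name and the last column is the PHASE.
pending_csvs() {
  awk '$NF != "Succeeded" { print $2 }' | sort -u
}

# Typical use (requires a logged-in oc session):
#   oc get csv -A --no-headers \
#     | grep -E 'servicemesh|serverless|cert-manager|connectivity|leader-worker-set|kueue|jobset|nfd|gpu-operator' \
#     | pending_csvs
```

An empty result means that every matching Operator reports Succeeded.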
Install RHOAI and its components
You have installed the foundational Operators for RHOAI. When you install RHOAI, it automatically installs the additional components that it needs to run.
Procedure
- In the OpenShift web console, navigate to Ecosystem → Software Catalog.
- Search for OpenShift AI.
- If multiple tiles are displayed, find this exact tile, Red Hat OpenShift AI Provided by Red Hat, Inc., and click it.
- In the Channel field, select stable-3.x.
- For Version, select 3.4.0 or the latest version.
- Keep the default values for Installation mode and Installed Namespace (redhat-ods-operator).
- Click Install.
- If you have not created the Data Science Cluster already, click Create DataScienceCluster when the button is active. Click Create again.
  The DataScienceCluster Initialization (DSCI) YAML file is created automatically. The DataScienceCluster YAML file is displayed.
- Edit the DSC YAML file as needed. For example, if you want to add Llama Stack to your Data Science Cluster, change the managementState of llamastackoperator from Removed to Managed and click Save.
Example section of the DSC YAML file
spec:
  trainer:
    managementState: Managed
  llamastackoperator:
    managementState: Managed
  trainingoperator:
    managementState: Removed
Completing the installation might take a minute or longer depending on your environment.
Verification
When RHOAI and its components are completely installed, RHOAI has the status Succeeded on the OperatorHub and the DataScienceCluster has the status Ready.
- To verify the RHOAI status, click Ecosystem → Installed Operators.
  The RHOAI status should be Succeeded.
- Click the link for Red Hat OpenShift AI.
  The Provided APIs for RHOAI are displayed as tiles.
- To verify that the Data Science Cluster is running, click the DataScienceCluster tab.
  The DataScienceCluster should show Phase: Ready in the Status column.
- To see the details for the Data Science Cluster, click the default-dsc link.
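The same readiness check can be done from the CLI by polling until the phase reaches Ready. The following is a rough sketch: wait_for_output is a hypothetical generic helper, and the usage line assumes that the phase shown in the console is exposed as .status.phase on the default-dsc resource.

```shell
# Sketch: run a command repeatedly until its output equals an expected value.
# wait_for_output is a hypothetical helper.
# Usage: wait_for_output <expected> <attempts> <delay_seconds> <command...>
wait_for_output() {
  local expected="$1" attempts="$2" delay="$3" i=0 actual=""
  shift 3
  while [ "$i" -lt "$attempts" ]; do
    # Capture the command output; errors are ignored and retried.
    actual=$("$@" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
      echo "ready after $((i + 1)) attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out; last value: ${actual:-none}"
  return 1
}

# Typical use (requires a logged-in oc session; the field path is an assumption):
#   wait_for_output Ready 60 10 \
#     oc get datasciencecluster default-dsc -o jsonpath='{.status.phase}'
```

With 60 attempts and a 10-second delay, the example waits up to about 10 minutes before giving up.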
Launch RHOAI
You are ready to launch RHOAI. Begin building, training, testing, and deploying both predictive and generative AI models across hybrid cloud environments.
Procedure
- From the OpenShift console, click the Applications grid icon.
- Under OpenShift Self Managed Services, click Red Hat OpenShift AI, and log in.