Product Documentation for Red Hat AI 3
What's New
- Red Hat OpenShift AI 3 Release Notes
  Highlights of what is new and what has changed with the latest OpenShift AI 3 release
- Red Hat AI Inference Server 3 Release Notes
  Highlights of what is new and what has changed with the latest Red Hat AI Inference Server 3 release
- Red Hat Enterprise Linux AI 3 Release Notes
  Highlights of what is new and what has changed with the latest Red Hat Enterprise Linux AI 3 release
Administer
- Operate a governed, multi-tenant AI platform at scale
  Use CRDs or the dashboard to publish images and provision workbenches with the resources they need
- Administer OpenShift AI platform access, apps, and operations
  Administer access, apps, resources, and accelerators; maintain logging, audit, and backups
- Manage and serve ML features with Feature Store
  Use Feature Store to define, store, and serve reusable machine learning features to models
- Understand, control, and audit usage telemetry in OpenShift AI
  Help administrators decide what usage data is collected, see what's included, and enable or disable telemetry
- Provision hardware configurations and resources for projects
  Enable supported hardware configurations for your data science workloads
- Configure single- and multi-model serving for your cluster
  Enable single-model, multi-model, or NVIDIA NIM serving platforms with serving runtimes and deployment modes
- Build AI/agentic applications with Llama Stack
  Operate Llama Stack: activate the operator and expose OpenAI-compatible RAG APIs
- Configure user access, storage, and telemetry in OpenShift AI
  As an administrator, configure user access, customize the dashboard, and manage specialized resources for data science and AI engineering projects
- Enable the model registry to track, version, and deploy models
  Enable the model registry so teams can register models and versions, capture metadata and provenance, and promote approved versions to serving with consistent governance
- Provision and secure access to model registries
  Use the OpenShift AI dashboard to create registries, set access with RBAC groups, and manage model and version lifecycle so teams can register, share, and promote models to serving with traceability
- Choose production-ready OpenShift AI APIs
  Plan which APIs to build on and how to upgrade with minimal risk by mapping each OpenShift AI endpoint to a support tier that defines stability and deprecation timelines
Plan
- Prepare your platform and hardware for Red Hat AI
  Review compatibility matrices, accelerator support, deployment targets, and update policy prior to installation
- Choose a validated model for reliable serving
  Explore the curated set of third-party models validated for Red Hat AI products, ready for fast, reliable deployment
Discover
- Discover Red Hat OpenShift AI 3
  OpenShift AI 3 is a hybrid platform to build, serve, and monitor models at scale
- Discover Red Hat OpenShift AI 2
  OpenShift AI 2 is a hybrid platform to build, serve, and monitor models at scale
- Discover Red Hat AI Inference Server 3
  Serve LLMs with low latency on your preferred hardware, using vLLM optimizations
- Discover Red Hat Enterprise Linux AI 3
  Serve and optimize your AI models on a Linux appliance, with low-latency vLLM performance
- Discover Red Hat AI Enterprise
  Red Hat AI Enterprise is an integrated enterprise environment for deploying, managing, and scaling AI model inference, training and tuning, and agentic AI workloads
Develop
- Register, version, and promote models with the model registry
  Store, version, and promote models with metadata for cross-project sharing and traceability (see the registry client sketch after this list)
- Discover, evaluate, register, and deploy models from the model catalog
  Use the model catalog to discover, evaluate, register, and deploy models for rapid customization and testing
- Deploy the RAG stack for projects
  Enable Llama Stack, GPUs, and vLLM, ingest data into a vector store, and expose secure endpoints
- Experiment with RAG in the AI playground
  Use the AI playground to experiment with RAG using models from your catalog
- Accelerate data processing and training with distributed workloads
  Distribute data and ML jobs for faster results, larger datasets, and GPU-aware autoscaling and monitoring
- Connect your workbench to S3-compatible object storage
  Create a connection, configure an S3 client, and list, read, write, and copy objects from notebooks (see the boto3 sketch after this list)
- Organize projects, collaborate in workbenches, and deploy models
  Organize projects, collaborate in workbenches, build notebooks, train and deploy models, and automate pipelines
- Use the Red Hat data science IDE images effectively
  Launch a workbench, pick an IDE, and develop with prebuilt images or custom environments
- Build, schedule, and track machine learning pipelines
  Define KFP-based pipelines, version and schedule runs, and track artifacts in S3-compatible storage (see the pipeline sketch after this list)
- Enable and manage connected applications from the OpenShift AI dashboard
  Enable applications, connect with keys, remove unused tiles, and access Jupyter from the dashboard
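Registering a model from code is a common complement to the dashboard workflow in the registry guide. The sketch below uses the Kubeflow model-registry Python client that the OpenShift AI registry builds on; the server address, token, and model details are illustrative assumptions, not values from this documentation.

```
# A minimal sketch of registering a model version with the model-registry
# Python client. Address, token, and model details are placeholders.
from model_registry import ModelRegistry

registry = ModelRegistry(
    server_address="https://my-registry.example.com",  # assumption: your registry route
    port=443,
    author="data-scientist@example.com",
    user_token="<token>",  # assumption: an auth token for the registry
)

model = registry.register_model(
    "fraud-detection",                      # model name in the registry
    "s3://models/fraud-detection/v1",       # where the artifact lives
    version="1.0.0",
    model_format_name="onnx",
    model_format_version="1",
    description="Baseline fraud model",
)
```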
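For the object storage item, a minimal boto3 sketch. It assumes the environment variables that an OpenShift AI data connection typically injects into a workbench (AWS_S3_ENDPOINT, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET); the file names are placeholders.

```
# A minimal sketch of listing, uploading, and downloading objects from a
# workbench notebook with boto3, using data-connection environment variables.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)
bucket = os.environ["AWS_S3_BUCKET"]

# List the objects in the bucket
for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(obj["Key"])

# Write, then read back, a file (placeholder names)
s3.upload_file("train.csv", bucket, "datasets/train.csv")
s3.download_file(bucket, "datasets/train.csv", "train-copy.csv")
```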
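And for the pipelines item, a minimal Kubeflow Pipelines (KFP) v2 sketch: one illustrative component chained into a two-step pipeline and compiled to the IR YAML that a pipeline server imports. Component logic and names are invented for the example.

```
# A minimal sketch of a KFP v2 pipeline; names and logic are illustrative.
from kfp import compiler, dsl

@dsl.component(base_image="python:3.11")
def add(a: int, b: int) -> int:
    return a + b

@dsl.pipeline(name="demo-pipeline")
def demo_pipeline(x: int = 1, y: int = 2):
    first = add(a=x, b=y)
    add(a=first.output, b=10)  # chain a second step on the first result

# Compile to a YAML package that can be imported and scheduled
compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
```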
Get started
- Get started with projects, workbenches, and pipelines in OpenShift AI
  Get set up to create projects, launch workbenches, and deploy your first model on OpenShift AI
Install
- Deploy and decommission OpenShift AI on your cluster
  Install via Operator or CLI, enable required components, verify the deployment, and cleanly uninstall when needed
- Deploy and decommission OpenShift AI in disconnected environments
  Install via Operator or CLI, enable required components, verify the deployment, and cleanly uninstall when needed
- Upgrades are not supported in OpenShift AI 3.0
  OpenShift AI 3.0 introduces significant changes and is a fast release. To ensure a smooth migration path from the 2.x stable stream (for example, 2.25) to the first stable 3.x release, upgrading from 2.x to 3.0 is not available
- Install Red Hat Enterprise Linux AI on bare metal and cloud
  Deploy Red Hat Enterprise Linux AI using the bootable container image on servers or in the cloud
- Deploy the AI Inference Server container with GPU/TPU acceleration
  Choose the container image for your accelerator, run the server, and confirm access to your GPUs/TPUs with a sample request (see the sketch after this list)
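The "sample request" step is a plain HTTP call to the server's OpenAI-compatible API. A minimal sketch, assuming the server is running locally on its default port and that the model name shown is the one you served:

```
# A minimal sketch of verifying a running inference server with one request.
# Host, port, and model name are assumptions for your deployment.
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "RedHatAI/granite-3.1-8b-instruct",  # assumption: your served model
        "prompt": "What is the capital of France?",
        "max_tokens": 32,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```

A successful completion confirms the server is up and the accelerator-backed model is answering.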
Train
- Customize models to build generative AI applications
  Customize AI models for your domain-specific use case, from setting up your development environment to building and deploying models for use in generative AI applications
Evaluate
- Evaluating AI systems
  Configure LMEvalJobs, select tasks, run evaluations, and retrieve metrics to compare model performance (see the sketch below)
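An LMEvalJob is a Kubernetes custom resource, so it can be created programmatically as well as with oc/kubectl. A hedged sketch with the Kubernetes Python client follows; the group/version and spec fields reflect the TrustyAI LMEvalJob CRD as commonly documented, and the model, task, and namespace are placeholders to verify against your installed CRD.

```
# A hedged sketch of creating an LMEvalJob via the Kubernetes Python client.
# Spec fields are assumptions; check them against your cluster's CRD.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

lmeval_job = {
    "apiVersion": "trustyai.opendatahub.io/v1alpha1",
    "kind": "LMEvalJob",
    "metadata": {"name": "arc-easy-eval", "namespace": "my-project"},
    "spec": {
        "model": "hf",  # evaluate a Hugging Face model
        "modelArgs": [{"name": "pretrained", "value": "google/flan-t5-base"}],
        "taskList": {"taskNames": ["arc_easy"]},  # lm-evaluation-harness task
        "logSamples": True,
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="trustyai.opendatahub.io",
    version="v1alpha1",
    namespace="my-project",
    plural="lmevaljobs",
    body=lmeval_job,
)
```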
Maintain Safety
- Ensuring AI safety with guardrails
  Orchestrate detectors to filter LLM inputs/outputs, auto-configure security, and expose guarded endpoints
Monitor
- Monitoring your AI systems
  Monitor model bias and data drift by configuring metrics, thresholds, and visualizations in OpenShift AI
Deploy
- Deploy large models using the single-model serving platform (KServe RawDeployment)
  Deploy models with KServe: choose RawDeployment or Knative, set resources and runtimes, and expose authenticated endpoints (see the sketch below)
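The object behind this workflow is a KServe InferenceService. A hedged sketch with the KServe Python SDK follows; the runtime name, model format, storage URI, and namespace are assumptions, and in OpenShift AI the same resource is usually created from the dashboard instead.

```
# A hedged sketch of a single-model InferenceService in RawDeployment mode,
# created with the kserve Python SDK. Names and URIs are placeholders.
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1ModelSpec,
    V1beta1ModelFormat,
)
from kubernetes.client import V1ObjectMeta

isvc = V1beta1InferenceService(
    api_version="serving.kserve.io/v1beta1",
    kind="InferenceService",
    metadata=V1ObjectMeta(
        name="my-llm",
        namespace="my-project",
        annotations={"serving.kserve.io/deploymentMode": "RawDeployment"},
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            model=V1beta1ModelSpec(
                model_format=V1beta1ModelFormat(name="vLLM"),
                storage_uri="oci://registry.example.com/models/my-llm:latest",
                runtime="vllm-runtime",  # assumption: a ServingRuntime in the project
            )
        )
    ),
)

KServeClient().create(isvc)
```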
Inference
- Get started with Red Hat Enterprise Linux AI for inference
  Get started with Red Hat Enterprise Linux AI 3, a generative AI inference platform for Linux environments that uses Red Hat AI Inference Server for running and optimizing models
- Deploy the AI Inference Server container with AI acceleration
  Choose the container image for your accelerator, run the server, and confirm access to your AI accelerators with a sample request
- Deploy the AI Inference Server on OpenShift with supported accelerators
  Install GPU operators, configure secrets and storage, deploy models, and expose secure inference endpoints
- Deploy the AI Inference Server in disconnected environments
  Mirror required images, configure registry and secrets, and deploy secure inference endpoints offline
- Package, deploy, and serve OCI model containers on OpenShift
  Package models as OCI images, push to a registry, deploy, and serve on GPUs
- Tune vLLM server settings to optimize model serving
  Choose and set key vLLM flags for parallelism, memory, batching, and networking to deploy reliable, performant endpoints (see the vLLM sketch after this list)
- Compress and optimize LLMs with the Red Hat AI Model Optimization Toolkit
  Use LLM Compressor to apply quantization or sparsity and prepare compressed models for deployment (see the compression sketch after this list)
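To make the vLLM tuning item concrete, a minimal sketch of the main knobs via vLLM's offline LLM API; the same options map to `vllm serve` flags (for example, --tensor-parallel-size). The model name and values are illustrative, not recommendations.

```
# A minimal sketch of key vLLM settings; values are illustrative only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="RedHatAI/granite-3.1-8b-instruct",  # assumption: your model
    tensor_parallel_size=2,        # shard the model across 2 GPUs
    gpu_memory_utilization=0.90,   # fraction of GPU memory for weights + KV cache
    max_model_len=8192,            # cap context length to bound KV-cache size
    max_num_seqs=64,               # max concurrent sequences per batch
)

outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```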
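And for the compression item, a hedged sketch of one-shot quantization with LLM Compressor (the llmcompressor package). The recipe, dataset, model, and even the import path may differ by release, so treat this as an outline rather than the toolkit's exact API.

```
# A hedged sketch of post-training W4A16 quantization with llmcompressor.
# Import path and arguments may vary by release; model/dataset are placeholders.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model="RedHatAI/granite-3.1-8b-instruct",  # assumption: model to compress
    dataset="open_platypus",                   # calibration dataset
    recipe=recipe,
    output_dir="granite-8b-w4a16",             # compressed weights for serving
    max_seq_length=2048,
    num_calibration_samples=512,
)
```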
Learn
- Red Hat AI Foundations
  Follow one of the no-cost learning paths tailored to business leaders and technology learners to boost AI skills and confidence while earning Credly certificates
- Red Hat AI learning hub
  Explore a curated collection of learning resources designed to help you accomplish key tasks with Red Hat AI products and services