Deploying Red Hat OpenShift sandboxed containers

OpenShift sandboxed containers 1.11

Enhanced security and isolation for container workloads

Red Hat Customer Content Services

Abstract

Red Hat OpenShift sandboxed containers provide enhanced security and isolation by running containerized applications in lightweight virtual machines. You install the OpenShift sandboxed containers Operator on an OpenShift Container Platform cluster. Then, you configure your workload pods to use the optional "kata" runtime.

Preface

Chapter 1. Overview

Learn about OpenShift sandboxed containers features and terminology. You must ensure that your OpenShift Container Platform environment is compatible.

Red Hat OpenShift sandboxed containers integrates Kata containers as an optional runtime, providing enhanced security and isolation by running containerized applications in lightweight virtual machines. This integration provides a more secure runtime environment for sensitive workloads without significant changes to existing OpenShift Container Platform workflows. This runtime supports containers in dedicated virtual machines (VMs), providing improved workload isolation.

1.1. Features

OpenShift sandboxed containers provides the following features:

Run privileged or untrusted workloads

You can safely run workloads that require specific privileges, without the risk of compromising cluster nodes by running privileged containers. Workloads that require special privileges include the following:

Workloads that require special capabilities from the kernel, beyond the default ones granted by standard container runtimes such as CRI-O, for example to access low-level networking features.
Workloads that need elevated root privileges, for example to access a specific physical device. With OpenShift sandboxed containers, you can pass only a specific device through to the virtual machine (VM), ensuring that the workload cannot access or misconfigure the rest of the system.
Workloads for installing or using set-uid root binaries. These binaries grant special privileges and, as such, can present a security risk. With OpenShift sandboxed containers, additional privileges are restricted to the virtual machines, and grant no special access to the cluster nodes.
Some workloads require privileges specifically for configuring the cluster nodes. Such workloads should still use privileged containers, because running on a virtual machine would prevent them from functioning.
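
As a minimal sketch of the first case in the list above, a pod might request an extra kernel capability while running in a sandbox. The pod name, container name, image, and command are placeholders, and the security context constraints of the cluster must still permit the capability. The kata value refers to the optional runtime that this document describes.

```yaml
# Hypothetical pod: names, image, and command are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: net-tools-sandboxed
spec:
  runtimeClassName: kata          # run the pod inside a lightweight VM
  containers:
    - name: net-tools
      image: registry.access.redhat.com/ubi9/ubi:latest
      command: ["sleep", "infinity"]
      securityContext:
        capabilities:
          add: ["NET_ADMIN"]      # granted inside the VM, against the guest kernel
```

Because the container runs against the guest kernel of the VM, the added capability does not extend to the kernel of the worker node.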

Ensure isolation for sensitive workloads

OpenShift sandboxed containers integrates Kata containers as an optional runtime that runs containerized applications in lightweight virtual machines (VMs). Each pod runs in a dedicated VM, which provides a more secure runtime environment for sensitive workloads without significant changes to existing OpenShift Container Platform workflows.

Ensure kernel isolation for each workload

You can run workloads that require custom kernel tuning (such as sysctl, scheduler changes, or cache tuning) and the creation of custom kernel modules (such as out of tree or special arguments).

Share the same workload across tenants

You can run workloads that support many users (tenants) from different organizations sharing the same OpenShift Container Platform cluster. The system also supports running third-party workloads from multiple vendors, such as container network functions (CNFs) and enterprise applications. Third-party CNFs, for example, may not want their custom settings interfering with packet tuning or with sysctl variables set by other applications. Running inside a completely isolated kernel is helpful in preventing "noisy neighbor" configuration problems.

Ensure proper isolation and sandboxing for testing software

You can run containerized workloads with known vulnerabilities or handle issues in an existing application. This isolation enables administrators to give developers administrative control over pods, which is useful when the developer wants to test or validate configurations beyond those an administrator would typically grant. Administrators can, for example, safely and securely delegate kernel packet filtering (eBPF) to developers. eBPF requires CAP_SYS_ADMIN or CAP_BPF privileges, and is therefore not allowed under a standard CRI-O configuration, because this would grant access to every process on the container host worker node. Similarly, administrators can grant access to intrusive tools such as SystemTap, or support the loading of custom kernel modules during their development.

Ensure default resource containment through VM boundaries

By default, OpenShift sandboxed containers manages resources such as CPU, memory, storage, and networking in a robust and secure way. Because OpenShift sandboxed containers runs workloads in VMs, the additional layers of isolation and security provide finer-grained access control to these resources. For example, an errant container cannot allocate more memory than is available to the VM. Conversely, a container that needs dedicated access to a network card or to a disk can take complete control over that device without getting any access to other devices.

1.2. Compatibility with OpenShift Container Platform

The required functionality for Red Hat OpenShift Container Platform is supported by two main components:

Kata runtime: The Kata runtime is included with Red Hat Enterprise Linux CoreOS (RHCOS) and receives updates with every OpenShift Container Platform release. When enabling peer pods with the Kata runtime, the OpenShift sandboxed containers Operator requires external network connectivity to pull the necessary image components and helper utilities to create the pod virtual machine (VM) image.
OpenShift sandboxed containers Operator: The OpenShift sandboxed containers Operator is a Rolling Stream Operator, which means the latest version is the only supported version. It works with all currently supported versions of OpenShift Container Platform.

The Operator depends on the features that come with the RHCOS host and the environment it runs in.

Note

You must install RHCOS on the worker nodes. Red Hat Enterprise Linux (RHEL) nodes are not supported.

The following compatibility matrix for OpenShift sandboxed containers and OpenShift Container Platform releases identifies compatible features and environments.

Table 1.1. Supported architectures

Architecture	OpenShift Container Platform version
x86_64	4.17 or later
s390x	4.17 or later

There are two ways to deploy the Kata containers runtime:

Bare metal
Peer pods

You can deploy OpenShift sandboxed containers by using peer pods on Microsoft Azure, AWS Cloud Computing Services, or Google Cloud. With the release of OpenShift sandboxed containers 1.11, the OpenShift sandboxed containers Operator requires OpenShift Container Platform version 4.17 or later.

Table 1.2. Feature availability by OpenShift Container Platform version

Feature	Platform	4.17 (4.17.45+)	4.18 (4.18.30+)	4.19 (4.19.20+)	4.20 (4.20.6+)
Confidential containers	Bare metal	—	—	—	TP
	Azure peer pods	GA	GA	GA	GA
	IBM Z peer pods	TP	TP	TP	TP
	IBM Z bare metal	—	—	—	TP
GPU support	Bare metal	—	—	—	—
	Azure	DP	DP	DP	DP
	AWS	DP	DP	DP	DP
	Google Cloud	DP	DP	DP	DP

GA: General Availability. TP: Technology Preview. DP: Developer Preview.

Important

GPU support for peer pods is a Developer Preview feature only. Developer Preview features are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Developer Preview features for production or business-critical workloads. Developer Preview features provide early access to upcoming product features in advance of their possible inclusion in a Red Hat product offering, enabling customers to test functionality and provide feedback during the development process. These features might not have any documentation, are subject to change or removal at any time, and testing is limited. Red Hat might provide ways to submit feedback on Developer Preview features without an associated SLA.

Table 1.3. Supported cloud platforms

Platform	GPU	Confidential containers
Azure	DP	GA
AWS	DP	—
Google Cloud	DP	—

Table 1.4. Supported on-premise platforms

Platform	GPU	Confidential containers
Bare metal	—	TP
IBM Z	—	TP

1.3. Common terms

The following terms are used throughout the documentation.

Sandbox

A sandbox is an isolated environment where programs can run. In a sandbox, you can run untested or untrusted programs without risking harm to the host machine or the operating system.

In the context of OpenShift sandboxed containers, sandboxing is achieved by running workloads in a different kernel using virtualization, providing enhanced control over the interactions between multiple workloads that run on the same host.

Pod

A pod is a construct that is inherited from Kubernetes and OpenShift Container Platform. It represents resources where containers can be deployed. Containers run inside pods, and pods are used to specify resources that can be shared between multiple containers.

In the context of OpenShift sandboxed containers, a pod is implemented as a virtual machine. Several containers can run in the same pod on the same virtual machine.

OpenShift sandboxed containers Operator

The OpenShift sandboxed containers Operator manages the lifecycle of sandboxed containers on a cluster. You can use the OpenShift sandboxed containers Operator to perform tasks such as the installation and removal of sandboxed containers, software updates, and status monitoring.

Kata containers

Kata containers is a core upstream project that is used to build OpenShift sandboxed containers. OpenShift sandboxed containers integrates Kata containers with OpenShift Container Platform.

KataConfig

KataConfig objects represent configurations of sandboxed containers. They store information about the state of the cluster, such as the nodes on which the software is deployed.

Runtime class

A RuntimeClass object describes the runtime that is used to run a given workload. The kata runtime class is installed and deployed by the OpenShift sandboxed containers Operator. The runtime class contains information about the runtime that describes resources that the runtime needs to operate, such as the pod overhead.
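
To make the relationship between a runtime class and pod overhead concrete, the following sketch shows the general shape of a Kubernetes RuntimeClass object. It is illustrative only: the Operator creates the real kata runtime class, and the handler name and overhead values shown here are assumptions, not values taken from the product.

```yaml
# Illustrative only: the Operator creates the real kata RuntimeClass,
# and its handler and overhead values may differ from these placeholders.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata                 # assumed name of the handler that the container engine invokes
overhead:
  podFixed:                   # resources accounted to each pod for the VM itself
    cpu: 250m                 # placeholder value
    memory: 350Mi             # placeholder value
```

The scheduler adds the podFixed values to the resource requests of every pod that uses the runtime class, so the cost of the VM is counted when pods are placed on nodes.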

Peer pod

A peer pod in OpenShift sandboxed containers extends the concept of a standard pod. Unlike a standard sandboxed container, where the virtual machine is created on the worker node itself, in a peer pod, the virtual machine is created through a remote hypervisor using any supported hypervisor or cloud provider API.

The peer pod acts as a regular pod on the worker node, with its corresponding VM running elsewhere. The remote location of the VM is transparent to the user and is specified by the runtime class in the pod specification. The peer pod design circumvents the need for nested virtualization.
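
To illustrate how the runtime class selects a peer pod, the following sketch shows a hypothetical Deployment whose pod template sets runtimeClassName to kata-remote, the peer pod runtime class named later in this document. The Deployment name, labels, image, and command are placeholders rather than values from the product.

```yaml
# Hypothetical Deployment: names, labels, image, and command are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: peer-pod-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: peer-pod-demo
  template:
    metadata:
      labels:
        app: peer-pod-demo
    spec:
      runtimeClassName: kata-remote   # VM is created through the remote hypervisor or cloud API
      containers:
        - name: app
          image: registry.access.redhat.com/ubi9/ubi:latest
          command: ["sleep", "infinity"]
```

Apart from the runtime class name, the manifest is the same as for a standard workload, which reflects the point that the remote location of the VM is transparent to the user.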

IBM Secure Execution

IBM Secure Execution for Linux is an advanced security feature introduced with IBM z15® and LinuxONE III. This feature extends the protection provided by pervasive encryption. IBM Secure Execution safeguards data at rest, in transit, and in use. It enables secure deployment of workloads and ensures data protection throughout its lifecycle. For more information, see Introducing IBM Secure Execution for Linux.

Confidential containers

Confidential containers protects containers and data by verifying that your workload is running in a Trusted Execution Environment (TEE). You can deploy this feature to safeguard the privacy of big data analytics and machine learning inferences.

Red Hat build of Trustee

The Red Hat build of Trustee is an attestation service that verifies the trustworthiness of the location where you plan to run your workload or where you plan to send confidential information. Red Hat build of Trustee includes components deployed on a trusted side and used to verify whether the remote workload is running in a Trusted Execution Environment (TEE). Red Hat build of Trustee is flexible and can be deployed in several different configurations to support a wide variety of applications and hardware platforms.

Red Hat build of Trustee Operator

The Red Hat build of Trustee Operator manages the installation, lifecycle, and configuration of Red Hat build of Trustee.

1.4. OpenShift sandboxed containers Operator

The OpenShift sandboxed containers Operator encapsulates all of the components from Kata containers. It manages installation, lifecycle, and configuration tasks.

The OpenShift sandboxed containers Operator is packaged in the Operator bundle format as two container images:

The bundle image contains metadata and is required to make the Operator ready for Operator Lifecycle Manager (OLM).
The second container image contains the actual controller that monitors and manages the KataConfig resource.

The OpenShift sandboxed containers Operator is based on the Red Hat Enterprise Linux CoreOS (RHCOS) extensions concept. RHCOS extensions are a mechanism to install optional OpenShift Container Platform software. The OpenShift sandboxed containers Operator uses this mechanism to deploy sandboxed containers on a cluster.

The sandboxed containers RHCOS extension contains RPMs for Kata, QEMU, and its dependencies. You can enable them by using the MachineConfig resources that the Machine Config Operator provides.

Additional resources

Adding extensions to RHCOS

1.5. OpenShift Virtualization

You can deploy OpenShift sandboxed containers on clusters with OpenShift Virtualization.

To run OpenShift Virtualization and OpenShift sandboxed containers at the same time, your virtual machines must be live migratable so that they do not block node reboots. See About live migration in the OpenShift Virtualization documentation for details.

1.6. FIPS compliance

OpenShift Container Platform is designed for Federal Information Processing Standards (FIPS) 140-2 and 140-3. When running Red Hat Enterprise Linux (RHEL) or Red Hat Enterprise Linux CoreOS (RHCOS) booted in FIPS mode, OpenShift Container Platform core components use the RHEL cryptographic libraries that have been submitted to NIST for FIPS 140-2/140-3 Validation on only the x86_64, ppc64le, and s390x architectures.

For more information about the NIST validation program, see Cryptographic Module Validation Program. For the latest NIST status for the individual versions of RHEL cryptographic libraries that have been submitted for validation, see Compliance Activities and Government Standards.

OpenShift sandboxed containers can be used on FIPS-enabled clusters.

When running in FIPS mode, OpenShift sandboxed containers components, VMs, and VM images are adapted to comply with FIPS.

Note

FIPS compliance for OpenShift sandboxed containers only applies to the kata runtime class. The peer pod runtime class, kata-remote, is not yet fully supported and has not been tested for FIPS compliance.

FIPS compliance is one of the most critical components required in highly secure environments, to ensure that only supported cryptographic technologies are allowed on nodes.

Important

The use of FIPS Validated / Modules in Process cryptographic libraries is only supported on OpenShift Container Platform deployments on the x86_64 architecture.

To understand Red Hat’s view of OpenShift Container Platform compliance frameworks, refer to the Risk Management and Regulatory Readiness chapter of the OpenShift Security Guide Book.

1.7. Providing feedback on Red Hat documentation

You can provide feedback or report an error by submitting the Create Issue form in Jira:

Ensure that you are logged in to Jira. If you do not have a Jira account, you must create a Red Hat Jira account.
Launch the Create Issue form.
Complete the Summary, Description, and Reporter fields.
In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue.
Click Create.

Chapter 2. Deploying OpenShift sandboxed containers on bare metal

You can deploy OpenShift sandboxed containers on bare metal.

2.1. Preparation

Review the following prerequisites and concepts before you deploy OpenShift sandboxed containers on bare metal.

2.1.1. Prerequisites

You have installed the latest version of Red Hat OpenShift Container Platform.
Your OpenShift Container Platform cluster has at least one worker node.

Additional resources

Installing OpenShift Container Platform on bare metal

2.1.2. Node eligibility checks

You can verify that your bare-metal cluster nodes support OpenShift sandboxed containers by running a node eligibility check. The most common reason for node ineligibility is lack of virtualization support. If you run sandboxed workloads on ineligible nodes, you will experience errors.

High-level workflow

Install the Node Feature Discovery Operator.
Create the NodeFeatureDiscovery custom resource (CR).
Enable node eligibility checks when you create the KataConfig CR. You can run node eligibility checks on all worker nodes or on selected nodes.

Additional resources

Installing the Node Feature Discovery Operator

2.1.3. Block volume support

OpenShift Container Platform can statically provision raw block volumes. These volumes do not have a file system, and can provide performance benefits for applications that either write to the disk directly or implement their own storage service.

You can use a local block device as persistent volume (PV) storage with OpenShift sandboxed containers. This block device can be provisioned by using the Local Storage Operator (LSO).

The Local Storage Operator is not installed in OpenShift Container Platform by default. See Installing the Local Storage Operator for installation instructions.

You can provision raw block volumes for OpenShift sandboxed containers by specifying volumeMode: Block in the PV specification.

Block volume example

apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "local-disks"
  namespace: "openshift-local-storage"
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-0
  storageClassDevices:
    - storageClassName: "local-sc"
      forceWipeDevicesAndDestroyAllData: false
      volumeMode: Block # 1
      devicePaths:
        - /path/to/device # 2

1: Set volumeMode to Block to indicate that this PV is a raw block volume.
2: Replace this value with the filepath to your LocalVolume resource by-id. PVs are created for these local disks when the provisioner is deployed successfully. You must also use this path to label the node that uses the block device when deploying OpenShift sandboxed containers.
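
Once the raw block PV exists, a workload consumes it through a claim that also uses volumeMode: Block and through volumeDevices in the container specification. The following is a minimal sketch under stated assumptions: the claim name, pod name, requested size, image, and device path inside the container are all hypothetical, and only the local-sc storage class comes from the example above.

```yaml
# Hypothetical claim and pod: names, size, image, and device path are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block               # request a raw block device, not a file system
  storageClassName: local-sc      # storage class from the LocalVolume example
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: block-consumer
spec:
  runtimeClassName: kata
  containers:
    - name: app
      image: registry.access.redhat.com/ubi9/ubi:latest
      command: ["sleep", "infinity"]
      volumeDevices:              # block volumes use volumeDevices, not volumeMounts
        - name: data
          devicePath: /dev/xvda   # where the device appears inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc
```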

2.2. Deployment overview

You deploy OpenShift sandboxed containers on bare metal by performing the following steps:

Install the OpenShift sandboxed containers Operator on the OpenShift Container Platform cluster.
Optional: Install the Node Feature Discovery (NFD) Operator to configure node eligibility checks.
Optional: Install the Local Storage Operator to configure a local block storage device.
Create the KataConfig custom resource.
Optional: Modify the number of virtual machines running on each worker node.
Optional: Modify the pod overhead.
Configure your workload for OpenShift sandboxed containers.

2.3. Installing the OpenShift sandboxed containers Operator

You install the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Prerequisites

You have access to the cluster as a user with the cluster-admin role.

Procedure

Create an osc-namespace.yaml manifest file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sandboxed-containers-operator

Create the namespace by running the following command:
```
$ oc apply -f osc-namespace.yaml
```

Create an osc-operatorgroup.yaml manifest file:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sandboxed-containers-operator-group
  namespace: openshift-sandboxed-containers-operator
spec:
  targetNamespaces:
  - openshift-sandboxed-containers-operator

Create the operator group by running the following command:
```
$ oc apply -f osc-operatorgroup.yaml
```

Create an osc-subscription.yaml manifest file:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sandboxed-containers-operator
  namespace: openshift-sandboxed-containers-operator
spec:
  channel: stable
  installPlanApproval: Automatic
  name: sandboxed-containers-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: sandboxed-containers-operator.v1.11.1

Create the subscription by running the following command:
```
$ oc create -f osc-subscription.yaml
```
Verify that the Operator is correctly installed by running the following command:
```
$ oc get csv -n openshift-sandboxed-containers-operator
```
This command can take several minutes to complete.

Watch the process by running the following command:

$ watch oc get csv -n openshift-sandboxed-containers-operator

Example output

NAME                             DISPLAY                                   VERSION   REPLACES   PHASE
openshift-sandboxed-containers   openshift-sandboxed-containers-operator   1.11.1    1.11.0     Succeeded

2.4. Optional configurations

You can configure the following options after you install the OpenShift sandboxed containers Operator.

2.4.1. Creating a NodeFeatureDiscovery custom resource

You create a NodeFeatureDiscovery custom resource (CR) to define the configuration parameters that the Node Feature Discovery (NFD) Operator checks to determine that the worker nodes can support OpenShift sandboxed containers.

Note

To install the kata runtime on only selected worker nodes that you know are eligible, apply the feature.node.kubernetes.io/runtime.kata=true label to the selected nodes and set checkNodeEligibility: true in the KataConfig CR.

To install the kata runtime on all worker nodes, set checkNodeEligibility: false in the KataConfig CR.

In both these scenarios, you do not need to create the NodeFeatureDiscovery CR. You should only apply the feature.node.kubernetes.io/runtime.kata=true label manually if you are sure that the node is eligible to run OpenShift sandboxed containers.
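
The eligibility setting described in this note lives in the KataConfig CR. The following minimal sketch shows where the field sits, assuming the kataconfiguration.openshift.io/v1 API group; the metadata name is a placeholder.

```yaml
# Sketch only: the metadata name is a placeholder.
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  checkNodeEligibility: true      # install kata only on nodes that carry the runtime.kata label
```

Setting the field to false instead corresponds to the second scenario, in which the kata runtime is installed on all worker nodes.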

The following procedure applies the feature.node.kubernetes.io/runtime.kata=true label to all eligible nodes and configures the KataConfig resource to check for node eligibility.

Prerequisites

You have installed the NFD Operator. For more information, see Node Feature Discovery Operator in the OpenShift Container Platform documentation.

Procedure

Create a my-nfd.yaml manifest file according to the following example:

apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  operand:
    image: registry.redhat.io/openshift4/ose-node-feature-discovery-rhel9:v4.20
    imagePullPolicy: Always
    servicePort: 12000
  workerConfig:
    configData: |
      sources:
        custom:
          - name: "feature.node.kubernetes.io/runtime.kata"
            matchOn:
              - cpuId: ["SSE4", "VMX"]
                loadedKMod: ["kvm", "kvm_intel"]
              - cpuId: ["SSE4", "SVM"]
                loadedKMod: ["kvm", "kvm_amd"]

Create the NodeFeatureDiscovery CR:
```
$ oc create -f my-nfd.yaml
```
The NodeFeatureDiscovery CR applies the feature.node.kubernetes.io/runtime.kata=true label to all qualifying worker nodes.

Verification

Verify that qualifying nodes in the cluster have the correct label applied:

$ oc get nodes --selector='feature.node.kubernetes.io/runtime.kata=true'

Example output

NAME                           STATUS                     ROLES    AGE     VERSION
compute-3.example.com          Ready                      worker   4h38m   v1.25.0
compute-2.example.com          Ready                      worker   4h35m   v1.25.0

2.4.2. Creating the NodeFeatureRule custom resource

You can use NodeFeatureRule objects to automatically apply custom labels and taints to your nodes based on specific rules. This helps you set up node identification and scheduling for your applications or hardware, ensuring the best placement of workloads and efficient use of resources.

Procedure

Create a custom resource manifest named my-nodefeaturerule.yaml:

apiVersion: nfd.openshift.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: consolidated-hardware-features
  namespace: openshift-nfd
spec:
  rules:
    - name: "runtime.kata"
      labels:
        feature.node.kubernetes.io/runtime.kata: "true"
      matchAny:
        - matchFeatures:
            - feature: cpu.cpuid
              matchExpressions:
                SSE42: { op: Exists }
                VMX: { op: Exists }
            - feature: kernel.loadedmodule
              matchExpressions:
                kvm: { op: Exists }
                kvm_intel: { op: Exists }
        - matchFeatures:
            - feature: cpu.cpuid
              matchExpressions:
                SSE42: { op: Exists }
                SVM: { op: Exists }
            - feature: kernel.loadedmodule
              matchExpressions:
                kvm: { op: Exists }
                kvm_amd: { op: Exists }

    - name: "amd.sev-snp"
      labels:
        amd.feature.node.kubernetes.io/snp: "true"
      extendedResources:
        sev-snp.amd.com/esids: "@cpu.security.sev.encrypted_state_ids"
      matchFeatures:
        - feature: cpu.cpuid
          matchExpressions:
            SVM: { op: Exists }
        - feature: cpu.security
          matchExpressions:
            sev.snp.enabled: { op: Exists }

    - name: "intel.sgx"
      labels:
        intel.feature.node.kubernetes.io/sgx: "true"
      extendedResources:
        sgx.intel.com/epc: "@cpu.security.sgx.epc"
      matchFeatures:
        - feature: cpu.cpuid
          matchExpressions:
            SGX: { op: Exists }
            SGXLC: { op: Exists }
        - feature: cpu.security
          matchExpressions:
            sgx.enabled: { op: IsTrue }
        - feature: kernel.config
          matchExpressions:
            X86_SGX: { op: Exists }

    - name: "intel.tdx"
      labels:
        intel.feature.node.kubernetes.io/tdx: "true"
      extendedResources:
        tdx.intel.com/keys: "@cpu.security.tdx.total_keys"
      matchFeatures:
        - feature: cpu.cpuid
          matchExpressions:
            VMX: { op: Exists }
        - feature: cpu.security
          matchExpressions:
            tdx.enabled: { op: Exists }

    - name: "ibm.se.enabled"
      labels:
        ibm.feature.node.kubernetes.io/se: "true"
      matchFeatures:
        - feature: cpu.security
          matchExpressions:
            se.enabled: { op: IsTrue }

Create the NodeFeatureRule CR by running the following command:
```
$ oc create -f my-nodefeaturerule.yaml
```

Note

A relabeling delay of up to 1 minute might occur.

2.4.3. Provisioning local block volumes

You can use local block volumes with OpenShift sandboxed containers. You must first provision the local block volumes by using the Local Storage Operator (LSO). Then you must enable the nodes with the local block volumes to run OpenShift sandboxed containers workloads.

The local volume provisioner looks for block volume devices at the paths specified in the defined resource.

Prerequisites

You have installed the Local Storage Operator.
You have a local disk that meets the following conditions:
- It is attached to a node.
- It is not mounted.
- It does not contain partitions.

Procedure

Create the local volume resource. This resource must define the nodes and paths to the local volumes.
Note
Do not use different storage class names for the same device. Doing so creates multiple persistent volumes (PVs).
Example: Block
```
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "local-disks"
  namespace: "openshift-local-storage" # 1
spec:
  nodeSelector: # 2
    nodeSelectorTerms:
    - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ip-10-0-136-143
          - ip-10-0-140-255
          - ip-10-0-144-180
  storageClassDevices:
    - storageClassName: "local-sc" # 3
      forceWipeDevicesAndDestroyAllData: false # 4
      volumeMode: Block
      devicePaths: # 5
        - /path/to/device # 6
```
1: The namespace where the Local Storage Operator is installed.
2: Optional: A node selector containing a list of nodes where the local storage volumes are attached. This example uses the node hostnames, obtained from oc get node. If a value is not defined, the Local Storage Operator attempts to find matching disks on all available nodes.
3: The name of the storage class to use when creating persistent volume objects.
4: This setting defines whether to call wipefs, which removes partition table signatures (magic strings) and makes the disk ready to use for Local Storage Operator provisioning. No data other than signatures is erased. The default is "false" (wipefs is not invoked). Setting forceWipeDevicesAndDestroyAllData to "true" can be useful when previous data can remain on disks that need to be reused. In these scenarios, setting this field to true eliminates the need for administrators to erase the disks manually.
5: The path containing a list of local storage devices to choose from. You must use this path when enabling a node with a local block device to run OpenShift sandboxed containers workloads.
6: Replace this value with the filepath to your LocalVolume resource by-id, such as /dev/disk/by-id/wwn. PVs are created for these local disks when the provisioner is deployed successfully.
Create the local volume resource in your OpenShift Container Platform cluster. Specify the file you just created:
```
$ oc create -f <local-volume>.yaml
```

Verify that the provisioner was created and that the corresponding daemon sets were created:

$ oc get all -n openshift-local-storage

Example output

NAME                                          READY   STATUS    RESTARTS   AGE
pod/diskmaker-manager-9wzms                   1/1     Running   0          5m43s
pod/diskmaker-manager-jgvjp                   1/1     Running   0          5m43s
pod/diskmaker-manager-tbdsj                   1/1     Running   0          5m43s
pod/local-storage-operator-7db4bd9f79-t6k87   1/1     Running   0          14m

NAME                                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/local-storage-operator-metrics   ClusterIP   172.30.135.36   <none>        8383/TCP,8686/TCP   14m

NAME                               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/diskmaker-manager   3         3         3       3            3           <none>          5m43s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/local-storage-operator   1/1     1            1           14m

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/local-storage-operator-7db4bd9f79   1         1         1       14m

Note the desired and current number of daemon set processes. A desired count of 0 indicates that the label selectors were invalid.

Verify that the persistent volumes were created:

$ oc get pv

Example output

NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
local-pv-1cec77cf   100Gi      RWO            Delete           Available           local-sc                88m
local-pv-2ef7cd2a   100Gi      RWO            Delete           Available           local-sc                82m
local-pv-3fa1c73    100Gi      RWO            Delete           Available           local-sc                48m

Important

Editing the LocalVolume object does not change existing persistent volumes because doing so might result in a destructive operation.

Additional resources

Persistent storage using local volumes

2.4.4. Enabling nodes to use a local block device

You can configure nodes with a local block device to run OpenShift sandboxed containers workloads at the paths specified in the defined volume resource.

Prerequisites

You provisioned a block device using the Local Storage Operator (LSO).

Procedure

Enable each node with a local block device to run OpenShift sandboxed containers workloads by running the following command:
```
$ oc debug node/worker-0 -- chcon -vt container_file_t /host/path/to/device
```
The /path/to/device must be the same path you defined when creating the local storage resource.
Example output
```
system_u:object_r:container_file_t:s0 /host/path/to/device
```

Additional resources

Provisioning local volumes by using the Local Storage Operator

2.5. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

A large OpenShift Container Platform deployment with a greater number of worker nodes.
Activation of the BIOS and Diagnostics utility.
Deployment on a hard disk drive rather than an SSD.
Deployment on physical nodes such as bare metal, rather than on virtual nodes.
A slow CPU and network.

Procedure

Create an example-kataconfig.yaml manifest file according to the following example:
```
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  enablePeerPods: false
  checkNodeEligibility: false
  logLevel: info
#  kataConfigPoolSelector:
#    matchLabels:
#      <label_key>: '<label_value>'
```
<label_key>: '<label_value>'
Optional: If you have applied node labels to install kata on specific nodes, specify the key and value, for example, kata: 'true'.
Create the KataConfig CR by running the following command:
```
$ oc create -f example-kataconfig.yaml
```
The new KataConfig CR is created and installs kata as a runtime class on the worker nodes.
Wait for the kata installation to complete and the worker nodes to reboot before verifying the installation.
Monitor the installation progress by running the following command:
```
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"
```
When the status of all workers under kataNodes is installed and the condition InProgress is False without specifying a reason, kata is installed on the cluster.
Verify the runtime classes by running the following command:
```
$ oc get runtimeclass
```
Example output
```
NAME   HANDLER   AGE
kata   kata      34m
```
The kata runtime class is listed, which confirms that the runtime is available on the cluster.
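
The monitoring command in this procedure relies on a sed range address to show only the status portion of the `oc describe` output. The following sketch runs the same expression against a hypothetical sample. The sample text only mimics the general layout of a describe listing; its field names and values are assumptions, not real cluster output.

```shell
# Hypothetical sample that only mimics the layout of `oc describe kataconfig`.
sample_describe_output() {
  cat <<'EOF'
Name:         example-kataconfig
Spec:
  Log Level:  info
Status:
  Kata Nodes:
    Installed:
      worker-0
Events:  <none>
EOF
}

# Same sed expression as in the procedure: -n suppresses default printing,
# and the range /^Status:/,/^Events/ prints from the line that starts with
# "Status:" through the next line that starts with "Events".
status_section() {
  sed -n '/^Status:/,/^Events/p'
}

sample_describe_output | status_section
```

Everything before the Status line is dropped, so the repeated `watch` refresh shows only the part of the output that changes during installation.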

2.6. Modifying pod overhead

Pod overhead describes the amount of system resources that a pod on a node uses. You can modify the pod overhead by changing the spec.overhead field for a RuntimeClass custom resource. For example, if the configuration that you run for your containers consumes more than 350Mi of memory for the QEMU process and guest kernel data, you can alter the RuntimeClass overhead to suit your needs.

When performing any kind of file system I/O in the guest, file buffers are allocated in the guest kernel. The file buffers are also mapped in the QEMU process on the host, as well as in the virtiofsd process.

For example, if you use 300Mi of file buffer cache in the guest, both QEMU and virtiofsd appear to use 300Mi additional memory. However, the same memory is being used in all three cases. Therefore, the total memory usage is only 300Mi, mapped in three different places. This is correctly accounted for when reporting the memory utilization metrics.

Note

The default values are supported by Red Hat. Changing default overhead values is not supported and can result in technical issues.

Procedure

Obtain the RuntimeClass object by running the following command:
```
$ oc describe runtimeclass kata
```

Update the overhead.podFixed.memory and cpu values and save the file as runtimeclass.yaml:

kind: RuntimeClass
apiVersion: node.k8s.io/v1
metadata:
  name: kata
handler: kata
overhead:
  podFixed:
    memory: "500Mi"
    cpu: "500m"

Apply the changes by running the following command:
```
$ oc apply -f runtimeclass.yaml
```

2.7. Configuring your workload for OpenShift sandboxed containers

You configure your workload for OpenShift sandboxed containers by setting kata as the runtime class for the following pod-templated objects:

Pod objects
ReplicaSet objects
ReplicationController objects
StatefulSet objects
Deployment objects
DeploymentConfig objects

Important

Do not deploy workloads in an Operator namespace. Create a dedicated namespace for these resources.

Prerequisites

You have created the KataConfig custom resource (CR).

Procedure

Add spec.runtimeClassName: kata to the manifest of each pod-templated workload object as in the following example:
```
apiVersion: v1
kind: <object>
# ...
spec:
  runtimeClassName: kata
# ...
```
Apply the changes to the workload object by running the following command:
```
$ oc apply -f <object.yaml>
```
OpenShift Container Platform creates the workload object and begins scheduling it.

Verification

Inspect the spec.runtimeClassName field of a pod-templated object. If the value is kata, then the workload is running on OpenShift sandboxed containers.

Chapter 3. Deploying OpenShift sandboxed containers on AWS

You can deploy OpenShift sandboxed containers on AWS Cloud Computing Services.

Important

Red Hat OpenShift sandboxed containers on AWS Cloud Computing Services is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

3.1. Preparation

Review the following prerequisites and concepts before you deploy OpenShift sandboxed containers on AWS.

3.1.1. Prerequisites

You have installed the latest version of Red Hat OpenShift Container Platform.
Your OpenShift Container Platform cluster has at least one worker node.
You have enabled ports 15150 and 9000 for communication in the subnet used for worker nodes and the pod virtual machine (VM). The ports enable communication between the Kata shim running on the worker node and the Kata agent running on the pod VM.

Additional resources

Installing OpenShift Container Platform on AWS

3.1.2. Peer pod resource requirements

You must ensure that your cluster has sufficient resources.

Peer pod virtual machines (VMs) require resources in two locations:

The worker node. The worker node stores metadata, Kata shim resources (containerd-shim-kata-v2), remote-hypervisor resources (cloud-api-adaptor), and the tunnel setup between the worker nodes and the peer pod VM.
The cloud instance. This is the actual peer pod VM running in the cloud.

The CPU and memory resources used in the Kubernetes worker node are handled by the pod overhead included in the RuntimeClass (kata-remote) definition used for creating peer pods.

The total number of peer pod VMs running in the cloud is defined as Kubernetes Node extended resources. This limit is per node and is set by the PEERPODS_LIMIT_PER_NODE attribute in the peer-pods-cm config map.

The extended resource is named kata.peerpods.io/vm, and enables the Kubernetes scheduler to handle capacity tracking and accounting.

You can edit the limit per node based on the requirements for your environment after you install the OpenShift sandboxed containers Operator.

A mutating webhook adds the extended resource kata.peerpods.io/vm to the pod specification. It also removes any resource-specific entries from the pod specification, if present. This enables the Kubernetes scheduler to account for these extended resources, ensuring the peer pod is only scheduled when resources are available.

The mutating webhook modifies a Kubernetes pod as follows:

The mutating webhook checks the pod for the expected RuntimeClassName value, specified in the TARGET_RUNTIMECLASS environment variable. If the value in the pod specification does not match the value of TARGET_RUNTIMECLASS, the webhook exits without modifying the pod.
If the RuntimeClassName values match, the webhook makes the following changes to the pod spec:
1. The webhook removes every resource specification from the resources field of all containers and init containers in the pod.
2. The webhook adds the extended resource (kata.peerpods.io/vm) to the spec by modifying the resources field of the first container in the pod. The extended resource kata.peerpods.io/vm is used by the Kubernetes scheduler for accounting purposes.

Note

The mutating webhook excludes specific system namespaces in OpenShift Container Platform from mutation. If a peer pod is created in those system namespaces, then resource accounting using Kubernetes extended resources does not work unless the pod spec includes the extended resource.

As a best practice, define a cluster-wide policy to only allow peer pod creation in specific namespaces.

3.2. Deployment overview

You deploy OpenShift sandboxed containers on AWS by performing the following steps:

Install the OpenShift sandboxed containers Operator on the OpenShift Container Platform cluster.
Enable ports to allow internal communication with peer pods.
Create the peer pods config map.
Create the KataConfig custom resource.
Optional: Modify the number of virtual machines running on each worker node.
Disable insecure options by customizing the Kata Agent policy.
Optional: If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.
Optional: You can select a custom peer pod VM image.
Configure your workload for OpenShift sandboxed containers.

3.3. Installing the OpenShift sandboxed containers Operator

You install the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Prerequisites

You have access to the cluster as a user with the cluster-admin role.

Procedure

Create an osc-namespace.yaml manifest file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sandboxed-containers-operator

Create the namespace by running the following command:
```
$ oc apply -f osc-namespace.yaml
```

Create an osc-operatorgroup.yaml manifest file:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sandboxed-containers-operator-group
  namespace: openshift-sandboxed-containers-operator
spec:
  targetNamespaces:
  - openshift-sandboxed-containers-operator

Create the operator group by running the following command:
```
$ oc apply -f osc-operatorgroup.yaml
```

Create an osc-subscription.yaml manifest file:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sandboxed-containers-operator
  namespace: openshift-sandboxed-containers-operator
spec:
  channel: stable
  installPlanApproval: Automatic
  name: sandboxed-containers-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: sandboxed-containers-operator.v1.11.1

Create the subscription by running the following command:
```
$ oc create -f osc-subscription.yaml
```
Verify that the Operator is correctly installed by running the following command:
```
$ oc get csv -n openshift-sandboxed-containers-operator
```
This command can take several minutes to complete.

Watch the process by running the following command:

$ watch oc get csv -n openshift-sandboxed-containers-operator

Example output

NAME                             DISPLAY                                   VERSION   REPLACES   PHASE
openshift-sandboxed-containers   openshift-sandboxed-containers-operator   1.11.1    1.11.0     Succeeded

3.4. Enabling ports for AWS

You must enable ports 15150 and 9000 to allow internal communication with peer pods running on AWS.

Prerequisites

You have installed the OpenShift sandboxed containers Operator.
You have installed the AWS command line tool.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Retrieve the instance ID:

$ INSTANCE_ID=$(oc get nodes -l 'node-role.kubernetes.io/worker' \
  -o jsonpath='{.items[0].spec.providerID}' | sed 's#[^ ]*/##g')

Retrieve the AWS region:

$ AWS_REGION=$(oc get infrastructure/cluster -o jsonpath='{.status.platformStatus.aws.region}')

Retrieve the security group IDs and store them in an array:

$ AWS_SG_IDS=($(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
  --query 'Reservations[*].Instances[*].SecurityGroups[*].GroupId' \
  --output text --region $AWS_REGION))

For each security group ID, authorize the peer pods shim to access kata-agent communication, and set up the peer pods tunnel:

$ for AWS_SG_ID in "${AWS_SG_IDS[@]}"; do \
  aws ec2 authorize-security-group-ingress --group-id $AWS_SG_ID --protocol tcp --port 15150 --source-group $AWS_SG_ID --region $AWS_REGION; \
  aws ec2 authorize-security-group-ingress --group-id $AWS_SG_ID --protocol tcp --port 9000 --source-group $AWS_SG_ID --region $AWS_REGION; \
done

The ports are now enabled.
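
The first command in this procedure derives the instance ID from the providerID of a worker node. The following sketch applies the same sed expression to a hypothetical providerID so that the effect is visible without cluster access. The sample value and the function name are assumptions for illustration.

```shell
# Hypothetical providerID; on a cluster the value comes from
# `oc get nodes -o jsonpath='{.items[0].spec.providerID}'`.
sample_provider_id='aws:///us-east-1a/i-0123456789abcdef0'

# Same sed expression as in the procedure: delete the longest run of
# non-space characters that ends in "/", which leaves the last path segment.
extract_instance_id() {
  printf '%s\n' "$1" | sed 's#[^ ]*/##g'
}

extract_instance_id "$sample_provider_id"
```

Because the pattern is greedy, it removes the scheme, the empty host, and the availability zone in one substitution, leaving only the instance ID.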

3.5. Creating the peer pods config map

You must create the peer pods config map.

Prerequisites

You have an Amazon Machine Image (AMI) ID if you are not using the default AMI ID based on your cluster credentials.

Procedure

Obtain the following values from your AWS instance:

Retrieve and record the instance ID:

$ INSTANCE_ID=$(oc get nodes -l 'node-role.kubernetes.io/worker' \
  -o jsonpath='{.items[0].spec.providerID}' | sed 's#[^ ]*/##g')

The instance ID is used to retrieve the other values for the config map.

Retrieve and record the AWS region:

$ AWS_REGION=$(oc get infrastructure/cluster \
  -o jsonpath='{.status.platformStatus.aws.region}') \
  && echo "AWS_REGION: \"$AWS_REGION\""

Retrieve and record the AWS subnet ID:

$ AWS_SUBNET_ID=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
  --query 'Reservations[*].Instances[*].SubnetId' --region ${AWS_REGION} \
    --output text) && echo "AWS_SUBNET_ID: \"$AWS_SUBNET_ID\""

Retrieve and record the AWS VPC ID:

$ AWS_VPC_ID=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
  --query 'Reservations[*].Instances[*].VpcId' --region ${AWS_REGION} \
    --output text) && echo "AWS_VPC_ID: \"$AWS_VPC_ID\""

Retrieve and record the AWS security group IDs:

$ AWS_SG_IDS=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} \
  --query 'Reservations[*].Instances[*].SecurityGroups[*].GroupId' \
  --region  $AWS_REGION --output json | jq -r '.[][][]' | paste -sd ",") \
    && echo "AWS_SG_IDS: \"$AWS_SG_IDS\""
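
The last command flattens the security group IDs into the comma-separated string that the config map expects. As a minimal sketch of that final stage, the following example joins hypothetical IDs with paste. The IDs and function names are placeholders, and the explicit `-` operand is an addition here so that the snippet reads standard input on both GNU and BSD versions of paste.

```shell
# Hypothetical security group IDs, one per line, standing in for the
# output of the jq stage in the command above.
sample_sg_ids() {
  printf '%s\n' 'sg-0aaa1111' 'sg-0bbb2222'
}

# paste -s joins all input lines into a single line, and -d "," sets the
# separator that is placed between the joined values.
join_with_commas() {
  paste -sd "," -
}

sample_sg_ids | join_with_commas
```

The joined string can be used as the AWS_SG_IDS value in the config map.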

Create a peer-pods-cm.yaml manifest file according to the following example:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "aws"
  VXLAN_PORT: "9000"
  PROXY_TIMEOUT: "5m"
  PODVM_INSTANCE_TYPE: "t3.medium"
  PODVM_INSTANCE_TYPES: "t2.small,t2.medium,t3.large"
  PODVM_AMI_ID: "<podvm_ami_id>"
  AWS_REGION: "<aws_region>"
  AWS_SUBNET_ID: "<aws_subnet_id>"
  AWS_VPC_ID: "<aws_vpc_id>"
  AWS_SG_IDS: "<aws_sg_ids>"
  TAGS: "key1=value1,key2=value2"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
```
PODVM_INSTANCE_TYPE
Defines the default instance type that is used if the instance type is not defined in the workload object.
PODVM_INSTANCE_TYPES
Specify the allowed instance types, without spaces, for creating the pod. You can define smaller instance types for workloads that need less memory and fewer CPUs or larger instance types for larger workloads.
PODVM_AMI_ID
This value is populated when you create the KataConfig CR, using an AMI ID based on your cluster credentials. If you create your own AMI, specify the correct AMI ID.
TAGS
You can configure custom tags as comma-separated key=value pairs, as shown in the example, for pod VM instances to track peer pod costs or to identify peer pods in different clusters.
PEERPODS_LIMIT_PER_NODE
You can increase this value to run more peer pods on a node. The default value is 10.
ROOT_VOLUME_SIZE
You can increase this value for pods with larger container images. Specify the root volume size in gigabytes for the pod VM. The default and minimum size is 6 GB.
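
Two of the constraints described above lend themselves to a quick local check before you create the config map: the instance type list must not contain spaces, and the root volume size has a minimum of 6 GB. The following helpers are a hypothetical sketch and are not part of any product tooling. The function names are assumptions.

```shell
# Succeeds when the value is a comma-separated list with no spaces and no
# empty entries, for example "t2.small,t2.medium,t3.large".
valid_instance_types() {
  case "$1" in
    ''|*" "*|,*|*,|*,,*) return 1 ;;
    *) return 0 ;;
  esac
}

# Succeeds when the value is a whole number of gigabytes that is at least 6,
# the documented minimum root volume size.
valid_root_volume_size() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 6 ]
}

valid_instance_types "t2.small,t2.medium,t3.large" && echo "instance types: ok"
valid_root_volume_size "6" && echo "root volume size: ok"
```

These checks cover only the formatting rules stated in this section. They do not confirm that an instance type exists in your AWS region.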
Create the config map by running the following command:
```
$ oc create -f peer-pods-cm.yaml
```

3.6. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

A large OpenShift Container Platform deployment with a greater number of worker nodes.
Activation of the BIOS and Diagnostics utility.
Deployment on a hard disk drive rather than an SSD.
Deployment on physical nodes such as bare metal, rather than on virtual nodes.
A slow CPU and network.

Procedure

Create an example-kataconfig.yaml manifest file according to the following example:
```
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  enablePeerPods: true
  logLevel: info
#  kataConfigPoolSelector:
#    matchLabels:
#      <label_key>: '<label_value>'
```
<label_key>: '<label_value>'
Optional: If you have applied node labels to install kata-remote on specific nodes, specify the key and value, for example, kata-remote: 'true'.
Create the KataConfig CR by running the following command:
```
$ oc create -f example-kataconfig.yaml
```
The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.
Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.
Monitor the installation progress by running the following command:
```
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"
```
When the status of all workers under kataNodes is installed and the condition InProgress is False without specifying a reason, kata-remote is installed on the cluster.
Verify the daemon set by running the following command:
```
$ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
```
Verify the runtime classes by running the following command:
```
$ oc get runtimeclass
```
Example output
```
NAME          HANDLER       AGE
kata          kata          34m
kata-remote   kata-remote   152m
```
You can also see the default kata runtime class in addition to kata-remote.

3.7. Modifying the number of peer pod VMs per node

You can modify the limit of peer pod virtual machines (VMs) per node by editing the peerpodConfig custom resource (CR).

Procedure

Check the current limit by running the following command:

$ oc get peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  -o jsonpath='{.spec.limit}{"\n"}'

Specify a new value for the limit key by running the following command:

$ oc patch peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  --type merge --patch '{"spec":{"limit":"<value>"}}'
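
If you script this change, it can help to build the patch body separately so that a malformed limit is caught before the patch reaches the cluster. The following is a minimal sketch under that assumption. The helper name `build_limit_patch` is hypothetical, and the value is quoted as a string to match the patch shown above.

```shell
# Hypothetical helper: print the merge patch body for a given limit, or
# fail if the limit is not a non-negative whole number.
build_limit_patch() {
  case "$1" in
    ''|*[!0-9]*)
      echo "limit must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"spec":{"limit":"%s"}}\n' "$1"
}

build_limit_patch 20
```

The printed string could then be passed to the command above, for example with `--patch "$(build_limit_patch 20)"`.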

3.8. Verifying the pod VM image

After kata-remote is installed on your cluster, the OpenShift sandboxed containers Operator creates a pod VM image, which is used to create peer pods. This process can take a long time because the image is created on the cloud instance. You can verify that the pod VM image was created successfully by checking the config map that you created for the cloud provider.

Procedure

Obtain the config map you created for the peer pods:

$ oc get configmap peer-pods-cm -n openshift-sandboxed-containers-operator -o yaml

Check the data stanza of the YAML output.
If the PODVM_AMI_ID parameter is populated, the pod VM image was created successfully.

Troubleshooting

Retrieve the events log by running the following command:

$ oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-creation

Retrieve the job log by running the following command:

$ oc logs -n openshift-sandboxed-containers-operator jobs/osc-podvm-image-creation

If you cannot resolve the issue, submit a Red Hat Support case and attach the output of both logs.

3.9. Customizing the Kata Agent policy

You can customize the Kata Agent policy to override the permissive default policy. The Kata Agent policy is a security mechanism that controls API requests for peer pods.

Important

You must override the default policy in a production environment.

As a minimum requirement, you must disable ExecProcessRequest to prevent a cluster administrator from accessing sensitive data by running the oc exec command on a peer pod.

You can use the default policy in development and test environments where security is not a concern, for example, in an environment where the control plane can be trusted.

A custom policy replaces the default policy entirely. To modify specific APIs, include the full policy and adjust the relevant rules.

Procedure

Create a custom policy.rego file by modifying the default policy:

package agent_policy

default AddARPNeighborsRequest := true
default AddSwapRequest := true
default CloseStdinRequest := true
default CopyFileRequest := true
default CreateContainerRequest := true
default CreateSandboxRequest := true
default DestroySandboxRequest := true
default GetMetricsRequest := true
default GetOOMEventRequest := true
default GuestDetailsRequest := true
default ListInterfacesRequest := true
default ListRoutesRequest := true
default MemHotplugByProbeRequest := true
default OnlineCPUMemRequest := true
default PauseContainerRequest := true
default PullImageRequest := true
default ReadStreamRequest := false
default RemoveContainerRequest := true
default RemoveStaleVirtiofsShareMountsRequest := true
default ReseedRandomDevRequest := true
default ResumeContainerRequest := true
default SetGuestDateTimeRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StartTracingRequest := true
default StatsContainerRequest := true
default StopTracingRequest := true
default TtyWinResizeRequest := true
default UpdateContainerRequest := true
default UpdateEphemeralMountsRequest := true
default UpdateInterfaceRequest := true
default UpdateRoutesRequest := true
default WaitProcessRequest := true
default ExecProcessRequest := false
default SetPolicyRequest := false
default WriteStreamRequest := false

ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
  "allowed_commands": [
        "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status"
  ]
}

The default policy allows all API calls. Adjust the true or false values to customize the policy further based on your needs.

Convert the policy.rego file to a Base64-encoded string by running the following command:
```
$ base64 -w0 policy.rego
```
Record the output.

Add the Base64-encoded policy string to the my-pod.yaml manifest:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  annotations:
    io.katacontainers.config.agent.policy: <base64_encoded_policy>
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <container_name>
    image: registry.access.redhat.com/ubi9/ubi:latest
    command:
    - sleep
    - "36000"
    securityContext:
      privileged: false
      seccompProfile:
        type: RuntimeDefault

Create the pod by running the following command:
```
$ oc create -f my-pod.yaml
```
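
Before you paste the encoded policy into the annotation, you can confirm locally that the encoding is a single line and decodes back to the original text. The following sketch uses a shortened, hypothetical policy string rather than a real policy.rego file. The `-w0` option is the GNU coreutils option used in the procedure and might not be available in other base64 implementations.

```shell
# Hypothetical, shortened policy text used only for this demonstration.
policy_text='package agent_policy

default ExecProcessRequest := false'

# Encode without line wrapping so that the result fits on one annotation line.
encoded=$(printf '%s' "$policy_text" | base64 -w0)

# Decode again and compare with the original to confirm a lossless round trip.
decoded=$(printf '%s' "$encoded" | base64 -d)

if [ "$decoded" = "$policy_text" ]; then
  echo "round trip ok"
fi
```

A wrapped or truncated string in the annotation would not decode to a valid policy, so this check is a low-cost safeguard.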

3.10. Configuring a pull secret for peer pods

If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.

Then, you can link the pull secret to the default service account or you can specify the pull secret in the peer pod manifest.

Procedure

Set the NS variable to the namespace where you deploy your peer pods:
```
$ NS=<namespace>
```
Copy the pull secret to the peer pod namespace:
```
$ oc get secret pull-secret -n openshift-config -o yaml \
  | sed "s/namespace: openshift-config/namespace: ${NS}/" \
  | oc apply -n "${NS}" -f -
```
You can use the cluster pull secret, as in this example, or a custom pull secret.
Optional: Link the pull secret to the default service account:
```
$ oc secrets link default pull-secret --for=pull -n ${NS}
```

Alternatively, add the pull secret to the peer pod manifest:

apiVersion: v1
kind: <Pod>
spec:
  containers:
  - name: <container_name>
    image: <image_name>
  imagePullSecrets:
  - name: pull-secret
# ...
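
The copy step in this procedure rewrites the namespace of the secret with sed before applying it. The following sketch shows that substitution on a trimmed, hypothetical manifest. The namespace name `my-peer-pods` and the function names are assumptions, and a real secret contains additional fields.

```shell
NS=my-peer-pods   # hypothetical namespace name

# Hypothetical, trimmed secret manifest standing in for the output of
# `oc get secret pull-secret -n openshift-config -o yaml`.
sample_secret() {
  cat <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: pull-secret
  namespace: openshift-config
type: kubernetes.io/dockerconfigjson
EOF
}

# Same substitution as in the procedure: only the namespace line changes.
retarget_namespace() {
  sed "s/namespace: openshift-config/namespace: ${NS}/"
}

sample_secret | retarget_namespace
```

All other lines pass through unchanged, so the secret keeps its name and type in the new namespace.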

3.11. Selecting a custom peer pod VM image

You can select a custom peer pod virtual machine (VM) image, tailored to your workload requirements, by adding an annotation to the pod manifest. The custom image overrides the default image specified in the peer pods config map.

Prerequisites

If the custom peer pod VM image is in a private registry, you have created a pull secret.
You have the ID of a custom pod VM image, which is compatible with your cloud provider or hypervisor.

Procedure

Create a my-pod-manifest.yaml file according to the following example:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod-manifest
  annotations:
    io.katacontainers.config.hypervisor.image: "<custom_image_id>"
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <example_container>
    image: registry.access.redhat.com/ubi9/ubi:9.3
    command: ["sleep", "36000"]

Create the pod by running the following command:
```
$ oc create -f my-pod-manifest.yaml
```

3.12. Configuring your workload for OpenShift sandboxed containers

You configure your workload for OpenShift sandboxed containers by setting kata-remote as the runtime class for the following pod-templated objects:

Pod objects
ReplicaSet objects
ReplicationController objects
StatefulSet objects
Deployment objects
DeploymentConfig objects

Important

Do not deploy workloads in an Operator namespace. Create a dedicated namespace for these resources.

You can define whether the workload should be deployed using the default instance type, which you defined in the config map, by adding an annotation to the YAML file.

If you do not want to define the instance type manually, you can add an annotation to use an automatic instance type, based on the memory available.

Prerequisites

You have created the KataConfig custom resource (CR).

Procedure

Add spec.runtimeClassName: kata-remote to the manifest of each pod-templated workload object as in the following example:
```
apiVersion: v1
kind: <object>
# ...
spec:
  runtimeClassName: kata-remote
# ...
```
Optional: To use a manually defined instance type, add the following annotation with the instance type that you defined in the config map:
```
apiVersion: v1
kind: <object>
metadata:
  annotations:
    io.katacontainers.config.hypervisor.machine_type: <machine_type>
# ...
```
Optional: To use an automatic instance type, add the following annotations:
```
apiVersion: v1
kind: <Pod>
metadata:
  annotations:
    io.katacontainers.config.hypervisor.default_vcpus: <vcpus>
    io.katacontainers.config.hypervisor.default_memory: <memory>
# ...
```
The workload will run on an automatic instance type based on the amount of memory available.
Apply the changes to the workload object by running the following command:
```
$ oc apply -f <object.yaml>
```
OpenShift Container Platform creates the workload object and begins scheduling it.

Verification

Inspect the spec.runtimeClassName field of a pod-templated object. If the value is kata-remote, then the workload is running on OpenShift sandboxed containers.

Chapter 4. Deploying OpenShift sandboxed containers on Microsoft Azure

You can deploy OpenShift sandboxed containers on Microsoft Azure.

4.1. Preparation

Review the following prerequisites and concepts before you deploy OpenShift sandboxed containers on Azure.

4.1.1. Prerequisites

You have installed the latest version of Red Hat OpenShift Container Platform.
Your OpenShift Container Platform cluster has at least one worker node.
You have enabled ports 15150 and 9000 for communication in the subnet used for worker nodes and the pod virtual machine (VM). The ports enable communication between the Kata shim running on the worker node and the Kata agent running on the pod VM.

Additional resources

Installing OpenShift Container Platform on Azure

4.1.2. Outbound connections

To enable peer pods to communicate with external networks, such as the public internet, you must configure outbound connectivity for the pod virtual machine (VM) subnet. This involves setting up a NAT gateway and, optionally, defining how the subnet integrates with your cluster’s virtual network (VNet) in Azure.

Peer pods and subnets: Peer pods operate in a dedicated Azure subnet that requires explicit configuration for outbound access. This subnet can either be the default worker subnet used by OpenShift Container Platform nodes or a separate, custom subnet created specifically for peer pods.
VNet peering: When using a separate subnet, VNet peering connects the peer pod VNet to the cluster’s VNet, ensuring internal communication while maintaining isolation. This requires non-overlapping CIDR ranges between the VNets.

You can configure outbound connectivity in two ways:

Default worker subnet: Modify the existing worker subnet to include a NAT gateway. This is simpler and reuses cluster resources, but it offers less isolation.
Peer pod VNet: Set up a dedicated VNet and subnet for peer pods, attach a NAT gateway, and peer it with the cluster VNet. This provides greater isolation and flexibility at the cost of additional complexity.

4.1.3. Peer pod resource requirements

You must ensure that your cluster has sufficient resources.

Peer pod virtual machines (VMs) require resources in two locations:

The worker node. The worker node stores metadata, Kata shim resources (containerd-shim-kata-v2), remote-hypervisor resources (cloud-api-adaptor), and the tunnel setup between the worker nodes and the peer pod VM.
The cloud instance. This is the actual peer pod VM running in the cloud.

The extended resource is named kata.peerpods.io/vm, and enables the Kubernetes scheduler to handle capacity tracking and accounting.

You can edit the limit per node based on the requirements for your environment after you install the OpenShift sandboxed containers Operator.

The mutating webhook modifies a Kubernetes pod as follows:

The mutating webhook checks the pod for the expected RuntimeClassName value, specified in the TARGET_RUNTIMECLASS environment variable. If the value in the pod specification does not match the value in the TARGET_RUNTIMECLASS, the webhook exits without modifying the pod.
If the RuntimeClassName values match, the webhook makes the following changes to the pod spec:
1. The webhook removes every resource specification from the resources field of all containers and init containers in the pod.
2. The webhook adds the extended resource (kata.peerpods.io/vm) to the spec by modifying the resources field of the first container in the pod. The extended resource kata.peerpods.io/vm is used by the Kubernetes scheduler for accounting purposes.
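
To see the accounting described above on a running cluster, the following sketch reads the number of kata.peerpods.io/vm units that a node advertises. It assumes that the extended resource is listed under the node's status.allocatable field, which is where Kubernetes normally reports extended resources. The function name peer_pod_capacity is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many peer pod VM slots a node advertises.
# Assumption: the extended resource appears under .status.allocatable.
peer_pod_capacity() {
  local node="$1"
  # The dots inside the resource name are escaped so that JSONPath treats
  # "kata.peerpods.io/vm" as a single key.
  oc get node "${node}" \
    -o jsonpath='{.status.allocatable.kata\.peerpods\.io/vm}'
}
```

Call it as `peer_pod_capacity <node_name>` and compare the result with the per-node limit that you configured.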

Note

As a best practice, define a cluster-wide policy to only allow peer pod creation in specific namespaces.

4.2. Deployment overview

You deploy OpenShift sandboxed containers on Azure by performing the following steps:

Prepare your network by configuring outbound connectivity for the peer pods.
Install the OpenShift sandboxed containers Operator on the OpenShift Container Platform cluster.
Create the peer pods config map.
Optional: Create the Azure secret.
Create the KataConfig custom resource.
Optional: Modify the number of virtual machines running on each worker node.
Disable insecure options by customizing the Kata Agent policy.
Optional: If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.
Optional: You can select a custom peer pod VM image.
Configure your workload for OpenShift sandboxed containers.

4.3. Preparing your network

You must prepare your network by configuring outbound connectivity for the peer pods. You can perform this task by using one of the following methods:

Add a NAT gateway to the default worker subnet. This method is simple and reuses cluster resources, but it offers less isolation.
Create a dedicated VNet and subnet for your peer pods, attach a NAT gateway, and peer it with the cluster VNet. This method is more complex but it provides greater isolation and flexibility.

4.3.1. Configuring the default worker subnet

You can configure the default worker subnet for outbound connections by attaching a NAT gateway. This method is simple and reuses cluster resources, but it offers less isolation than a dedicated virtual network.

Prerequisites

The Azure CLI (az) is installed and authenticated.
You have administrator access to the Azure resource group and the VNet.

Procedure

Set the AZURE_RESOURCE_GROUP environment variable by running the following command:

$ AZURE_RESOURCE_GROUP=$(oc get infrastructure/cluster \
    -o jsonpath='{.status.platformStatus.azure.resourceGroupName}')

Set the AZURE_REGION environment variable by running the following command:

$ AZURE_REGION=$(az group show --resource-group "${AZURE_RESOURCE_GROUP}" \
    --query "{Location:location}" --output tsv) && \
    echo "AZURE_REGION: \"$AZURE_REGION\""

Set the AZURE_VNET_NAME environment variable by running the following command:

$ AZURE_VNET_NAME=$(az network vnet list \
    -g "${AZURE_RESOURCE_GROUP}" --query '[].name' -o tsv)

Set the AZURE_SUBNET_ID environment variable by running the following command:

$ AZURE_SUBNET_ID=$(az network vnet subnet list \
    --resource-group "${AZURE_RESOURCE_GROUP}" \
    --vnet-name "${AZURE_VNET_NAME}" --query "[].{Id:id} \
    | [? contains(Id, 'worker')]" --output tsv)

Set the NAT gateway environment variables for the peer pod subnet by running the following commands:
```
$ export PEERPOD_NAT_GW=peerpod-nat-gw
```
```
$ export PEERPOD_NAT_GW_IP=peerpod-nat-gw-ip
```

Create a public IP address for the NAT gateway by running the following command:

$ az network public-ip create -g "${AZURE_RESOURCE_GROUP}" \
    -n "${PEERPOD_NAT_GW_IP}" -l "${AZURE_REGION}" --sku Standard

Create the NAT gateway and associate it with the public IP address by running the following command:

$ az network nat gateway create -g "${AZURE_RESOURCE_GROUP}" \
    -l "${AZURE_REGION}" --public-ip-addresses "${PEERPOD_NAT_GW_IP}" \
    -n "${PEERPOD_NAT_GW}"

Update the VNet subnet to use the NAT gateway by running the following command:
```
$ az network vnet subnet update --nat-gateway "${PEERPOD_NAT_GW}" \
    --ids "${AZURE_SUBNET_ID}"
```

Verification

Confirm the NAT gateway is attached to the VNet subnet by running the following command:
```
$ az network vnet subnet show --ids "${AZURE_SUBNET_ID}" \
    --query "natGateway.id" -o tsv
```
The output contains the NAT gateway resource ID. If no NAT gateway is attached, the output is empty.
Example output
```
/subscriptions/12345678-1234-1234-1234-1234567890ab/resourceGroups/myResourceGroup/providers/Microsoft.Network/natGateways/myNatGateway
```

4.3.2. Creating a dedicated peer pod virtual network

You can configure outbound connections for peer pods by creating a dedicated virtual network (VNet). Then, you create a network address translation (NAT) gateway for the VNet, create a subnet within the VNet, and enable VNet peering with non-overlapping address spaces.

This method is more complex than creating a NAT gateway for the default worker subnet but it provides greater isolation and flexibility.

Prerequisites

The Azure CLI (az) is installed.
You have signed in to Azure. See Authenticate to Azure using Azure CLI in the Azure documentation.
You have administrator access to the Azure resource group and VNet hosting the cluster.
You have verified the cluster VNet classless inter-domain routing (CIDR) address. The default value is 10.0.0.0/14. If you overrode the default value, you have ensured that you chose a non-overlapping CIDR address for the peer pod VNet. For example, 192.168.0.0/16.
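
The non-overlap requirement can be checked before you create any Azure resources. The following sketch is a pure Bash check for IPv4 CIDR blocks. The function names are hypothetical and the check is not part of the product. It converts each block to its first and last address and tests whether the two ranges intersect.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  local -a o
  read -r -a o <<< "$1"
  # 10# forces base 10 so that octets with leading zeros are not read as octal.
  echo $(( (10#${o[0]} << 24) | (10#${o[1]} << 16) | (10#${o[2]} << 8) | 10#${o[3]} ))
}

# Print the first and last address of a CIDR block as two integers.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local base mask start end
  base=$(ip_to_int "${ip}")
  mask=$(( prefix == 0 ? 0 : (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  start=$(( base & mask ))
  end=$(( start | (~mask & 0xFFFFFFFF) ))
  echo "${start} ${end}"
}

# Return 0 if the two CIDR blocks overlap, 1 otherwise.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}
```

For example, `cidrs_overlap 10.0.0.0/14 192.168.0.0/16` returns a nonzero status, which indicates that the example peer pod range does not collide with the default cluster range.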

Procedure

Set the environment variables for the peer pod network:
1. Set the peer pod VNet environment variables by running the following commands:
```
$ export PEERPOD_VNET_NAME="${PEERPOD_VNET_NAME:-peerpod-vnet}"
```
```
$ export PEERPOD_VNET_CIDR="${PEERPOD_VNET_CIDR:-192.168.0.0/16}"
```
2. Set the peer pod subnet environment variables by running the following commands:
```
$ export PEERPOD_SUBNET_NAME="${PEERPOD_SUBNET_NAME:-peerpod-subnet}"
```
```
$ export PEERPOD_SUBNET_CIDR="${PEERPOD_SUBNET_CIDR:-192.168.0.0/16}"
```

Set the environment variables for Azure by running the following commands:

$ AZURE_RESOURCE_GROUP=$(oc get infrastructure/cluster \
    -o jsonpath='{.status.platformStatus.azure.resourceGroupName}')

$ AZURE_REGION=$(az group show --resource-group "${AZURE_RESOURCE_GROUP}" \
    --query "{Location:location}" --output tsv) && \
    echo "AZURE_REGION: \"$AZURE_REGION\""

$ AZURE_VNET_NAME=$(az network vnet list \
    -g "${AZURE_RESOURCE_GROUP}" --query '[].name' -o tsv)

Set the peer pod NAT gateway environment variables by running the following commands:

$ export PEERPOD_NAT_GW="${PEERPOD_NAT_GW:-peerpod-nat-gw}"

$ export PEERPOD_NAT_GW_IP="${PEERPOD_NAT_GW_IP:-peerpod-nat-gw-ip}"

Configure the VNet:

Create the peer pod VNet by running the following command:

$ az network vnet create --resource-group "${AZURE_RESOURCE_GROUP}" \
    --name "${PEERPOD_VNET_NAME}" \
    --address-prefixes "${PEERPOD_VNET_CIDR}"

Create a public IP address for the peer pod NAT gateway by running the following command:

$ az network public-ip create -g "${AZURE_RESOURCE_GROUP}" \
    -n "${PEERPOD_NAT_GW_IP}" -l "${AZURE_REGION}" --sku Standard

Create a NAT gateway for the peer pod VNet by running the following command:

$ az network nat gateway create -g "${AZURE_RESOURCE_GROUP}" \
    -l "${AZURE_REGION}" \
    --public-ip-addresses "${PEERPOD_NAT_GW_IP}" \
    -n "${PEERPOD_NAT_GW}"

Create a subnet in the peer pod VNet and attach the NAT gateway by running the following command:

$ az network vnet subnet create \
    --resource-group "${AZURE_RESOURCE_GROUP}" \
    --vnet-name "${PEERPOD_VNET_NAME}" \
    --name "${PEERPOD_SUBNET_NAME}" \
    --address-prefixes "${PEERPOD_SUBNET_CIDR}" \
    --nat-gateway "${PEERPOD_NAT_GW}"

Configure the virtual network peering connection:

Create the peering connection by running the following command:

$ az network vnet peering create -g "${AZURE_RESOURCE_GROUP}" \
    -n peerpod-azure-vnet-to-peerpod-vnet \
    --vnet-name "${AZURE_VNET_NAME}" \
    --remote-vnet "${PEERPOD_VNET_NAME}" --allow-vnet-access \
    --allow-forwarded-traffic

Sync the peering connection by running the following command:

$ az network vnet peering sync -g "${AZURE_RESOURCE_GROUP}" \
    -n peerpod-azure-vnet-to-peerpod-vnet \
    --vnet-name "${AZURE_VNET_NAME}"

Complete the peering connection by running the following command:

$ az network vnet peering create -g "${AZURE_RESOURCE_GROUP}" \
    -n peerpod-peerpod-vnet-to-azure-vnet \
    --vnet-name "${PEERPOD_VNET_NAME}" \
    --remote-vnet "${AZURE_VNET_NAME}" --allow-vnet-access \
    --allow-forwarded-traffic

Verification

Check the peering connection status from the cluster VNet by running the following command:

$ az network vnet peering show -g "${AZURE_RESOURCE_GROUP}" \
    -n peerpod-azure-vnet-to-peerpod-vnet \
    --vnet-name "${AZURE_VNET_NAME}" \
    --query "peeringState" -o tsv

The expected output is Connected.

Verify that the NAT gateway is attached to the peer pod subnet by running the following command:

$ az network vnet subnet show --resource-group "${AZURE_RESOURCE_GROUP}" \
    --vnet-name "${PEERPOD_VNET_NAME}" --name "${PEERPOD_SUBNET_NAME}" \
    --query "natGateway.id" -o tsv
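
The two verification commands can be combined into a single check. The following sketch wraps them in a hypothetical function, verify_peerpod_network, that relies on the environment variables set earlier in this procedure. It succeeds only when the peering state is Connected and a NAT gateway ID is returned for the peer pod subnet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the peering is Connected and a NAT
# gateway is attached to the peer pod subnet.
verify_peerpod_network() {
  local state nat_id
  state=$(az network vnet peering show -g "${AZURE_RESOURCE_GROUP}" \
    -n peerpod-azure-vnet-to-peerpod-vnet \
    --vnet-name "${AZURE_VNET_NAME}" \
    --query "peeringState" -o tsv)
  nat_id=$(az network vnet subnet show \
    --resource-group "${AZURE_RESOURCE_GROUP}" \
    --vnet-name "${PEERPOD_VNET_NAME}" --name "${PEERPOD_SUBNET_NAME}" \
    --query "natGateway.id" -o tsv)

  if [[ "${state}" != "Connected" ]]; then
    echo "peering state is '${state}', expected 'Connected'" >&2
    return 1
  fi
  if [[ -z "${nat_id}" ]]; then
    echo "no NAT gateway attached to ${PEERPOD_SUBNET_NAME}" >&2
    return 1
  fi
  echo "peer pod network is ready"
}
```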

Additional Resources

Azure documentation: NAT Gateway Overview
Azure documentation: VNet Peering Overview

4.4. Installing the OpenShift sandboxed containers Operator

You install the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Prerequisites

You have access to the cluster as a user with the cluster-admin role.

Procedure

Create an osc-namespace.yaml manifest file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sandboxed-containers-operator

Create the namespace by running the following command:
```
$ oc apply -f osc-namespace.yaml
```

Create an osc-operatorgroup.yaml manifest file:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sandboxed-containers-operator-group
  namespace: openshift-sandboxed-containers-operator
spec:
  targetNamespaces:
  - openshift-sandboxed-containers-operator

Create the operator group by running the following command:
```
$ oc apply -f osc-operatorgroup.yaml
```

Create an osc-subscription.yaml manifest file:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sandboxed-containers-operator
  namespace: openshift-sandboxed-containers-operator
spec:
  channel: stable
  installPlanApproval: Automatic
  name: sandboxed-containers-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: sandboxed-containers-operator.v1.11.1

Create the subscription by running the following command:
```
$ oc create -f osc-subscription.yaml
```
Verify that the Operator is correctly installed by running the following command:
```
$ oc get csv -n openshift-sandboxed-containers-operator
```
This command can take several minutes to complete.

Watch the process by running the following command:

$ watch oc get csv -n openshift-sandboxed-containers-operator

Example output

NAME                             DISPLAY                                   VERSION   REPLACES   PHASE
openshift-sandboxed-containers   openshift-sandboxed-containers-operator   1.11.1    1.11.0     Succeeded
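
If you prefer a scripted wait over watch, the following sketch polls until the installed cluster service version (CSV) reports the Succeeded phase. The function name wait_for_csv and its retry parameters are hypothetical. It assumes that the Subscription created above records the CSV name in its status.installedCSV field, as Operator Lifecycle Manager subscriptions normally do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until the CSV installed by the subscription
# reaches the Succeeded phase. Arguments: attempts (default 30) and delay in
# seconds between attempts (default 10).
wait_for_csv() {
  local ns="openshift-sandboxed-containers-operator"
  local attempts="${1:-30}" delay="${2:-10}"
  local csv="" phase="" i
  for (( i = 1; i <= attempts; i++ )); do
    # The subscription status names the CSV once it has been resolved.
    csv=$(oc get subscriptions.operators.coreos.com \
      sandboxed-containers-operator -n "${ns}" \
      -o jsonpath='{.status.installedCSV}')
    if [[ -n "${csv}" ]]; then
      phase=$(oc get csv "${csv}" -n "${ns}" -o jsonpath='{.status.phase}')
      if [[ "${phase}" == "Succeeded" ]]; then
        echo "Operator installed"
        return 0
      fi
    fi
    sleep "${delay}"
  done
  echo "timed out waiting for CSV, last phase: '${phase}'" >&2
  return 1
}
```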

4.5. Creating the peer pods config map

You must create the peer pods config map.

Procedure

Obtain the following values from your Azure instance:

Retrieve and record the Azure resource group:

$ AZURE_RESOURCE_GROUP=$(oc get infrastructure/cluster \
  -o jsonpath='{.status.platformStatus.azure.resourceGroupName}') \
  && echo "AZURE_RESOURCE_GROUP: \"$AZURE_RESOURCE_GROUP\""

Retrieve and record the Azure VNet name:

$ AZURE_VNET_NAME=$(az network vnet list \
  --resource-group ${AZURE_RESOURCE_GROUP} \
  --query "[].{Name:name}" --output tsv)

This value is used to retrieve the Azure subnet ID.

Retrieve and record the Azure subnet ID:

$ AZURE_SUBNET_ID=$(az network vnet subnet list \
  --resource-group ${AZURE_RESOURCE_GROUP} --vnet-name $AZURE_VNET_NAME \
  --query "[].{Id:id} | [? contains(Id, 'worker')]" --output tsv) \
   && echo "AZURE_SUBNET_ID: \"$AZURE_SUBNET_ID\""

Retrieve and record the Azure network security group (NSG) ID:

$ AZURE_NSG_ID=$(az network nsg list --resource-group ${AZURE_RESOURCE_GROUP} \
  --query "[].{Id:id}" --output tsv) && echo "AZURE_NSG_ID: \"$AZURE_NSG_ID\""

Retrieve and record the Azure region:

$ AZURE_REGION=$(az group show --resource-group ${AZURE_RESOURCE_GROUP} \
  --query "{Location:location}" --output tsv) \
  && echo "AZURE_REGION: \"$AZURE_REGION\""

Create a peer-pods-cm.yaml manifest file according to the following example:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "azure"
  VXLAN_PORT: "9000"
  PROXY_TIMEOUT: "5m"
  AZURE_INSTANCE_SIZE: "Standard_B2als_v2"
  AZURE_INSTANCE_SIZES: "Standard_B2als_v2,Standard_D2as_v5,Standard_D4as_v5,Standard_D2ads_v5"
  AZURE_SUBNET_ID: "<azure_subnet_id>"
  AZURE_NSG_ID: "<azure_nsg_id>"
  AZURE_IMAGE_ID: ""
  AZURE_REGION: "<azure_region>"
  AZURE_RESOURCE_GROUP: "<azure_resource_group>"
  TAGS: "key1=value1,key2=value2"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
```
AZURE_INSTANCE_SIZE
Defines the default instance size that is used if the instance size is not defined in the workload object.
AZURE_IMAGE_ID
Leave this value empty. When you install the Operator, a Job is scheduled to download the default pod VM image from the Red Hat Ecosystem Catalog and upload it to the Azure Image Gallery within the same Azure Resource Group as the OpenShift Container Platform cluster.
AZURE_INSTANCE_SIZES
Specify the allowed instance sizes, without spaces, for creating the pod. You can define smaller instance sizes for workloads that need less memory and fewer CPUs or larger instance sizes for larger workloads.
TAGS
You can configure custom tags as comma-separated key=value pairs for pod VM instances to track peer pod costs or to identify peer pods in different clusters.
PEERPODS_LIMIT_PER_NODE
You can increase this value to run more peer pods on a node. The default value is 10.
ROOT_VOLUME_SIZE
You can increase this value for pods with larger container images. Specify the root volume size in gigabytes for the pod VM. The default and minimum size is 6 GB.
Create the config map by running the following command:
```
$ oc create -f peer-pods-cm.yaml
```
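
Instead of copying the retrieved values into the manifest by hand, you can generate the file from the environment variables set at the start of this procedure. The following sketch uses a hypothetical function, render_peer_pods_cm. All fixed values are copied from the example manifest above, including the placeholder TAGS value, so review the output before you apply it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a peer-pods-cm manifest populated from the
# environment variables retrieved earlier in this procedure.
render_peer_pods_cm() {
  local var
  # Fail early if any required value is missing or empty.
  for var in AZURE_SUBNET_ID AZURE_NSG_ID AZURE_REGION AZURE_RESOURCE_GROUP; do
    if [[ -z "${!var:-}" ]]; then
      echo "missing required variable: ${var}" >&2
      return 1
    fi
  done
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "azure"
  VXLAN_PORT: "9000"
  PROXY_TIMEOUT: "5m"
  AZURE_INSTANCE_SIZE: "Standard_B2als_v2"
  AZURE_INSTANCE_SIZES: "Standard_B2als_v2,Standard_D2as_v5,Standard_D4as_v5,Standard_D2ads_v5"
  AZURE_SUBNET_ID: "${AZURE_SUBNET_ID}"
  AZURE_NSG_ID: "${AZURE_NSG_ID}"
  AZURE_IMAGE_ID: ""
  AZURE_REGION: "${AZURE_REGION}"
  AZURE_RESOURCE_GROUP: "${AZURE_RESOURCE_GROUP}"
  TAGS: "key1=value1,key2=value2"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
EOF
}
```

For example, `render_peer_pods_cm > peer-pods-cm.yaml` writes the manifest to the file name used in this procedure.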

4.6. Creating the Azure secret

You must create the SSH key secret, which is required by the Azure virtual machine (VM) creation API. Azure only requires the SSH public key. OpenShift sandboxed containers disables SSH in VMs, so the keys have no effect in the VMs.

Procedure

Generate an SSH key pair by running the following command:
```
$ ssh-keygen -f ./id_rsa -N ""
```

Create the Secret object by running the following command:

$ oc create secret generic ssh-key-secret \
  -n openshift-sandboxed-containers-operator \
  --from-file=id_rsa.pub=./id_rsa.pub \
  --from-file=id_rsa=./id_rsa

Delete the SSH keys you created:
```
$ shred --remove id_rsa.pub id_rsa
```

4.7. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

A large OpenShift Container Platform deployment with a greater number of worker nodes.
Activation of the BIOS and Diagnostics utility.
Deployment on a hard disk drive rather than an SSD.
Deployment on physical nodes such as bare metal, rather than on virtual nodes.
A slow CPU and network.

Procedure

Create an example-kataconfig.yaml manifest file according to the following example:
```
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  enablePeerPods: true
  logLevel: info
#  kataConfigPoolSelector:
#    matchLabels:
#      <label_key>: '<label_value>'
```
<label_key>: '<label_value>'
Optional: If you have applied node labels to install kata-remote on specific nodes, specify the key and value, for example, kata-remote: 'true'.
Create the KataConfig CR by running the following command:
```
$ oc create -f example-kataconfig.yaml
```
The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.
Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.
Monitor the installation progress by running the following command:
```
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"
```
When the status of all workers under kataNodes is installed and the condition InProgress is False without specifying a reason, kata-remote is installed on the cluster.
Verify the daemon set by running the following command:
```
$ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
```
Verify the runtime classes by running the following command:
```
$ oc get runtimeclass
```
Example output
```
NAME          HANDLER       AGE
kata          kata          34m
kata-remote   kata-remote   152m
```
You can also see the default kata runtime class in addition to kata-remote.

4.8. Modifying the number of peer pod VMs per node

You can modify the limit of peer pod virtual machines (VMs) per node by editing the peerpodConfig custom resource (CR).

Procedure

Check the current limit by running the following command:

$ oc get peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  -o jsonpath='{.spec.limit}{"\n"}'

Specify a new value for the limit key by running the following command:

$ oc patch peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  --type merge --patch '{"spec":{"limit":"<value>"}}'
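
To avoid patching the resource with a malformed value, you can validate the limit first. The following sketch uses a hypothetical function, set_peerpod_limit, that accepts only positive integers and then runs the same oc patch command shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate the new limit, then patch the peerpodConfig CR.
set_peerpod_limit() {
  local limit="$1"
  # Accept only positive integers without leading zeros.
  if ! [[ "${limit}" =~ ^[1-9][0-9]*$ ]]; then
    echo "limit must be a positive integer, got '${limit}'" >&2
    return 1
  fi
  oc patch peerpodconfig peerpodconfig-openshift \
    -n openshift-sandboxed-containers-operator \
    --type merge --patch "{\"spec\":{\"limit\":\"${limit}\"}}"
}
```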

4.9. Verifying the pod VM image

Procedure

Obtain the config map you created for the peer pods:

$ oc get configmap peer-pods-cm -n openshift-sandboxed-containers-operator -o yaml

Check the status stanza of the YAML file.
If the AZURE_IMAGE_ID parameter is populated, the pod VM image was created successfully.

Troubleshooting

Retrieve the events log by running the following command:

$ oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-creation

Retrieve the job log by running the following command:

$ oc logs -n openshift-sandboxed-containers-operator jobs/osc-podvm-image-creation

If you cannot resolve the issue, submit a Red Hat Support case and attach the output of both logs.

4.10. Customizing the Kata Agent policy

You can customize the Kata Agent policy to override the permissive default policy. The Kata Agent policy is a security mechanism that controls API requests for peer pods.

Important

You must override the default policy in a production environment.

As a minimum requirement, you must disable ExecProcessRequest to prevent a cluster administrator from accessing sensitive data by running the oc exec command on a peer pod.

You can use the default policy in development and test environments where security is not a concern, for example, in an environment where the control plane can be trusted.

A custom policy replaces the default policy entirely. To modify specific APIs, include the full policy and adjust the relevant rules.

Procedure

Create a custom policy.rego file by modifying the default policy:

package agent_policy

default AddARPNeighborsRequest := true
default AddSwapRequest := true
default CloseStdinRequest := true
default CopyFileRequest := true
default CreateContainerRequest := true
default CreateSandboxRequest := true
default DestroySandboxRequest := true
default GetMetricsRequest := true
default GetOOMEventRequest := true
default GuestDetailsRequest := true
default ListInterfacesRequest := true
default ListRoutesRequest := true
default MemHotplugByProbeRequest := true
default OnlineCPUMemRequest := true
default PauseContainerRequest := true
default PullImageRequest := true
default ReadStreamRequest := false
default RemoveContainerRequest := true
default RemoveStaleVirtiofsShareMountsRequest := true
default ReseedRandomDevRequest := true
default ResumeContainerRequest := true
default SetGuestDateTimeRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StartTracingRequest := true
default StatsContainerRequest := true
default StopTracingRequest := true
default TtyWinResizeRequest := true
default UpdateContainerRequest := true
default UpdateEphemeralMountsRequest := true
default UpdateInterfaceRequest := true
default UpdateRoutesRequest := true
default WaitProcessRequest := true
default ExecProcessRequest := false
default SetPolicyRequest := false
default WriteStreamRequest := false

ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
  "allowed_commands": [
        "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status"
  ]
}

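The default policy allows all API calls. This example denies ExecProcessRequest, ReadStreamRequest, SetPolicyRequest, and WriteStreamRequest by default, and permits ExecProcessRequest only for the commands listed in policy_data.allowed_commands. Adjust the true or false values to customize the policy further based on your needs.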
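
The ExecProcessRequest rule in the example is an exact-match allow list: the arguments of the requested process are joined with single spaces and compared with each entry in allowed_commands. The following Bash sketch mimics that comparison so that you can reason about which oc exec invocations would pass. It is an illustration only and does not evaluate Rego. The function name exec_allowed is hypothetical.

```shell
#!/usr/bin/env bash
# Commands that the policy permits, copied from policy_data.allowed_commands.
allowed_commands=(
  "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status"
)

# Return 0 if the argument vector, joined by single spaces, exactly matches
# an allowed command. Return 1 otherwise, mirroring the default of false.
exec_allowed() {
  local input_command="$*" allowed
  for allowed in "${allowed_commands[@]}"; do
    [[ "${input_command}" == "${allowed}" ]] && return 0
  done
  return 1
}
```

Because the match is exact, appending an extra argument such as -v to the allowed curl command causes the request to be denied.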

Convert the policy.rego file to a Base64-encoded string by running the following command:
```
$ base64 -w0 policy.rego
```
Record the output.
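
Rather than copying the string by hand, you can capture it in a variable and confirm that it decodes back to the original file. The following sketch uses a hypothetical function, encode_policy. It assumes GNU coreutils, because the -w0 option used in this procedure is specific to the GNU base64 implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Base64 form of a policy file on one line,
# after checking that it decodes back to the same bytes.
encode_policy() {
  local file="${1:-policy.rego}" encoded
  encoded=$(base64 -w0 "${file}") || return 1
  # Decode again and compare with the source file to catch truncation.
  if ! printf '%s' "${encoded}" | base64 -d | cmp -s - "${file}"; then
    echo "round-trip check failed for ${file}" >&2
    return 1
  fi
  printf '%s\n' "${encoded}"
}
```

For example, `POLICY_B64=$(encode_policy policy.rego)` stores the string for use in the pod annotation.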

Add the Base64-encoded policy string to the my-pod.yaml manifest:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  annotations:
    io.katacontainers.config.agent.policy: <base64_encoded_policy>
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <container_name>
    image: registry.access.redhat.com/ubi9/ubi:latest
    command:
    - sleep
    - "36000"
    securityContext:
      privileged: false
      seccompProfile:
        type: RuntimeDefault

Create the pod by running the following command:
```
$ oc create -f my-pod.yaml
```

4.11. Configuring a pull secret for peer pods

If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.

Then, you can link the pull secret to the default service account or you can specify the pull secret in the peer pod manifest.

Procedure

Set the NS variable to the namespace where you deploy your peer pods:
```
$ NS=<namespace>
```
Copy the pull secret to the peer pod namespace:
```
$ oc get secret pull-secret -n openshift-config -o yaml \
  | sed "s/namespace: openshift-config/namespace: ${NS}/" \
  | oc apply -n "${NS}" -f -
```
You can use the cluster pull secret, as in this example, or a custom pull secret.
Optional: Link the pull secret to the default service account:
```
$ oc secrets link default pull-secret --for=pull -n ${NS}
```

Alternatively, add the pull secret to the peer pod manifest:

apiVersion: v1
kind: <Pod>
spec:
  containers:
  - name: <container_name>
    image: <image_name>
  imagePullSecrets:
  - name: pull-secret
# ...

4.12. Selecting a custom peer pod VM image

Prerequisites

If the custom peer pod VM image is in a private registry, you have created a pull secret.
You have the ID of a custom pod VM image, which is compatible with your cloud provider or hypervisor.

Procedure

Create a my-pod-manifest.yaml file according to the following example:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod-manifest
  annotations:
    io.katacontainers.config.hypervisor.image: "<custom_image_id>"
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <example_container>
    image: registry.access.redhat.com/ubi9/ubi:9.3
    command: ["sleep", "36000"]

Create the pod by running the following command:
```
$ oc create -f my-pod-manifest.yaml
```

4.13. Configuring your workload for OpenShift sandboxed containers

You configure your workload for OpenShift sandboxed containers by setting kata-remote as the runtime class for the following pod-templated objects:

Pod objects
ReplicaSet objects
ReplicationController objects
StatefulSet objects
Deployment objects
DeploymentConfig objects

Important

Do not deploy workloads in an Operator namespace. Create a dedicated namespace for these resources.

You can define whether the workload should be deployed using the default instance size, which you defined in the config map, by adding an annotation to the YAML file.

If you do not want to define the instance size manually, you can add an annotation to use an automatic instance size, based on the memory available.

Prerequisites

You have created the KataConfig custom resource (CR).

Procedure

Add spec.runtimeClassName: kata-remote to the manifest of each pod-templated workload object as in the following example:
```
apiVersion: v1
kind: <object>
# ...
spec:
  runtimeClassName: kata-remote
# ...
```
Optional: To use a manually defined instance size, add the following annotation with the instance size that you defined in the config map:
```
apiVersion: v1
kind: <object>
metadata:
  annotations:
    io.katacontainers.config.hypervisor.machine_type: <machine_type>
# ...
```
Optional: To use an automatic instance size, add the following annotations:
```
apiVersion: v1
kind: <Pod>
metadata:
  annotations:
    io.katacontainers.config.hypervisor.default_vcpus: <vcpus>
    io.katacontainers.config.hypervisor.default_memory: <memory>
# ...
```
The workload will run on an automatic instance size based on the amount of memory available.
Apply the changes to the workload object by running the following command:
```
$ oc apply -f <object.yaml>
```
OpenShift Container Platform creates the workload object and begins scheduling it.

Verification

Inspect the spec.runtimeClassName field of a pod-templated object. If the value is kata-remote, then the workload is running on OpenShift sandboxed containers.
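
For a pod, this inspection can be scripted. The following sketch uses a hypothetical function, uses_kata_remote, that reads the spec.runtimeClassName field with oc and succeeds only when the value is kata-remote.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given pod runs with the kata-remote
# runtime class. Arguments: pod name and namespace.
uses_kata_remote() {
  local pod="$1" ns="$2" runtime_class
  runtime_class=$(oc get pod "${pod}" -n "${ns}" \
    -o jsonpath='{.spec.runtimeClassName}')
  [[ "${runtime_class}" == "kata-remote" ]]
}
```

For example, `uses_kata_remote my-pod my-namespace && echo "sandboxed"` prints sandboxed only for a peer pod.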

Chapter 5. Deploying OpenShift sandboxed containers on Google Cloud

You can deploy OpenShift sandboxed containers on Google Cloud.

Important

Red Hat OpenShift sandboxed containers on Google Cloud is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

5.1. Preparation

Review the following prerequisites and concepts before you deploy OpenShift sandboxed containers on Google Cloud.

5.1.1. Prerequisites

You have installed the latest version of Red Hat OpenShift Container Platform.
Your OpenShift Container Platform cluster has at least one worker node.
You have enabled ports 15150 and 9000 for communication in the subnet used for worker nodes and the pod virtual machine (VM). The ports enable communication between the Kata shim running on the worker node and the Kata agent running on the pod VM.

Additional resources

Installing OpenShift Container Platform on Google Cloud

5.1.2. Peer pod resource requirements

You must ensure that your cluster has sufficient resources.

Peer pod virtual machines (VMs) require resources in two locations:

The worker node. The worker node stores metadata, Kata shim resources (containerd-shim-kata-v2), remote-hypervisor resources (cloud-api-adaptor), and the tunnel setup between the worker nodes and the peer pod VM.
The cloud instance. This is the actual peer pod VM running in the cloud.

The per-node limit on peer pod VMs is tracked through a Kubernetes extended resource named kata.peerpods.io/vm, which enables the Kubernetes scheduler to handle capacity tracking and accounting.

You can edit the limit per node based on the requirements for your environment after you install the OpenShift sandboxed containers Operator.

The mutating webhook modifies a Kubernetes pod as follows:

The mutating webhook checks the pod for the expected RuntimeClassName value, specified in the TARGET_RUNTIMECLASS environment variable. If the value in the pod specification does not match the value in the TARGET_RUNTIMECLASS, the webhook exits without modifying the pod.
If the RuntimeClassName values match, the webhook makes the following changes to the pod spec:
1. The webhook removes every resource specification from the resources field of all containers and init containers in the pod.
2. The webhook adds the extended resource (kata.peerpods.io/vm) to the spec by modifying the resources field of the first container in the pod. The extended resource kata.peerpods.io/vm is used by the Kubernetes scheduler for accounting purposes.

Note

As a best practice, define a cluster-wide policy to only allow peer pod creation in specific namespaces.

5.2. Deployment overview

You deploy OpenShift sandboxed containers on Google Cloud by performing the following steps:

Install the OpenShift sandboxed containers Operator on the OpenShift Container Platform cluster.
Enable ports to allow internal communication with peer pods.
Create the peer pods config map.
Create the pod VM image config map.
Create the KataConfig custom resource.
Optional: Modify the number of virtual machines running on each worker node.
Disable insecure options by customizing the Kata Agent policy.
Configure your workload for OpenShift sandboxed containers.

5.3. Installing the OpenShift sandboxed containers Operator

You install the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Prerequisites

You have access to the cluster as a user with the cluster-admin role.

Procedure

Create an osc-namespace.yaml manifest file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sandboxed-containers-operator

Create the namespace by running the following command:
```
$ oc apply -f osc-namespace.yaml
```

Create an osc-operatorgroup.yaml manifest file:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sandboxed-containers-operator-group
  namespace: openshift-sandboxed-containers-operator
spec:
  targetNamespaces:
  - openshift-sandboxed-containers-operator

Create the operator group by running the following command:
```
$ oc apply -f osc-operatorgroup.yaml
```

Create an osc-subscription.yaml manifest file:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sandboxed-containers-operator
  namespace: openshift-sandboxed-containers-operator
spec:
  channel: stable
  installPlanApproval: Automatic
  name: sandboxed-containers-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: sandboxed-containers-operator.v1.11.1

Create the subscription by running the following command:
```
$ oc create -f osc-subscription.yaml
```
Verify that the Operator is correctly installed by running the following command:
```
$ oc get csv -n openshift-sandboxed-containers-operator
```
This command can take several minutes to complete.

Watch the process by running the following command:

$ watch oc get csv -n openshift-sandboxed-containers-operator

Example output

NAME                             DISPLAY                                   VERSION   REPLACES   PHASE
openshift-sandboxed-containers   openshift-sandboxed-containers-operator   1.11.1    1.11.0     Succeeded
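If you prefer a scripted check over watching the output, the polling can be sketched as a small shell function. This is a minimal sketch, not part of the documented procedure: the function name, the attempt count, and the interval are illustrative, and it assumes that the CSV object is named after the startingCSV value in the subscription.

```shell
# Poll the Operator CSV until its phase is Succeeded.
# The attempt count and interval are illustrative defaults.
NAMESPACE="openshift-sandboxed-containers-operator"
CSV_NAME="sandboxed-containers-operator.v1.11.1"  # assumed to match startingCSV

wait_for_csv() {
  attempts="${1:-60}"   # number of polls before giving up
  interval="${2:-10}"   # seconds between polls
  i=0
  phase=""
  while [ "$i" -lt "$attempts" ]; do
    # .status.phase backs the PHASE column of "oc get csv".
    phase=$(oc get csv "$CSV_NAME" -n "$NAMESPACE" \
      -o jsonpath='{.status.phase}' 2>/dev/null)
    if [ "$phase" = "Succeeded" ]; then
      echo "Operator CSV phase: Succeeded"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Timed out; last phase: ${phase:-unknown}" >&2
  return 1
}
```

Call wait_for_csv without arguments to poll every 10 seconds for up to 10 minutes.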

5.4. Enabling port 15150 for Google Cloud

You must enable port 15150 to allow internal communication with peer pods running on Compute Engine.

Prerequisites

You have installed the Google Cloud command line interface (CLI) tool.
You have access to the OpenShift Container Platform cluster as a user with the roles/container.admin role.

Procedure

Set the project ID variable by running the following command:
```
$ export GCP_PROJECT_ID="<project_id>"
```
Log in to Google Cloud by running the following command:
```
$ gcloud auth login
```
Set the Google Cloud project ID by running the following command:
```
$ gcloud config set project ${GCP_PROJECT_ID}
```

Open port 15150 by running the following command:

$ gcloud compute firewall-rules create allow-port-15150-restricted \
   --project=${GCP_PROJECT_ID} \
   --network=default \
   --allow=tcp:15150 \
   --source-ranges=<external_ip_cidr-1>[,<external_ip_cidr-2>,...] 1

1: Specify one or more IP addresses or ranges in CIDR format, separated by commas. For example, 203.0.113.5/32,198.51.100.0/24.

Verification

Verify that port 15150 is open by running the following command:
```
$ gcloud compute firewall-rules list
```
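Because the --source-ranges value is a comma-separated list of CIDR blocks, you might want to check the list before creating the firewall rule. The following sketch validates IPv4 CIDR notation only; the function name validate_source_ranges is hypothetical and the check is an illustration, not a gcloud feature.

```shell
# Return 0 if every comma-separated entry is an IPv4 CIDR block
# (four octets in 0..255 and a prefix length in 0..32).
validate_source_ranges() {
  printf '%s\n' "$1" | awk -F, '
    NF == 0 { exit 1 }   # empty input
    {
      for (i = 1; i <= NF; i++) {
        # Shape check: a.b.c.d/n with digits only.
        if ($i !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+$/) exit 1
        # Range check on each octet and on the prefix length.
        split($i, part, "[./]")
        for (j = 1; j <= 4; j++) if (part[j] + 0 > 255) exit 1
        if (part[5] + 0 > 32) exit 1
      }
    }
  '
}
```

For example, validate_source_ranges "203.0.113.5/32,198.51.100.0/24" succeeds, while a list with a trailing comma or an out-of-range prefix fails.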

5.5. Creating the peer pods config map

You must create the peer pods config map.

Procedure

Log in to your Compute Engine instance to set the following environment variables:
1. Get the project ID by running the following command:
```
$ GCP_PROJECT_ID=$(gcloud config get-value project)
```
2. Get the zone by running the following command:
```
$ GCP_ZONE=$(gcloud config get-value compute/zone)
```
3. Retrieve a list of network names by running the following command:
```
$ gcloud compute networks list --format="value(name)"
```
4. Specify the network by running the following command:
```
$ GCP_NETWORK=<network_name>
```
  Only auto-mode networks are supported. Custom networks are not supported at this time.
Create a peer-pods-cm.yaml manifest file according to the following example:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "gcp"
  VXLAN_PORT: "9000"
  PROXY_TIMEOUT: "5m"
  GCP_MACHINE_TYPE: "e2-medium"
  GCP_PROJECT_ID: "<project_id>"
  GCP_ZONE: "<gcp_zone>"
  GCP_NETWORK: "<gcp_network>"
  TAGS: "key1=value1,key2=value2"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
```
GCP_MACHINE_TYPE
Defines the default machine type that is used if the machine type is not defined in the workload object.
TAGS
You can configure custom tags as comma-separated key=value pairs for pod VM instances to track peer pod costs or to identify peer pods in different clusters.
PEERPODS_LIMIT_PER_NODE
You can increase this value to run more peer pods on a node. The default value is 10.
ROOT_VOLUME_SIZE
You can increase this value for pods with larger container images. Specify the root volume size in gigabytes for the pod VM. The default and minimum size is 6 GB.
Create the config map by running the following command:
```
$ oc create -f peer-pods-cm.yaml
```
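The variables gathered at the start of this procedure can be substituted into the manifest with a short script. The following is a minimal sketch under stated assumptions: render_peer_pods_cm is a hypothetical helper, and the remaining values are copied from the example manifest above.

```shell
# Print a peer-pods-cm manifest built from GCP_PROJECT_ID, GCP_ZONE,
# and GCP_NETWORK. Other values are copied from the example manifest.
render_peer_pods_cm() {
  # Stop if any required variable is unset or empty.
  for name in GCP_PROJECT_ID GCP_ZONE GCP_NETWORK; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      return 1
    fi
  done
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "gcp"
  VXLAN_PORT: "9000"
  PROXY_TIMEOUT: "5m"
  GCP_MACHINE_TYPE: "e2-medium"
  GCP_PROJECT_ID: "${GCP_PROJECT_ID}"
  GCP_ZONE: "${GCP_ZONE}"
  GCP_NETWORK: "${GCP_NETWORK}"
  TAGS: "key1=value1,key2=value2"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
EOF
}
```

Run render_peer_pods_cm > peer-pods-cm.yaml in the shell where you set the variables, then review the file before creating the config map.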

5.6. Creating the peer pod VM image

You must create a QCOW2 peer pod virtual machine (VM) image.

Prerequisites

You have installed podman.
You have access to a container registry.

Procedure

Clone the OpenShift sandboxed containers repository by running the following command:
```
$ git clone https://github.com/openshift/sandboxed-containers-operator.git
```
Navigate to sandboxed-containers-operator/config/peerpods/podvm/bootc by running the following command:
```
$ cd sandboxed-containers-operator/config/peerpods/podvm/bootc
```
Log in to registry.redhat.io by running the following command:
```
$ podman login registry.redhat.io
```
You must log in to registry.redhat.io, because the podman build process must access the Containerfile.rhel container image hosted on the registry.
Set the image path for your container registry by running the following command:
```
$ IMG="<container_registry_url>/<username>/podvm-bootc:latest"
```
Build the pod VM bootc image by running the following command:
```
$ podman build -t ${IMG} -f Containerfile.rhel .
```
Log in to your container registry by running the following command:
```
$ podman login <container_registry_url>
```
Push the image to your container registry by running the following command:
```
$ podman push ${IMG}
```
For testing and development, you can make the image public.

Verify the podvm-bootc image by running the following command:

$ podman images

Example output

REPOSITORY                               TAG     IMAGE ID      CREATED         SIZE
example.com/example_user/podvm-bootc     latest  88ddab975a07  2 seconds ago   1.82 GB

5.7. Creating the peer pod VM image config map

Create the config map for the pod virtual machine (VM) image.

Procedure

Create a podvm-image-cm.yaml manifest with the following content:

apiVersion: v1
kind: ConfigMap
metadata:
  name: podvm-image-cm
  namespace: openshift-sandboxed-containers-operator
data:
  IMAGE_TYPE: pre-built
  PODVM_IMAGE_URI: <container_registry_url>/<username>/podvm-bootc:latest
  IMAGE_BASE_NAME: "podvm-image"
  IMAGE_VERSION: "0-0-0"

  INSTALL_PACKAGES: "no"
  DISABLE_CLOUD_CONFIG: "true"
  UPDATE_PEERPODS_CM: "yes"
  BOOT_FIPS: "no"

  BOOTC_BUILD_CONFIG: |
    [[customizations.user]]
    name = "peerpod"
    password = "peerpod"
    groups = ["wheel", "root"]

    [[customizations.filesystem]]
    mountpoint = "/"
    minsize = "5 GiB"

    [[customizations.filesystem]]
    mountpoint = "/var/kata-containers"
    minsize = "15 GiB"

Create the config map by running the following command:
```
$ oc create -f podvm-image-cm.yaml
```

5.8. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

A large OpenShift Container Platform deployment with a greater number of worker nodes.
Activation of the BIOS and Diagnostics utility.
Deployment on a hard disk drive rather than an SSD.
Deployment on physical nodes such as bare metal, rather than on virtual nodes.
A slow CPU and network.

Procedure

Create an example-kataconfig.yaml manifest file according to the following example:
```
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  enablePeerPods: true
  logLevel: info
#  kataConfigPoolSelector:
#    matchLabels:
#      <label_key>: '<label_value>'
```
<label_key>: '<label_value>'
Optional: If you have applied node labels to install kata-remote on specific nodes, specify the key and value, for example, kata-remote: 'true'.
Create the KataConfig CR by running the following command:
```
$ oc create -f example-kataconfig.yaml
```
The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.
Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.
Monitor the installation progress by running the following command:
```
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"
```
When the status of all workers under kataNodes is installed and the InProgress condition is False without a specified reason, kata-remote is installed on the cluster.
Verify the daemon set by running the following command:
```
$ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
```
Verify the runtime classes by running the following command:
```
$ oc get runtimeclass
```
Example output
```
NAME          HANDLER       AGE
kata          kata          34m
kata-remote   kata-remote   152m
```
You can also see the default kata runtime class in addition to kata-remote.
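The two checks in this procedure, the InProgress condition and the presence of the kata-remote runtime class, can be combined in one function. This is a sketch only: kata_install_done is a hypothetical name, and the JSONPath expression assumes that the InProgress condition is exposed under .status.conditions of the KataConfig CR.

```shell
# Succeed only if the KataConfig InProgress condition is False and the
# kata-remote runtime class exists.
kata_install_done() {
  name="${1:-example-kataconfig}"
  # Assumption: conditions are listed under .status.conditions.
  in_progress=$(oc get kataconfig "$name" \
    -o jsonpath='{.status.conditions[?(@.type=="InProgress")].status}') || return 1
  [ "$in_progress" = "False" ] || return 1
  # "oc get" returns a nonzero status if the runtime class is missing.
  oc get runtimeclass kata-remote -o name >/dev/null
}
```

For example, run kata_install_done && echo "kata-remote is ready" after the worker nodes reboot.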

5.9. Modifying the number of peer pod VMs per node

You can modify the limit of peer pod virtual machines (VMs) per node by editing the peerpodConfig custom resource (CR).

Procedure

Check the current limit by running the following command:

$ oc get peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  -o jsonpath='{.spec.limit}{"\n"}'

Specify a new value for the limit key by running the following command:

$ oc patch peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  --type merge --patch '{"spec":{"limit":"<value>"}}'
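To avoid quoting mistakes in the patch string, you can generate it from a validated number. The helper name build_limit_patch is hypothetical; the JSON shape is the one used in the command above.

```shell
# Print the merge patch for a new per-node limit.
# Rejects anything that is not a whole number.
build_limit_patch() {
  case "$1" in
    ''|*[!0-9]*)
      echo "limit must be a whole number" >&2
      return 1
      ;;
  esac
  printf '{"spec":{"limit":"%s"}}\n' "$1"
}
```

You can then pass the result to the patch command, for example with --patch "$(build_limit_patch 20)".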

5.10. Verifying the pod VM image

Procedure

Obtain the config map you created for the peer pods:

$ oc get configmap peer-pods-cm -n openshift-sandboxed-containers-operator -o yaml

Check the data stanza of the YAML file.
If the PODVM_IMAGE_NAME parameter is populated, the pod VM image was created successfully.

Troubleshooting

Retrieve the events log by running the following command:

$ oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-creation

Retrieve the job log by running the following command:

$ oc logs -n openshift-sandboxed-containers-operator jobs/osc-podvm-image-creation

If you cannot resolve the issue, submit a Red Hat Support case and attach the output of both logs.
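The check for the PODVM_IMAGE_NAME parameter can also be scripted by filtering the YAML output. The following sketch reads the config map YAML on standard input; podvm_image_name is a hypothetical helper and the image name in the usage note is illustrative.

```shell
# Read config map YAML on stdin and print the PODVM_IMAGE_NAME value,
# without surrounding double quotes. Prints nothing if the key is absent.
podvm_image_name() {
  awk '
    $1 == "PODVM_IMAGE_NAME:" {
      value = $2
      gsub(/^"|"$/, "", value)   # strip optional surrounding quotes
      print value
      exit
    }
  '
}
```

Pipe the output of the oc get configmap command above into podvm_image_name. An empty result means that the parameter is not populated yet.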

5.11. Customizing the Kata Agent policy

You can customize the Kata Agent policy to override the permissive default policy. The Kata Agent policy is a security mechanism that controls API requests for peer pods.

Important

You must override the default policy in a production environment.

As a minimum requirement, you must disable ExecProcessRequest to prevent a cluster administrator from accessing sensitive data by running the oc exec command on a peer pod.

You can use the default policy in development and test environments where security is not a concern, for example, in an environment where the control plane can be trusted.

A custom policy replaces the default policy entirely. To modify specific APIs, include the full policy and adjust the relevant rules.

Procedure

Create a custom policy.rego file by modifying the default policy:

package agent_policy

default AddARPNeighborsRequest := true
default AddSwapRequest := true
default CloseStdinRequest := true
default CopyFileRequest := true
default CreateContainerRequest := true
default CreateSandboxRequest := true
default DestroySandboxRequest := true
default GetMetricsRequest := true
default GetOOMEventRequest := true
default GuestDetailsRequest := true
default ListInterfacesRequest := true
default ListRoutesRequest := true
default MemHotplugByProbeRequest := true
default OnlineCPUMemRequest := true
default PauseContainerRequest := true
default PullImageRequest := true
default ReadStreamRequest := false
default RemoveContainerRequest := true
default RemoveStaleVirtiofsShareMountsRequest := true
default ReseedRandomDevRequest := true
default ResumeContainerRequest := true
default SetGuestDateTimeRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StartTracingRequest := true
default StatsContainerRequest := true
default StopTracingRequest := true
default TtyWinResizeRequest := true
default UpdateContainerRequest := true
default UpdateEphemeralMountsRequest := true
default UpdateInterfaceRequest := true
default UpdateRoutesRequest := true
default WaitProcessRequest := true
default ExecProcessRequest := false
default SetPolicyRequest := false
default WriteStreamRequest := false

ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
  "allowed_commands": [
        "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status"
  ]
}

The default policy allows all API calls. Adjust the true or false values to customize the policy further based on your needs.

Convert the policy.rego file to a Base64-encoded string by running the following command:
```
$ base64 -w0 policy.rego
```
Record the output.

Add the Base64-encoded policy string to the my-pod.yaml manifest:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  annotations:
    io.katacontainers.config.agent.policy: <base64_encoded_policy>
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <container_name>
    image: registry.access.redhat.com/ubi9/ubi:latest
    command:
    - sleep
    - "36000"
    securityContext:
      privileged: false
      seccompProfile:
        type: RuntimeDefault

Create the pod by running the following command:
```
$ oc create -f my-pod.yaml
```
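Because the annotation carries the whole policy as a single string, a truncated or mis-copied value is easy to miss. The following sketch encodes the policy and decodes it again to confirm that the round trip matches the source file. The function name encode_policy is hypothetical, and the -w0 option assumes GNU coreutils base64, as in the command above.

```shell
# Print the Base64-encoded policy on one line after verifying that it
# decodes back to the original file.
encode_policy() {
  policy_file="${1:-policy.rego}"
  if [ ! -r "$policy_file" ]; then
    echo "cannot read $policy_file" >&2
    return 1
  fi
  encoded=$(base64 -w0 "$policy_file") || return 1
  # Round-trip check: the decoded bytes must equal the source file.
  printf '%s' "$encoded" | base64 -d | cmp -s - "$policy_file" || return 1
  printf '%s\n' "$encoded"
}
```

Use the printed string as the value of the io.katacontainers.config.agent.policy annotation.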

5.12. Configuring your workload for OpenShift sandboxed containers

You configure your workload for OpenShift sandboxed containers by setting kata-remote as the runtime class for the following pod-templated objects:

Pod objects
ReplicaSet objects
ReplicationController objects
StatefulSet objects
Deployment objects
DeploymentConfig objects

Important

Do not deploy workloads in an Operator namespace. Create a dedicated namespace for these resources.

Prerequisites

You have created the KataConfig custom resource (CR).

Procedure

Add spec.runtimeClassName: kata-remote to the manifest of each pod-templated workload object as in the following example:
```
apiVersion: v1
kind: <object>
# ...
spec:
  runtimeClassName: kata-remote
# ...
```
Apply the changes to the workload object by running the following command:
```
$ oc apply -f <object.yaml>
```
OpenShift Container Platform creates the workload object and begins scheduling it.

Verification

Inspect the spec.runtimeClassName field of a pod-templated object. If the value is kata-remote, then the workload is running on OpenShift sandboxed containers.
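For a Pod object, the inspection can be sketched as follows. The function name uses_kata_remote and its arguments are hypothetical. For other pod-templated objects, the runtime class is set in the pod template, so the JSONPath expression would differ.

```shell
# Succeed if the named pod has spec.runtimeClassName set to kata-remote.
uses_kata_remote() {
  pod="$1"
  namespace="$2"
  runtime=$(oc get pod "$pod" -n "$namespace" \
    -o jsonpath='{.spec.runtimeClassName}') || return 1
  [ "$runtime" = "kata-remote" ]
}
```

For example, uses_kata_remote my-pod my-namespace returns a zero exit status only for a pod that requests the kata-remote runtime class.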

Chapter 6. Deploying OpenShift sandboxed containers on IBM Z and IBM LinuxONE

You can deploy OpenShift sandboxed containers on IBM Z® and IBM® LinuxONE.

Important

OpenShift sandboxed containers on IBM Z® and IBM® LinuxONE is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

6.1. Preparation

Review the following prerequisites and concepts before you deploy OpenShift sandboxed containers on IBM Z® and IBM® LinuxONE.

6.1.1. Prerequisites

You have installed the latest version of Red Hat OpenShift Container Platform.
Your OpenShift Container Platform cluster has three control plane nodes and at least two worker nodes.
The cluster nodes and peer pods are in the same IBM Z® KVM host logical partition.
The cluster nodes and peer pods are connected to the same subnet.

Additional resources

Installing OpenShift Container Platform on IBM Z® and IBM® LinuxONE

6.1.2. Peer pod resource requirements

You must ensure that your cluster has sufficient resources.

Peer pod virtual machines (VMs) require resources in two locations:

The worker node. The worker node stores metadata, Kata shim resources (containerd-shim-kata-v2), remote-hypervisor resources (cloud-api-adaptor), and the tunnel setup between the worker nodes and the peer pod VM.
The libvirt virtual machine instance. This is the actual peer pod VM running in the LPAR (KVM host).

Peer pod VM capacity on each worker node is tracked through an extended resource named kata.peerpods.io/vm, which enables the Kubernetes scheduler to handle capacity tracking and accounting.

You can edit the limit per node based on the requirements for your environment after you install the OpenShift sandboxed containers Operator.

The mutating webhook modifies a Kubernetes pod as follows:

The mutating webhook checks the pod for the expected RuntimeClassName value, specified in the TARGET_RUNTIMECLASS environment variable. If the value in the pod specification does not match the value in the TARGET_RUNTIMECLASS, the webhook exits without modifying the pod.
If the RuntimeClassName values match, the webhook makes the following changes to the pod spec:
1. The webhook removes every resource specification from the resources field of all containers and init containers in the pod.
2. The webhook adds the extended resource (kata.peerpods.io/vm) to the spec by modifying the resources field of the first container in the pod. The extended resource kata.peerpods.io/vm is used by the Kubernetes scheduler for accounting purposes.

Note

As a best practice, define a cluster-wide policy to only allow peer pod creation in specific namespaces.

6.2. Deployment overview

You deploy OpenShift sandboxed containers on IBM Z® and IBM® LinuxONE by performing the following steps:

Install the OpenShift sandboxed containers Operator on the OpenShift Container Platform cluster.
Optional: Configure the libvirt volume.
Optional: Create a custom peer pod VM image.
Create the peer pods secret.
Create the peer pods config map.
Create the pod VM image config map.
Create the KVM host secret.
Create the KataConfig custom resource.
Optional: Modify the number of virtual machines running on each worker node.
Disable insecure options by customizing the Kata Agent policy.
Optional: If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.
Optional: You can select a custom peer pod VM image.
Configure your workload for OpenShift sandboxed containers.

6.3. Installing and upgrading the OpenShift sandboxed containers Operator

You can install or upgrade the OpenShift sandboxed containers Operator by using the command line interface (CLI).

Note

You must configure the OpenShift sandboxed containers Operator subscription for manual updates by setting the value of installPlanApproval to Manual. Automatic updates are not supported.

Prerequisites

You have access to the cluster as a user with the cluster-admin role.

Procedure

Create an osc-namespace.yaml manifest file:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sandboxed-containers-operator

Create the namespace by running the following command:
```
$ oc apply -f osc-namespace.yaml
```

Create an osc-operatorgroup.yaml manifest file:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sandboxed-containers-operator-group
  namespace: openshift-sandboxed-containers-operator
spec:
  targetNamespaces:
  - openshift-sandboxed-containers-operator

Create the operator group by running the following command:
```
$ oc apply -f osc-operatorgroup.yaml
```

Create an osc-subscription.yaml manifest file:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sandboxed-containers-operator
  namespace: openshift-sandboxed-containers-operator
spec:
  channel: stable
  installPlanApproval: Manual
  name: sandboxed-containers-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: sandboxed-containers-operator.v1.11.1

Create the subscription by running the following command:
```
$ oc create -f osc-subscription.yaml
```

Get the InstallPlan CR for the OpenShift sandboxed containers Operator by running the following command:

$ oc get installplan -n openshift-sandboxed-containers-operator

Installation example output

NAME            CSV                                      APPROVAL  APPROVED
install-bl4fl   sandboxed-containers-operator.v1.11.1    Manual    false

Upgrade example output

NAME            CSV                                     APPROVAL   APPROVED
install-jdzrb   sandboxed-containers-operator.v1.11.1   Manual     false
install-pfk8l   sandboxed-containers-operator.v1.11.0   Manual     true

Approve the manual installation by running the following command:
```
$ oc patch installplan <installplan_name> -p '{"spec":{"approved":true}}' --type=merge -n openshift-sandboxed-containers-operator
```
<installplan_name>
Specify the InstallPlan resource. For example, install-jdzrb.
Verify that the Operator is correctly installed by running the following command:
```
$ oc get csv -n openshift-sandboxed-containers-operator
```
This command can take several minutes to complete.

Watch the process by running the following command:

$ watch oc get csv -n openshift-sandboxed-containers-operator

Example output

NAME                             DISPLAY                                   VERSION   REPLACES   PHASE
openshift-sandboxed-containers   openshift-sandboxed-containers-operator   1.11.1    1.11.0     Succeeded
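When several install plans exist, as in the upgrade example output, you can filter for the ones that still need approval. The following sketch assumes the column order shown in the example output (NAME, CSV, APPROVAL, APPROVED); pending_installplans is a hypothetical helper.

```shell
# Read "oc get installplan" output on stdin and print the names of
# install plans whose APPROVED column is false.
pending_installplans() {
  # NR > 1 skips the header row; $4 is the APPROVED column.
  awk 'NR > 1 && $4 == "false" { print $1 }'
}
```

Pipe the output of oc get installplan -n openshift-sandboxed-containers-operator into pending_installplans and use each printed name as <installplan_name> in the approval command.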

6.4. Configuring the libvirt volume

The OpenShift sandboxed containers Operator configures the libvirt volume and pool on your KVM host automatically during installation. If required, you can manually configure or create additional libvirt volumes and pools.

Prerequisites

You have installed the OpenShift sandboxed containers Operator on your OpenShift Container Platform cluster by using the OpenShift Container Platform web console or the command line.
You have administrator privileges for your KVM host.
You have installed podman on your KVM host.
You have installed virt-customize on your KVM host.
You have a /var/lib/libvirt/images/ directory for your images.

Procedure

Log in to the KVM host.
Set the name of the libvirt pool by running the following command:
```
$ export LIBVIRT_POOL=<libvirt_pool>
```
You need the LIBVIRT_POOL value to create the secret for the libvirt provider.
Set the name of the libvirt volume by running the following command:
```
$ export LIBVIRT_VOL_NAME=<libvirt_volume>
```
You need the LIBVIRT_VOL_NAME value to create the secret for the libvirt provider.
Set the path of the default storage pool location by running the following command:
```
$ export LIBVIRT_POOL_DIRECTORY="/var/lib/libvirt/images/"
```

Create a libvirt pool by running the following command:

$ virsh pool-define-as $LIBVIRT_POOL --type dir --target "$LIBVIRT_POOL_DIRECTORY"

Start the libvirt pool by running the following command:
```
$ virsh pool-start $LIBVIRT_POOL
```

Create a libvirt volume for the pool by running the following command:

$ virsh -c qemu:///system \
  vol-create-as --pool $LIBVIRT_POOL \
  --name $LIBVIRT_VOL_NAME \
  --capacity 20G \
  --allocation 2G \
  --prealloc-metadata \
  --format qcow2

6.5. Creating a custom peer pod VM image

You can create a custom peer pod virtual machine (VM) image instead of using the default Operator-built image.

You build an Open Container Initiative (OCI) container with the peer pod QCOW2 image. Later, you add the container registry URL and the image path to the peer pod VM image config map.

Procedure

Create a Dockerfile.podvm-oci file:

FROM scratch

ARG PODVM_IMAGE_SRC
ENV PODVM_IMAGE_PATH="/image/podvm.qcow2"

COPY $PODVM_IMAGE_SRC $PODVM_IMAGE_PATH

Build a container with the pod VM QCOW2 image by running the following command:
```
$ docker build -t podvm-libvirt \
  --build-arg PODVM_IMAGE_SRC=<podvm_image_source> \
  --build-arg PODVM_IMAGE_PATH=<podvm_image_path> \
  -f Dockerfile.podvm-oci .
```
<podvm_image_source>
Specify the QCOW2 image source on the host.
<podvm_image_path>
Optional: Specify the path of the QCOW2 image if you do not use the default, /image/podvm.qcow2.

6.6. Creating the peer pods secret

You must create a peer pods secret. The secret stores credentials for creating the pod virtual machine (VM) image and peer pod instances.

Prerequisites

LIBVIRT_URI. This value is the default gateway IP address of the libvirt network. Check your libvirt network setup to obtain this value.
Note
If libvirt uses the default bridge virtual network, you can obtain the LIBVIRT_URI by running the following commands:
```
$ virtint=$(bridge_line=$(virsh net-info default | grep Bridge);  echo "${bridge_line//Bridge:/}" | tr -d [:blank:])

$ LIBVIRT_URI=$( ip -4 addr show $virtint | grep -oP '(?<=inet\s)\d+(\.\d+){3}')

$ LIBVIRT_GATEWAY_URI="qemu+ssh://root@${LIBVIRT_URI}/system?no_verify=1"
```
REDHAT_OFFLINE_TOKEN. You have generated this token to download the RHEL image at Red Hat API Tokens.

Procedure

Create a peer-pods-secret.yaml manifest file according to the following example:

apiVersion: v1
kind: Secret
metadata:
  name: peer-pods-secret
  namespace: openshift-sandboxed-containers-operator
type: Opaque
stringData:
  CLOUD_PROVIDER: "libvirt"
  LIBVIRT_URI: "<libvirt_gateway_uri>" 1
  REDHAT_OFFLINE_TOKEN: "<rh_offline_token>" 2

1: Specify the libvirt URI.
2: Specify the Red Hat offline token, which is required for the Operator-built image.

Create the secret by running the following command:
```
$ oc create -f peer-pods-secret.yaml
```
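The commands in the prerequisites note extract the bridge name from the virsh net-info output and then build the libvirt URI from the gateway address. The same two steps can be sketched as small functions. Both function names are hypothetical, and the sketch assumes that virsh net-info prints a line that starts with Bridge: followed by the interface name.

```shell
# Read "virsh net-info <network>" output on stdin and print the bridge name.
bridge_from_net_info() {
  awk '$1 == "Bridge:" { print $2; exit }'
}

# Print the libvirt URI for a gateway IPv4 address, using the same
# pattern as LIBVIRT_GATEWAY_URI in the note.
libvirt_gateway_uri() {
  printf 'qemu+ssh://root@%s/system?no_verify=1\n' "$1"
}
```

For example, virsh net-info default | bridge_from_net_info prints the bridge interface, and libvirt_gateway_uri 192.168.122.1 prints the value to use for LIBVIRT_URI in the secret.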

6.7. Creating the peer pods config map

You must create the peer pods config map.

Procedure

Create a peer-pods-cm.yaml manifest file according to the following example:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: peer-pods-cm
  namespace: openshift-sandboxed-containers-operator
data:
  CLOUD_PROVIDER: "libvirt"
  LIBVIRT_POOL: "<libvirt_pool>"
  LIBVIRT_VOL_NAME: "<libvirt_volume>"
  LIBVIRT_DIR_NAME: "/var/lib/libvirt/images/<directory_name>"
  LIBVIRT_NET: "default"
  PEERPODS_LIMIT_PER_NODE: "10"
  ROOT_VOLUME_SIZE: "6"
  DISABLECVM: "true"
```
LIBVIRT_POOL
If you have manually configured the libvirt pool, use the same name as in your KVM host configuration.
LIBVIRT_VOL_NAME
If you have manually configured the libvirt volume, use the same name as in your KVM host configuration.
LIBVIRT_DIR_NAME
Specify the libvirt directory for storing virtual machine disk images, such as .qcow2, or .raw files. To ensure libvirt has read and write access permissions, use a subdirectory of the libvirt storage directory. The default is /var/lib/libvirt/images/.
LIBVIRT_NET
Specify a libvirt network if you do not want to use the default network.
PEERPODS_LIMIT_PER_NODE
You can increase this value to run more peer pods on a node. The default value is 10.
ROOT_VOLUME_SIZE
You can increase this value for pods with larger container images. Specify the root volume size in gigabytes for the pod VM. The default and minimum size is 6 GB.
Create the config map by running the following command:
```
$ oc create -f peer-pods-cm.yaml
```

6.8. Creating the peer pod VM image config map

You must create a config map for the peer pod virtual machine (VM) image.

Prerequisites

You must create an activation key by using the Red Hat Hybrid Cloud Console.
Optional: If you want to use a Cloud API Adaptor custom image, you must have the name, URL, and the branch or tag of the image.

Procedure

Create a libvirt-podvm-image-cm.yaml manifest according to the following example:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: libvirt-podvm-image-cm
  namespace: openshift-sandboxed-containers-operator
data:
  PODVM_DISTRO: "rhel"
  DOWNLOAD_SOURCES: "no"
  CAA_SRC: "https://github.com/confidential-containers/cloud-api-adaptor"
  CAA_REF: "main"
  CONFIDENTIAL_COMPUTE_ENABLED: "yes"
  UPDATE_PEERPODS_CM: "yes"
  ORG_ID: "<rhel_organization_id>"
  ACTIVATION_KEY: "<rhel_activation_key>"
  PODVM_IMAGE_URI: "oci::<image_repo_url>:<image_tag>::<image_path>"
  SE_BOOT: "true"
  BASE_OS_VERSION: "<rhel_image_os_version>"
  SE_VERIFY: "false"
```
DOWNLOAD_SOURCES
Specify yes if you want to use the custom Cloud API Adaptor source to build the pod VM image.
CAA_SRC
Optional: Specify the URL of the Cloud API Adaptor custom image.
CAA_REF
Optional: Specify the branch or tag of the Cloud API Adaptor custom image.
ACTIVATION_KEY
Specify your RHEL activation key.
PODVM_IMAGE_URI
Optional: If you created a custom peer pod VM image, specify the container registry URL, the image tag, and the image path (default: /image/podvm.qcow2). Otherwise, set the value to "".
SE_BOOT
The default value, true, enables IBM Secure Execution for the default Operator-built image. If you use a custom peer pod VM image, set it to false.
BASE_OS_VERSION
Specify the RHEL image operating system version. IBM Z® Secure Execution supports RHEL 10.1 and later versions.
SE_VERIFY
Specify false if you do not want to verify Secure Execution with the digicert CA certificate. The default value is true.
Create the config map by running the following command:
```
$ oc create -f libvirt-podvm-image-cm.yaml
```
The libvirt pod VM image config map is created for your libvirt provider.
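The PODVM_IMAGE_URI value combines three parts with fixed separators, which is easy to mistype. The following sketch assembles the string from its parts. The function name podvm_image_uri and the registry path in the usage note are hypothetical; the format and the default image path come from the example above.

```shell
# Print a PODVM_IMAGE_URI value: oci::<repo>:<tag>::<path>
# The image path defaults to /image/podvm.qcow2.
podvm_image_uri() {
  repo="$1"
  tag="$2"
  path="${3:-/image/podvm.qcow2}"
  if [ -z "$repo" ] || [ -z "$tag" ]; then
    echo "usage: podvm_image_uri <image_repo_url> <image_tag> [<image_path>]" >&2
    return 1
  fi
  printf 'oci::%s:%s::%s\n' "$repo" "$tag" "$path"
}
```

For example, podvm_image_uri quay.io/example/podvm-libvirt v1 prints oci::quay.io/example/podvm-libvirt:v1::/image/podvm.qcow2.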

6.9. Creating the KVM host secret

You must create the secret for your KVM host.

Procedure

Generate an SSH key pair by running the following command:
```
$ ssh-keygen -f ./id_rsa -N ""
```
Copy the public SSH key to your KVM host:
```
$ ssh-copy-id -i ./id_rsa.pub <LIBVIRT_IP>
```
<LIBVIRT_IP>
Specify the libvirt IP address of your KVM host or the LPAR where the peer pod VM is running. For example, 192.168.122.1.

Create the Secret object by running the following command:

$ oc create secret generic ssh-key-secret \
  -n openshift-sandboxed-containers-operator \
  --from-file=id_rsa.pub=./id_rsa.pub \
  --from-file=id_rsa=./id_rsa

Delete the SSH keys you created:
```
$ shred --remove id_rsa.pub id_rsa
```

6.10. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

A large OpenShift Container Platform deployment with a greater number of worker nodes.
Activation of the BIOS and Diagnostics utility.
Deployment on a hard disk drive rather than an SSD.
Deployment on physical nodes such as bare metal, rather than on virtual nodes.
A slow CPU and network.

Procedure

Create an example-kataconfig.yaml manifest file according to the following example:
```
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  enablePeerPods: true
  logLevel: info
#  kataConfigPoolSelector:
#    matchLabels:
#      <label_key>: '<label_value>'
```
<label_key>: '<label_value>'
Optional: If you have applied node labels to install kata-remote on specific nodes, specify the key and value, for example, kata-remote: 'true'.
Create the KataConfig CR by running the following command:
```
$ oc create -f example-kataconfig.yaml
```
The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.
Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.
Monitor the installation progress by running the following command:
```
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"
```
When the status of all workers under kataNodes is installed and the InProgress condition is False without a specified reason, kata-remote is installed on the cluster.

Verify that you have built the peer pod image and uploaded it to the libvirt volume by running the following command:

$ oc describe configmap peer-pods-cm -n openshift-sandboxed-containers-operator

Example output

Name: peer-pods-cm
Namespace: openshift-sandboxed-containers-operator
Labels: <none>
Annotations: <none>

Data
====
CLOUD_PROVIDER: libvirt

BinaryData
====
Events: <none>

Monitor the kata-oc machine config pool progress to ensure that it is in the UPDATED state, when UPDATEDMACHINECOUNT equals MACHINECOUNT, by running the following command:
```
$ watch oc get mcp/kata-oc
```
Verify the daemon set by running the following command:
```
$ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
```
Verify the runtime classes by running the following command:
```
$ oc get runtimeclass
```
Example output
```
NAME          HANDLER       AGE
kata          kata          34m
kata-remote   kata-remote   152m
```
You can also see the default kata runtime class in addition to kata-remote.
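
The wait step in this procedure can be scripted. The following is a minimal sketch, not part of the product tooling. The function name wait_for_kata_install and the 30-second polling interval are illustrative, and the JSONPath expression assumes that the InProgress condition described in Appendix A is stored under .status.conditions of the KataConfig CR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll the KataConfig CR until InProgress is False.
# Assumption: the condition is stored under .status.conditions.
wait_for_kata_install() {
  local name="${1:-example-kataconfig}"
  local interval="${2:-30}"   # seconds between polls (illustrative default)
  local status
  while true; do
    status="$(oc get kataconfig "$name" \
      -o jsonpath='{.status.conditions[?(@.type=="InProgress")].status}')"
    if [ "$status" = "False" ]; then
      echo "kata-remote installation finished for ${name}"
      return 0
    fi
    echo "installation still in progress (InProgress=${status:-unknown})"
    sleep "$interval"
  done
}
```

Run wait_for_kata_install example-kataconfig after you create the CR. The loop has no timeout and does not inspect the reason field, so a Failed reason keeps it polling. Add a deadline if you use it in automation.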

6.11. Modifying the number of peer pod VMs per node

You can modify the limit of peer pod virtual machines (VMs) per node by editing the peerpodConfig custom resource (CR).

Procedure

Check the current limit by running the following command:

$ oc get peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  -o jsonpath='{.spec.limit}{"\n"}'

Specify a new value for the limit key by running the following command:

$ oc patch peerpodconfig peerpodconfig-openshift -n openshift-sandboxed-containers-operator \
  --type merge --patch '{"spec":{"limit":"<value>"}}'
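
Because the patch embeds the new limit in a JSON string, it can help to validate the value before sending it to the cluster. The following sketch assumes that the limit is a non-negative whole number. The function names build_limit_patch and set_peer_pod_limit are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the merge patch for spec.limit after checking
# that the requested value is a non-negative whole number (an assumption).
build_limit_patch() {
  local value="$1"
  case "$value" in
    ''|*[!0-9]*)
      echo "limit must be a non-negative integer, got: '${value}'" >&2
      return 1
      ;;
  esac
  printf '{"spec":{"limit":"%s"}}\n' "$value"
}

# Hypothetical wrapper around the documented oc patch command.
set_peer_pod_limit() {
  local patch
  patch="$(build_limit_patch "$1")" || return 1
  oc patch peerpodconfig peerpodconfig-openshift \
    -n openshift-sandboxed-containers-operator \
    --type merge --patch "$patch"
}
```

For example, set_peer_pod_limit 20 sends the patch {"spec":{"limit":"20"}}, and set_peer_pod_limit abc fails before it contacts the cluster.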

6.12. Customizing the Kata Agent policy

You can customize the Kata Agent policy to override the permissive default policy. The Kata Agent policy is a security mechanism that controls API requests for peer pods.

Important

You must override the default policy in a production environment.

As a minimum requirement, you must disable ExecProcessRequest to prevent a cluster administrator from accessing sensitive data by running the oc exec command on a peer pod.

You can use the default policy in development and test environments where security is not a concern, for example, in an environment where the control plane can be trusted.

A custom policy replaces the default policy entirely. To modify specific APIs, include the full policy and adjust the relevant rules.

Procedure

Create a custom policy.rego file by modifying the default policy:

package agent_policy

default AddARPNeighborsRequest := true
default AddSwapRequest := true
default CloseStdinRequest := true
default CopyFileRequest := true
default CreateContainerRequest := true
default CreateSandboxRequest := true
default DestroySandboxRequest := true
default GetMetricsRequest := true
default GetOOMEventRequest := true
default GuestDetailsRequest := true
default ListInterfacesRequest := true
default ListRoutesRequest := true
default MemHotplugByProbeRequest := true
default OnlineCPUMemRequest := true
default PauseContainerRequest := true
default PullImageRequest := true
default ReadStreamRequest := false
default RemoveContainerRequest := true
default RemoveStaleVirtiofsShareMountsRequest := true
default ReseedRandomDevRequest := true
default ResumeContainerRequest := true
default SetGuestDateTimeRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StartTracingRequest := true
default StatsContainerRequest := true
default StopTracingRequest := true
default TtyWinResizeRequest := true
default UpdateContainerRequest := true
default UpdateEphemeralMountsRequest := true
default UpdateInterfaceRequest := true
default UpdateRoutesRequest := true
default WaitProcessRequest := true
default ExecProcessRequest := false
default SetPolicyRequest := false
default WriteStreamRequest := false

ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
  "allowed_commands": [
        "curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status"
  ]
}

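The default policy allows all API calls. This example sets ExecProcessRequest, ReadStreamRequest, SetPolicyRequest, and WriteStreamRequest to false and adds an ExecProcessRequest rule that permits only the commands listed in policy_data.allowed_commands. Adjust the true or false values to customize the policy further based on your needs.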

Convert the policy.rego file to a Base64-encoded string by running the following command:
```
$ base64 -w0 policy.rego
```
Record the output.

Add the Base64-encoded policy string to the my-pod.yaml manifest:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  annotations:
    io.katacontainers.config.agent.policy: <base64_encoded_policy>
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <container_name>
    image: registry.access.redhat.com/ubi9/ubi:latest
    command:
    - sleep
    - "36000"
    securityContext:
      privileged: false
      seccompProfile:
        type: RuntimeDefault

Create the pod by running the following command:
```
$ oc create -f my-pod.yaml
```
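
A policy string that is truncated or wrapped when you copy it produces an annotation that cannot be decoded. The following sketch wraps the encoding step in a hypothetical encode_policy function that also decodes the result and compares it with the source file. It assumes the GNU coreutils base64 and cmp utilities.

```shell
#!/usr/bin/env bash
# Hypothetical helper: Base64-encode a policy file on a single line and
# confirm that decoding the result reproduces the file byte for byte.
encode_policy() {
  local policy_file="$1"
  local encoded
  encoded="$(base64 -w0 "$policy_file")" || return 1
  if ! printf '%s' "$encoded" | base64 -d | cmp -s - "$policy_file"; then
    echo "round-trip check failed for ${policy_file}" >&2
    return 1
  fi
  printf '%s\n' "$encoded"
}
```

For example, POLICY="$(encode_policy policy.rego)" stores the string that replaces <base64_encoded_policy> in the manifest.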

6.13. Configuring a pull secret for peer pods

If you select a custom peer pod VM image from a private registry such as registry.access.redhat.com, you must configure a pull secret for peer pods.

Then, you can link the pull secret to the default service account or you can specify the pull secret in the peer pod manifest.

Procedure

Set the NS variable to the namespace where you deploy your peer pods:
```
$ NS=<namespace>
```
Copy the pull secret to the peer pod namespace:
```
$ oc get secret pull-secret -n openshift-config -o yaml \
  | sed "s/namespace: openshift-config/namespace: ${NS}/" \
  | oc apply -n "${NS}" -f -
```
You can use the cluster pull secret, as in this example, or a custom pull secret.
Optional: Link the pull secret to the default service account:
```
$ oc secrets link default pull-secret --for=pull -n ${NS}
```

Alternatively, add the pull secret to the peer pod manifest:

apiVersion: v1
kind: Pod
spec:
  containers:
  - name: <container_name>
    image: <image_name>
  imagePullSecrets:
  - name: pull-secret
# ...
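
The copy step relies on a sed substitution to retarget the secret at the peer pod namespace. The following sketch isolates that substitution in a hypothetical retarget_namespace function so that you can preview its effect on a manifest before applying anything. The namespace my-peer-pods in the usage example is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the same namespace substitution that the copy
# step uses, reading a manifest on stdin and writing the result to stdout.
retarget_namespace() {
  local ns="$1"
  sed "s/namespace: openshift-config/namespace: ${ns}/"
}
```

For example, oc get secret pull-secret -n openshift-config -o yaml | retarget_namespace my-peer-pods prints the rewritten manifest without creating the secret.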

6.14. Selecting a custom peer pod VM image

You create a new libvirt volume in your libvirt pool and upload the custom peer pod VM image to the new volume. Then, you update the pod manifest to use the custom peer pod VM image.

Prerequisites

If the custom peer pod VM image is in a private registry, you have created a pull secret.

Procedure

Set the LIBVIRT_POOL variable by running the following command:
```
$ export LIBVIRT_POOL=<libvirt_pool>
```
Set the LIBVIRT_VOL_NAME variable to a new libvirt volume by running the following command:
```
$ export LIBVIRT_VOL_NAME=<new_libvirt_volume>
```

Create a libvirt volume for the pool by running the following command:

$ virsh -c qemu:///system \
  vol-create-as --pool $LIBVIRT_POOL \
  --name $LIBVIRT_VOL_NAME \
  --capacity 20G \
  --allocation 2G \
  --prealloc-metadata \
  --format qcow2

Upload the custom peer pod VM image to the new libvirt volume:

$ virsh -c qemu:///system vol-upload \
  --vol $LIBVIRT_VOL_NAME <custom_podvm_image.qcow2> \
  --pool $LIBVIRT_POOL --sparse

Create a my-pod-manifest.yaml file according to the following example:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod-manifest
  annotations:
    io.katacontainers.config.hypervisor.image: "<new_libvirt_volume>"
spec:
  runtimeClassName: kata-remote
  containers:
  - name: <example_container>
    image: registry.access.redhat.com/ubi9/ubi:9.3
    command: ["sleep", "36000"]

Create the pod by running the following command:
```
$ oc create -f my-pod-manifest.yaml
```
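
Before you create the pod, you can confirm that the volume named in the annotation exists in the pool. The following pre-flight sketch uses the virsh vol-info subcommand and the LIBVIRT_POOL and LIBVIRT_VOL_NAME variables from this procedure. The function name check_podvm_volume is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm that the libvirt volume referenced
# by the pod annotation exists in the pool before creating the pod.
check_podvm_volume() {
  if [ -z "${LIBVIRT_POOL:-}" ] || [ -z "${LIBVIRT_VOL_NAME:-}" ]; then
    echo "set LIBVIRT_POOL and LIBVIRT_VOL_NAME first" >&2
    return 2
  fi
  if virsh -c qemu:///system vol-info --pool "$LIBVIRT_POOL" \
      "$LIBVIRT_VOL_NAME" >/dev/null 2>&1; then
    echo "volume ${LIBVIRT_VOL_NAME} found in pool ${LIBVIRT_POOL}"
  else
    echo "volume ${LIBVIRT_VOL_NAME} not found in pool ${LIBVIRT_POOL}" >&2
    return 1
  fi
}
```

Run check_podvm_volume on the libvirt host in the same shell session in which you exported the variables.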

6.15. Configuring your workload for OpenShift sandboxed containers

You configure your workload for OpenShift sandboxed containers by setting kata-remote as the runtime class for the following pod-templated objects:

Pod objects
ReplicaSet objects
ReplicationController objects
StatefulSet objects
Deployment objects
DeploymentConfig objects

Important

Do not deploy workloads in an Operator namespace. Create a dedicated namespace for these resources.

Prerequisites

You have created the KataConfig custom resource (CR).

Procedure

Add spec.runtimeClassName: kata-remote to the manifest of each pod-templated workload object as in the following example:
```
apiVersion: v1
kind: <object>
# ...
spec:
  runtimeClassName: kata-remote
# ...
```
Apply the changes to the workload object by running the following command:
```
$ oc apply -f <object.yaml>
```
OpenShift Container Platform creates the workload object and begins scheduling it.

Verification

Inspect the spec.runtimeClassName field of a pod-templated object. If the value is kata-remote, then the workload is running on OpenShift sandboxed containers.
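
In Kubernetes, objects that embed a pod template, such as Deployment objects, carry the pod specification under spec.template.spec, so the field is spec.template.spec.runtimeClassName for those objects. The following sketch writes an example Deployment manifest and reads back the runtime class of a running pod. The function names and the example-deployment, example, and example-container names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write an example Deployment manifest. For objects that
# embed a pod template, runtimeClassName sits in the template's pod spec.
write_example_deployment() {
  local path="$1"
  cat > "$path" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment        # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      runtimeClassName: kata-remote
      containers:
      - name: example-container   # illustrative name
        image: registry.access.redhat.com/ubi9/ubi:9.3
        command: ["sleep", "36000"]
EOF
}

# Hypothetical helper: print the runtime class of a running pod.
pod_runtime_class() {
  oc get pod "$1" -n "$2" -o jsonpath='{.spec.runtimeClassName}{"\n"}'
}
```

For example, run write_example_deployment deployment.yaml, apply the file with oc apply -f deployment.yaml in your workload namespace, and then run pod_runtime_class <pod> <namespace> against one of the resulting pods.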

Chapter 7. Monitoring

You can use the OpenShift Container Platform web console to monitor metrics related to the health status of your sandboxed workloads and nodes.

OpenShift sandboxed containers has a pre-configured dashboard available in the OpenShift Container Platform web console. Administrators can also access and query raw metrics through Prometheus.

7.1. About metrics

OpenShift sandboxed containers metrics enable administrators to monitor how their sandboxed containers are running. You can query these metrics in the Metrics UI in the OpenShift Container Platform web console.

OpenShift sandboxed containers metrics are collected for the following categories:

Kata agent metrics: Kata agent metrics display information about the kata agent process running in the VM embedded in your sandboxed containers. These metrics include data from /proc/<pid>/[io, stat, status].
Kata guest operating system metrics: Kata guest operating system metrics display data from the guest operating system running in your sandboxed containers. These metrics include data from /proc/[stat, diskstats, meminfo, vmstat] and /proc/net/dev.
Hypervisor metrics: Hypervisor metrics display data regarding the hypervisor running the VM embedded in your sandboxed containers. These metrics mainly include data from /proc/<pid>/[io, stat, status].
Kata monitor metrics: Kata monitor is the process that gathers metric data and makes it available to Prometheus. The kata monitor metrics display detailed information about the resource usage of the kata-monitor process itself. These metrics also include counters from Prometheus data collection.
Kata containerd shim v2 metrics: Kata containerd shim v2 metrics display detailed information about the kata shim process. These metrics include data from /proc/<pid>/[io, stat, status] and detailed resource usage metrics.

7.2. Viewing metrics

You can access the metrics for OpenShift sandboxed containers on the Metrics page in the OpenShift Container Platform web console.

Prerequisites

You have access to the cluster as a user with the cluster-admin role or with view permissions for all projects.

Procedure

In the OpenShift Container Platform web console, navigate to Observe → Metrics.
In the input field, enter the query for the metric you want to observe.
All kata-related metrics begin with kata. Typing kata displays a list of all available kata metrics.

The metrics from your query are visualized on the page.

Chapter 8. Uninstalling OpenShift sandboxed containers

You uninstall OpenShift sandboxed containers by performing the following tasks:

Delete the workload pods.
Delete the KataConfig custom resource (CR).
Uninstall the OpenShift sandboxed containers Operator.
Delete the KataConfig custom resource definition (CRD).

Important

You must delete the workload pods before deleting the KataConfig CR. The pod names usually have the prefix podvm and custom tags, if provided.

If you deployed OpenShift sandboxed containers on a cloud provider and any resources remain after following these procedures, you might receive an unexpected bill for those resources from your cloud provider. Once you complete uninstalling OpenShift sandboxed containers on a cloud provider, check the cloud provider console to ensure that the procedures deleted all of the resources.

8.1. Deleting workload pods

You can delete the OpenShift sandboxed containers workload pods by using the CLI.

Prerequisites

You have the JSON processor (jq) utility installed.

Procedure

Search for the pods by running the following command:

$ oc get pods -A -o json | jq -r '.items[]
  | select(.spec.runtimeClassName == "<runtime>")
  | "\(.metadata.namespace) \(.metadata.name)"'

Delete each pod by running the following command:
```
$ oc delete pod <pod> -n <namespace>
```
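
When many pods use the runtime class, deleting them one at a time is error-prone. The following sketch reads namespace and pod name pairs and deletes each pod in its own namespace. The function name delete_listed_pods is hypothetical, and the input format of one "<namespace> <pod>" pair per line is an assumption of this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "<namespace> <pod>" pairs from stdin, one per
# line, and delete each pod in its own namespace.
delete_listed_pods() {
  local ns pod
  while read -r ns pod; do
    # Skip blank or incomplete lines.
    [ -n "$ns" ] && [ -n "$pod" ] || continue
    # Keep oc from consuming the remaining input lines.
    oc delete pod "$pod" -n "$ns" </dev/null
  done
}
```

For example, the following pipeline lists the pods that use kata-remote and passes them to the helper:

```shell
oc get pods -A -o json \
  | jq -r '.items[] | select(.spec.runtimeClassName == "kata-remote") | "\(.metadata.namespace) \(.metadata.name)"' \
  | delete_listed_pods
```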

Important

When uninstalling OpenShift sandboxed containers deployed using a cloud provider, you must delete all of the pods. Any remaining pod resources might result in an unexpected bill from your cloud provider.

8.2. Deleting the KataConfig custom resource

You delete the KataConfig custom resource (CR) by using the command line.

Procedure

Delete the KataConfig CR by running the following command:
```
$ oc delete kataconfig example-kataconfig
```
Verify the CR removal by running the following command:
```
$ oc get kataconfig example-kataconfig
```
Example output
```
No example-kataconfig instances exist
```

Important

You must ensure that all pods are deleted. Any remaining pod resources might result in an unexpected bill from your cloud provider.

8.3. Uninstalling the OpenShift sandboxed containers Operator

You uninstall the OpenShift sandboxed containers Operator by using the command line.

Procedure

Delete the subscription by running the following command:

$ oc delete subscription sandboxed-containers-operator -n openshift-sandboxed-containers-operator

Delete the namespace by running the following command:
```
$ oc delete namespace openshift-sandboxed-containers-operator
```

8.4. Deleting the KataConfig CRD

You delete the KataConfig custom resource definition (CRD) by using the command line.

Prerequisites

You have deleted the KataConfig custom resource.
You have uninstalled the OpenShift sandboxed containers Operator.

Procedure

Delete the KataConfig CRD by running the following command:
```
$ oc delete crd kataconfigs.kataconfiguration.openshift.io
```
Verify that the CRD was deleted by running the following command:
```
$ oc get crd kataconfigs.kataconfiguration.openshift.io
```
Example output
```
Unknown CRD kataconfigs.kataconfiguration.openshift.io
```

Chapter 9. Upgrading

The upgrade of the OpenShift sandboxed containers components consists of the following steps:

Upgrade OpenShift Container Platform to update the Kata runtime and its dependencies.
Upgrade the OpenShift sandboxed containers Operator to update the Operator subscription.

You can upgrade OpenShift Container Platform before or after the OpenShift sandboxed containers Operator upgrade, with the one exception noted below. Always apply the KataConfig patch immediately after upgrading the OpenShift sandboxed containers Operator.

9.1. Upgrading resources

Red Hat Enterprise Linux CoreOS (RHCOS) extensions deploy the OpenShift sandboxed containers resources onto the cluster.

The sandboxed containers RHCOS extension contains the required components to run OpenShift sandboxed containers, such as the Kata containers runtime, the QEMU hypervisor, and other dependencies. You upgrade the extension by upgrading the cluster to a new release of OpenShift Container Platform.

For more information about upgrading OpenShift Container Platform, see Updating Clusters.

9.2. Upgrading the Operator

Use Operator Lifecycle Manager (OLM) to upgrade the OpenShift sandboxed containers Operator either manually or automatically. Selecting between manual or automatic upgrade during the initial deployment determines the future upgrade mode. For manual upgrades, the OpenShift Container Platform web console shows the available updates that the cluster administrator can install.

For more information about upgrading the OpenShift sandboxed containers Operator in Operator Lifecycle Manager (OLM), see Updating installed Operators.

9.3. Updating the pod VM image

For peer pod deployments, you must update the pod VM image. Upgrading the OpenShift sandboxed containers Operator when enablePeerPods is set to true does not update the pod VM image automatically. To update the image, you must delete and re-create the KataConfig custom resource (CR).

Note

You can also check the peer pod config map for AWS and Azure deployments to ensure that the image ID is empty before re-creating the KataConfig CR.

9.3.1. Deleting the KataConfig custom resource

You delete the KataConfig custom resource (CR) by using the command line.

Procedure

Delete the KataConfig CR by running the following command:
```
$ oc delete kataconfig example-kataconfig
```
Verify the CR removal by running the following command:
```
$ oc get kataconfig example-kataconfig
```
Example output
```
No example-kataconfig instances exist
```

Important

You must ensure that all pods are deleted. Any remaining pod resources might result in an unexpected bill from your cloud provider.

9.3.2. Verifying the image ID is empty

For AWS and Azure deployments, after you delete the KataConfig custom resource (CR), you must verify that the image ID in the peer pods config map is empty.

Procedure

Obtain the peer pods config map by running the following command:
```
$ oc get configmap -n openshift-sandboxed-containers-operator peer-pods-cm -o jsonpath="{.data.<image_id>}"
```
For AWS, replace <image_id> with PODVM_AMI_ID. For Azure, replace <image_id> with AZURE_IMAGE_ID.
If the value is not empty, update the value and patch the config map by running the following command:
```
$ oc patch configmap peer-pods-cm -n openshift-sandboxed-containers-operator -p '{"data":{"<image_id>":""}}'
```
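
The check and the conditional patch can be combined. The following sketch maps the provider to the config map key named in this procedure and clears the value only when it is set. The function names image_id_key and clear_image_id and the provider arguments aws and azure are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a provider name to the config map key that holds
# the pod VM image ID (keys taken from this procedure).
image_id_key() {
  case "$1" in
    aws)   echo "PODVM_AMI_ID" ;;
    azure) echo "AZURE_IMAGE_ID" ;;
    *)     echo "unsupported provider: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: clear the image ID only when it is currently set.
clear_image_id() {
  local key value
  key="$(image_id_key "$1")" || return 1
  value="$(oc get configmap peer-pods-cm \
    -n openshift-sandboxed-containers-operator \
    -o jsonpath="{.data.${key}}")"
  if [ -z "$value" ]; then
    echo "${key} is already empty"
    return 0
  fi
  oc patch configmap peer-pods-cm \
    -n openshift-sandboxed-containers-operator \
    -p "{\"data\":{\"${key}\":\"\"}}"
}
```

For example, run clear_image_id aws on an AWS deployment after you delete the KataConfig CR.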

9.3.3. Creating the KataConfig custom resource

You must create the KataConfig custom resource (CR) to install kata-remote as a runtime class on your worker nodes.

OpenShift sandboxed containers installs kata-remote as a secondary, optional runtime on the cluster and not as the primary runtime.

Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors can increase the reboot time:

A large OpenShift Container Platform deployment with a greater number of worker nodes.
Activation of the BIOS and Diagnostics utility.
Deployment on a hard disk drive rather than an SSD.
Deployment on physical nodes such as bare metal, rather than on virtual nodes.
A slow CPU and network.

Procedure

Create an example-kataconfig.yaml manifest file according to the following example:
```
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
spec:
  enablePeerPods: true
  logLevel: info
#  kataConfigPoolSelector:
#    matchLabels:
#      <label_key>: '<label_value>'
```
Optional: If you have applied node labels to install kata-remote on specific nodes, uncomment the kataConfigPoolSelector block and replace <label_key>: '<label_value>' with your label key and value, for example, kata-remote: 'true'.
Create the KataConfig CR by running the following command:
```
$ oc create -f example-kataconfig.yaml
```
The new KataConfig CR is created and installs kata-remote as a runtime class on the worker nodes.
Wait for the kata-remote installation to complete and the worker nodes to reboot before verifying the installation.
Monitor the installation progress by running the following command:
```
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"
```
When the status of all workers under kataNodes is installed and the InProgress condition is False without a reason, kata-remote is installed on the cluster.

Verify that you have built the peer pod image and uploaded it to the libvirt volume by running the following command:

$ oc describe configmap peer-pods-cm -n openshift-sandboxed-containers-operator

Example output

Name: peer-pods-cm
Namespace: openshift-sandboxed-containers-operator
Labels: <none>
Annotations: <none>

Data
====
CLOUD_PROVIDER: libvirt

BinaryData
====
Events: <none>

Monitor the kata-oc machine config pool progress to ensure that it is in the UPDATED state, when UPDATEDMACHINECOUNT equals MACHINECOUNT, by running the following command:
```
$ watch oc get mcp/kata-oc
```
Verify the daemon set by running the following command:
```
$ oc get -n openshift-sandboxed-containers-operator ds/osc-caa-ds
```
Verify the runtime classes by running the following command:
```
$ oc get runtimeclass
```
Example output
```
NAME          HANDLER       AGE
kata          kata          34m
kata-remote   kata-remote   152m
```
You can also see the default kata runtime class in addition to kata-remote.
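
After you re-create the KataConfig CR, a script can confirm the final verification step without reading the table output. The following sketch checks that the runtime class exists and that its handler matches its name, as in the example output. The function name runtime_class_ready is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the named runtime class exists and
# its handler matches its name, as in the example output.
runtime_class_ready() {
  local name="${1:-kata-remote}"
  local handler
  handler="$(oc get runtimeclass "$name" -o jsonpath='{.handler}' 2>/dev/null)" || return 1
  [ "$handler" = "$name" ]
}
```

For example, runtime_class_ready kata-remote && echo "kata-remote is available" prints the message only when the runtime class is present.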

Chapter 10. Troubleshooting

When troubleshooting OpenShift sandboxed containers, you can open a support case and provide debugging information using the must-gather tool.

If you are a cluster administrator, you can also review logs on your own, enabling a more detailed level of logs.

10.1. Collecting data for Red Hat Support

When opening a support case, it is helpful to provide debugging information about your cluster to Red Hat Support.

The must-gather tool enables you to collect diagnostic information about your OpenShift Container Platform cluster, including virtual machines and other data related to OpenShift sandboxed containers.

For prompt support, supply diagnostic information for both OpenShift Container Platform and OpenShift sandboxed containers.

Using the must-gather tool

The oc adm must-gather CLI command collects the information from your cluster that is most likely needed for debugging issues, including:

Resource definitions
Service logs

By default, the oc adm must-gather command uses the default plugin image and writes into ./must-gather.local.

Alternatively, you can collect specific information by running the command with the appropriate arguments:

To collect data related to one or more specific features, use the --image argument with an image.
For example:
```
$ oc adm must-gather --image=registry.redhat.io/openshift-sandboxed-containers/osc-must-gather-rhel9:1.11.1
```
To collect the audit logs, use the -- /usr/bin/gather_audit_logs argument.
For example:
```
$ oc adm must-gather -- /usr/bin/gather_audit_logs
```
Note
Audit logs are not collected as part of the default set of information to reduce the size of the files.

When you run oc adm must-gather, a new pod with a random name is created in a new project on the cluster. The data is collected on that pod and saved in a new directory that starts with must-gather.local. This directory is created in the current working directory.

For example:

NAMESPACE                      NAME                 READY   STATUS      RESTARTS      AGE
...
openshift-must-gather-5drcj    must-gather-bklx4    2/2     Running     0             72s
openshift-must-gather-5drcj    must-gather-s8sdh    2/2     Running     0             72s
...

Optionally, you can run the oc adm must-gather command in a specific namespace by using the --run-namespace option.

For example:

$ oc adm must-gather --run-namespace <namespace> --image=registry.redhat.io/openshift-sandboxed-containers/osc-must-gather-rhel9:1.11.1
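
If you collect data several times while you reproduce an issue, separate output directories keep the runs apart. The following sketch uses the --dest-dir option of oc adm must-gather with the image from the examples in this section. The function name collect_osc_data and the timestamped directory name are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: collect OpenShift sandboxed containers data into a
# timestamped directory by using the --dest-dir option of oc adm must-gather.
collect_osc_data() {
  local image="registry.redhat.io/openshift-sandboxed-containers/osc-must-gather-rhel9:1.11.1"
  local dest="${1:-./must-gather-osc-$(date +%Y%m%d-%H%M%S)}"
  oc adm must-gather --image="$image" --dest-dir="$dest"
}
```

Run collect_osc_data without arguments to use the generated directory name, or pass a path such as collect_osc_data ./case-data to choose the destination.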

10.2. Collecting log data

The following features and objects are associated with OpenShift sandboxed containers:

All namespaces and their child objects that belong to OpenShift sandboxed containers resources
All OpenShift sandboxed containers custom resource definitions (CRDs)

You can collect the following component logs for each pod running with the kata runtime:

Kata agent logs
Kata runtime logs
QEMU logs
Audit logs
CRI-O logs

10.2.1. Enabling debug logs for CRI-O runtime

You can enable debug logs by updating the logLevel field in the KataConfig CR. This changes the log level in the CRI-O runtime for the worker nodes running OpenShift sandboxed containers.

Prerequisites

You have installed the OpenShift CLI (oc).
You have access to the cluster as a user with the cluster-admin role.

Procedure

Change the logLevel field in your existing KataConfig CR to debug:

$ oc patch kataconfig <kataconfig> --type merge --patch '{"spec":{"logLevel":"debug"}}'

Monitor the kata-oc machine config pool until the value of UPDATED is True, indicating that all worker nodes are updated:

$ oc get mcp kata-oc

Example output

NAME     CONFIG                 UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
kata-oc  rendered-kata-oc-169   False    True      False     3             1                  1                    0                     9h
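
In the example output, UPDATED is False and only 1 of 3 machines is updated, so the pool is still rolling out the change. The following sketch makes that check explicit by parsing the table output. The function name mcp_is_updated is hypothetical, and the parsing assumes the column headers shown in the example output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `oc get mcp kata-oc` output on stdin and succeed
# only when UPDATED is True and UPDATEDMACHINECOUNT equals MACHINECOUNT.
# Columns are located by header name rather than by fixed position.
mcp_is_updated() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    NR == 2 {
      ok = ($(col["UPDATED"]) == "True" &&
            $(col["UPDATEDMACHINECOUNT"]) == $(col["MACHINECOUNT"]))
      exit
    }
    END { exit(ok ? 0 : 1) }
  '
}
```

For example, oc get mcp kata-oc | mcp_is_updated && echo "all worker nodes are updated" prints the message only after the rollout completes.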

Verification

Start a debug session with a node in the machine config pool:
```
$ oc debug node/<node_name>
```
Change the root directory to /host:
```
# chroot /host
```
Verify the changes in the crio.conf file:
```
# crio config | egrep 'log_level'
```
Example output
```
log_level = "debug"
```

10.2.2. Viewing debug logs for components

Cluster administrators can use the debug logs to troubleshoot issues. The logs for each node are printed to the node journal.

You can review the logs for the following OpenShift sandboxed containers components:

Kata agent
Kata runtime (containerd-shim-kata-v2)
virtiofsd

QEMU only generates warning and error logs. These warnings and errors print to the node journal in both the Kata runtime logs and the CRI-O logs with an extra qemuPid field.

Example of QEMU logs

Mar 11 11:57:28 openshift-worker-0 kata[2241647]: time="2023-03-11T11:57:28.587116986Z" level=info msg="Start logging QEMU (qemuPid=2241693)" name=containerd-shim-v2 pid=2241647 sandbox=d1d4d68efc35e5ccb4331af73da459c13f46269b512774aa6bde7da34db48987 source=virtcontainers/hypervisor subsystem=qemu

Mar 11 11:57:28 openshift-worker-0 kata[2241647]: time="2023-03-11T11:57:28.607339014Z" level=error msg="qemu-kvm: -machine q35,accel=kvm,kernel_irqchip=split,foo: Expected '=' after parameter 'foo'" name=containerd-shim-v2 pid=2241647 qemuPid=2241693 sandbox=d1d4d68efc35e5ccb4331af73da459c13f46269b512774aa6bde7da34db48987 source=virtcontainers/hypervisor subsystem=qemu

Mar 11 11:57:28 openshift-worker-0 kata[2241647]: time="2023-03-11T11:57:28.60890737Z" level=info msg="Stop logging QEMU (qemuPid=2241693)" name=containerd-shim-v2 pid=2241647 sandbox=d1d4d68efc35e5ccb4331af73da459c13f46269b512774aa6bde7da34db48987 source=virtcontainers/hypervisor subsystem=qemu

The Kata runtime prints Start logging QEMU when QEMU starts, and Stop logging QEMU when QEMU stops. The error appears between these two log messages with the qemuPid field. The actual error message from QEMU is the entry with level=error.

The console of the QEMU guest is printed to the node journal as well. You can view the guest console logs together with the Kata agent logs.

Prerequisites

You have installed the OpenShift CLI (oc).
You have access to the cluster as a user with the cluster-admin role.

Procedure

To review the Kata agent logs and guest console logs, run the following command:

$ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata -g "reading guest console"

To review the Kata runtime logs, run the following command:
```
$ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata
```

To review the virtiofsd logs, run the following command:

$ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t virtiofsd

To review the QEMU logs, run the following command:

$ oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata -g "qemuPid=\d+"
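
The QEMU query also returns the informational Start logging and Stop logging entries, because their messages contain the qemuPid value. The following sketch keeps only the error-level entries. The function name qemu_errors is hypothetical, and the filter assumes the level=error and qemuPid=<number> fields shown in the example QEMU logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Kata journal lines on stdin and print only the
# error-level entries that carry a qemuPid field.
qemu_errors() {
  grep -E 'level=error' | grep -E 'qemuPid=[0-9]+'
}
```

For example, oc debug node/<nodename> -- journalctl -D /host/var/log/journal -t kata | qemu_errors prints only the QEMU error entries for the node.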

10.3. Additional resources

Gathering data about your cluster

Appendix A. KataConfig status messages

The following table displays the status messages for the KataConfig custom resource (CR) for a cluster with two worker nodes.

Table A.1. KataConfig status messages

Initial installation

When a KataConfig CR is created and starts installing kata-remote on both workers, the following status is displayed for a few seconds.

 conditions:
    message: Performing initial installation of kata-remote on cluster
    reason: Installing
    status: 'True'
    type: InProgress
 kataNodes:
   nodeCount: 0
   readyNodeCount: 0

Installing

Within a few seconds the status changes.

 kataNodes:
   nodeCount: 2
   readyNodeCount: 0
   waitingToInstall:
   - worker-0
   - worker-1

Installing (Worker-1 installation starting)

For a short period of time, the status changes, signifying that one node has initiated the installation of kata-remote, while the other is in a waiting state. This is because only one node can be unavailable at any given time. The nodeCount remains at 2 because both nodes will eventually receive kata-remote, but the readyNodeCount is currently 0 as neither of them has reached that state yet.

 kataNodes:
   installing:
   - worker-1
   nodeCount: 2
   readyNodeCount: 0
   waitingToInstall:
   - worker-0

Installing (Worker-1 installed, worker-0 installation started)

After some time, worker-1 will complete its installation, causing a change in the status. The readyNodeCount is updated to 1, indicating that worker-1 is now prepared to execute kata-remote workloads. You cannot schedule or run kata-remote workloads until the runtime class is created at the end of the installation process.

 kataNodes:
   installed:
   - worker-1
   installing:
   - worker-0
   nodeCount: 2
   readyNodeCount: 1

Installed

When installed, both workers are listed as installed, and the InProgress condition transitions to False without specifying a reason, indicating the successful installation of kata-remote on the cluster.

 conditions:
    message: ""
    reason: ""
    status: 'False'
    type: InProgress
 kataNodes:
   installed:
   - worker-0
   - worker-1
   nodeCount: 2
   readyNodeCount: 2

Initial uninstall

If kata-remote is installed on both workers, and you delete the KataConfig to remove kata-remote from the cluster, both workers briefly enter a waiting state for a few seconds.

 conditions:
    message: Removing kata-remote from cluster
    reason: Uninstalling
    status: 'True'
    type: InProgress
 kataNodes:
   nodeCount: 0
   readyNodeCount: 0
   waitingToUninstall:
   - worker-0
   - worker-1

Uninstalling

After a few seconds, one of the workers starts uninstalling.

 kataNodes:
   nodeCount: 0
   readyNodeCount: 0
   uninstalling:
   - worker-1
   waitingToUninstall:
   - worker-0

Uninstalling

Worker-1 finishes and worker-0 starts uninstalling.

 kataNodes:
   nodeCount: 0
   readyNodeCount: 0
   uninstalling:
   - worker-0

Note

The reason field can also report the following causes:

Failed: This is reported if the node cannot finish its transition. The status reports True and the message is Node <node_name> Degraded: <error_message_from_the_node>.
BlockedByExistingKataPods: This is reported if there are pods running on a cluster that use the kata-remote runtime while kata-remote is being uninstalled. The status field is False and the message is Existing pods using "kata-remote" RuntimeClass found. Please delete the pods manually for KataConfig deletion to proceed. There could also be a technical error message reported like Failed to list kata pods: <error_message> if communication with the cluster control plane fails.

Legal Notice

Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution–Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.

Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.

Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, LLC. or its subsidiaries in the United States and other countries.

Linux® is the registered trademark of Linus Torvalds in the United States and other countries.

XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.

The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.

All other trademarks are the property of their respective owners.