Operator enterprise topology
The Operator-based enterprise topology provides redundancy and higher compute for large volumes of automation on Red Hat OpenShift Container Platform.
The Ansible Automation Platform Service on AWS is an example of an OpenShift Operator-based enterprise topology.
Included are the tested infrastructure topology, system requirements, network port configurations, and an example custom resource file for installation.
You can only install a single instance of the Ansible Automation Platform Operator into a single namespace. Installing multiple instances in the same namespace can lead to improper operation for both Operator instances.
Infrastructure topology
The Red Hat tested infrastructure topology for this deployment model:
While Redis and PostgreSQL can be installed as part of the operator-based installation process, the topology diagram represents a Red Hat supported topology where both Redis and PostgreSQL are external to Ansible Automation Platform.
This infrastructure topology describes an OpenShift Cluster with 3 primary nodes and 2 worker nodes.
Red Hat tests each OpenShift Worker node with these requirements:
| Requirement | Minimum requirement |
|---|---|
| RAM | 16 GB |
| CPUs | 4 |
| Local disk | 128 GB |
| Disk IOPS | 3000 |
| Count | Component |
|---|---|
| 1 | Automation controller web pod |
| 1 | Automation controller task pod |
| 1 | Automation hub web pod |
| 1 | Automation hub API pod |
| 2 | Automation hub content pod |
| 2 | Automation hub worker pod |
| 1 | Automation hub Redis pod |
| 1 | Event-Driven Ansible API pod |
| 2 | Event-Driven Ansible activation worker pod |
| 2 | Event-Driven Ansible default worker pod |
| 2 | Event-Driven Ansible event stream pod |
| 1 | Event-Driven Ansible scheduler pod |
| 1 | Platform gateway pod |
| 1 | Metrics service web pod |
| 1 | Metrics service tasks pod |
| 1 | Metrics service scheduler pod |
| 2 | Mesh ingress pod |
| N/A | Externally managed database service |
| N/A | Externally managed Redis |
| N/A | Externally managed object storage service (for automation hub) |
Tested system configurations
Red Hat has tested these configurations to install and run Red Hat Ansible Automation Platform:
| Type | Description |
|---|---|
| Subscription | Valid Red Hat Ansible Automation Platform subscription |
| Red Hat OpenShift | |
| Ansible-core | Ansible-core version 2.16 or later |
| Browser | A currently supported version of Mozilla Firefox or Google Chrome. |
| AWS RDS PostgreSQL service | Note: The external database must meet the requirements listed under "Minimum external database requirements". |
| Metrics service database | Read-only access to |
| AWS Memcached Service | |
| S3 storage | HTTPS only; accessible through an AWS role assigned to the automation hub service account (SA) at runtime by using AWS Pod Identity |
| IP version | IPv4, IPv6 (single-stack and dual-stack) |
Minimum external database requirements
The external database must meet these minimum requirements:
- 4 vCPUs
- 16 GB RAM
- max_connections: 1024 (minimum). You might need more connections when scaling replicas.
- 200 GB storage on a volume capable of at least 3000 IOPS.
- Support for 4 separate databases: automationcontroller, automationhub, automationeda, metrics_service
- Cross-database permissions: the metrics_service database requires the ms_awx_readonly user with SELECT privileges on the automationcontroller database
Database storage consumption depends on your workload, including job frequency, playbook task count, output verbosity, and the number of managed hosts per job. Start with a 200 GB baseline and monitor actual usage after deployment. Configure automated cleanup jobs to prevent unbounded database growth. These requirements ensure adequate database performance for the enterprise topology workload profile.
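The database list and the cross-database permission above can be sketched as a small shell helper that prints the SQL for psql. This is an illustration, not the product's provisioning procedure: it assumes the automation controller tables live in the public schema and that the connecting role is allowed to create databases and roles. The function name emit_db_setup_sql is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print SQL that creates the four databases and the read-only role.
# Assumptions (not from the product documentation): controller tables are in
# the "public" schema, and the SQL is run by a role that can create
# databases and roles.

emit_db_setup_sql() {
  local db
  for db in automationcontroller automationhub automationeda metrics_service; do
    printf 'CREATE DATABASE %s;\n' "$db"
  done
  # \connect is a psql meta-command, so this output is meant for psql only.
  cat <<'SQL'
CREATE ROLE ms_awx_readonly LOGIN;
\connect automationcontroller
GRANT CONNECT ON DATABASE automationcontroller TO ms_awx_readonly;
GRANT USAGE ON SCHEMA public TO ms_awx_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ms_awx_readonly;
SQL
}

emit_db_setup_sql
```

The output could be piped to psql, for example `emit_db_setup_sql | psql -h <host> -U <admin-user> -d postgres`. The role is created without a password here; set one separately, for example with the psql `\password ms_awx_readonly` command. `GRANT SELECT ON ALL TABLES` covers only tables that exist when it runs, so the grant statements might need to be repeated after automation controller has created its schema.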
Example custom resource file
For example CR files, see the ocp-b.env-a directory in the test-topologies GitHub repository.
The following example shows an AnsibleAutomationPlatform custom resource configured for enterprise topology with external databases.

    apiVersion: aap.ansible.com/v1alpha1
    kind: AnsibleAutomationPlatform
    metadata:
      name: aap
      namespace: aap
    spec:
      controller:
        postgres_configuration_secret: <controller-db-secret>
      hub:
        storage_type: s3
        object_storage_s3_secret: <s3-secret>
      eda:
        automation_server_ssl_verify: "no"
      metrics:
        database:
          database_secret: aap-metrics-postgres-configuration
          externally_managed: true
        ms_awx_readonly_user_secret: aap-metrics-read-token
        ms_awx_readonly_user:
          externally_managed: true

Nonfunctional requirements
Ansible Automation Platform’s performance characteristics and capacity depend on its resource allocation and configuration. With OpenShift, each Ansible Automation Platform component deploys as a pod. You can specify resource requests and limits for each pod.
Use the Ansible Automation Platform custom resource to configure resource allocation for OpenShift installations. Each configurable item has default settings. These settings are the exact configuration used in this reference deployment architecture. This configuration assumes deployment and management by an Enterprise IT organization for production purposes.
By default, each component’s deployments set minimum resource requests but no resource limits. OpenShift schedules a pod only onto a node that has enough unreserved capacity to satisfy the pod's resource requests. However, pods can consume RAM or CPU without limit as long as the OpenShift worker node is not under node pressure.
In the Operator enterprise topology, Ansible Automation Platform runs on a Red Hat OpenShift on AWS (ROSA) Hosted Control Plane (HCP) cluster. The cluster has 2 t3.xlarge worker nodes spread across 2 AWS availability zones within a single region. This is not a shared environment so Ansible Automation Platform pods have full access to all compute resources of the ROSA HCP cluster.
The capacity calculation for automation controller task pods is derived from the underlying HCP worker node that runs the pod. However, the pod does not have access to the CPU or memory resources of the entire node. This capacity calculation influences how many concurrent jobs automation controller can run.
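As a rough illustration of why node-derived capacity matters, the following sketch estimates fork capacity from node resources. It assumes the upstream AWX defaults of 4 forks per CPU and 100 MB of memory per fork after reserving 2048 MB; those constants, and the t3.xlarge figures of 4 vCPUs and 16 GiB, are assumptions that are not stated in this document. The function name controller_capacity is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: estimate automation controller fork capacity from node resources.
# Assumed constants (upstream AWX defaults, not confirmed by this document):
# 4 forks per CPU, 100 MB per fork, 2048 MB reserved for the system.

controller_capacity() {
  local cpus=$1 mem_mb=$2
  local forks_per_cpu=4 mem_per_fork_mb=100 reserved_mb=2048
  local cpu_cap=$(( cpus * forks_per_cpu ))
  local mem_cap=$(( (mem_mb - reserved_mb) / mem_per_fork_mb ))
  # Never report less than one fork of memory-based capacity.
  (( mem_cap < 1 )) && mem_cap=1
  echo "$cpu_cap $mem_cap"
}

# A t3.xlarge worker node: 4 vCPUs, 16 GiB (16384 MB) of memory.
controller_capacity 4 16384   # prints: 16 143
```

The two numbers are the CPU-based and memory-based fork estimates. Because the task pod shares the node with other pods, the resources it can actually use might be lower than these node-level figures suggest.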
OpenShift manages storage differently from VMs, which affects how automation hub stores its artifacts. In the Operator enterprise topology, automation hub uses S3 storage. Automation hub requires ReadWriteMany type storage, which is not a default storage type in OpenShift.
This topology specifies externally provided Redis, PostgreSQL, and object storage for automation hub. This provides additional scalability and reliability features for the Ansible Automation Platform deployment. These features include specialized backup, restore, and replication services, as well as scalable storage.
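The example custom resource refers to the external database through a secret name such as <controller-db-secret>. A minimal sketch of generating such a secret follows. The key names (host, port, database, username, password, sslmode, type) and the value "unmanaged" are assumptions to verify against the operator documentation, and the host, user, and password values are placeholders. The function name emit_db_secret is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a Secret manifest for an external PostgreSQL database.
# Key names and the "unmanaged" type value are assumptions; verify them
# against the operator documentation before use.

emit_db_secret() {
  local name=$1 namespace=$2 host=$3 database=$4 username=$5 password=$6
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${name}
  namespace: ${namespace}
type: Opaque
stringData:
  host: "${host}"
  port: "5432"
  database: "${database}"
  username: "${username}"
  password: "${password}"
  sslmode: "prefer"
  type: "unmanaged"
EOF
}

# Placeholder values for illustration only.
emit_db_secret controller-db-secret aap db.example.com automationcontroller awx 'change-me'
```

The manifest could then be applied with `emit_db_secret ... | oc apply -f -`. Values that contain double quotes or backslashes would need YAML escaping, which this sketch does not handle.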
Metrics service resource allocation
In enterprise topology, metrics service runs as 3 pods with the following resource recommendations:
| Pod | CPU Request | Memory Request | CPU Limit | Memory Limit | Replicas |
|---|---|---|---|---|---|
| metrics-web | 500m | 2 Gi | 1000m | 4 Gi | 1-2 |
| metrics-tasks | 500m | 2 Gi | 1000m | 4 Gi | 1 |
| metrics-scheduler | 500m | 2 Gi | 1000m | 4 Gi | 1 (must not scale) |
Scaling considerations:
- metrics-web pod: Can be scaled to 2 replicas for high availability and load distribution
- metrics-tasks pod: Cannot be scaled past 1 replica
- metrics-scheduler pod: Must remain at 1 replica to prevent duplicate scheduled tasks
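To see what these recommendations add up to, the following sketch totals the requests and limits from the table, with metrics-web at 2 replicas. The variable and function names are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: total CPU and memory for the metrics service pods.
# Columns: pod, replicas, CPU request (m), memory request (Gi),
#          CPU limit (m), memory limit (Gi). Values come from the table above.
metrics_pods="metrics-web 2 500 2 1000 4
metrics-tasks 1 500 2 1000 4
metrics-scheduler 1 500 2 1000 4"

# Prints: <CPU requests m> <memory requests Gi> <CPU limits m> <memory limits Gi>
total_metrics_resources() {
  awk '{ cr += $2 * $3; mr += $2 * $4; cl += $2 * $5; ml += $2 * $6 }
       END { print cr, mr, cl, ml }' <<< "$1"
}

total_metrics_resources "$metrics_pods"   # prints: 2000 8 4000 16
```

With these values, the metrics service requests 2000m CPU and 8 Gi of memory in total, and its limits allow up to 4000m CPU and 16 Gi. Compare these totals with the capacity left on the worker nodes after the other platform pods are scheduled.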
Configure pod resource requests and limits in the AnsibleAutomationPlatform CR:

    spec:
      metrics:
        disabled: false
        web:
          replicas: 2
          resource_requirements:
            requests:
              cpu: 500m
              memory: 2Gi
            limits:
              cpu: 1000m
              memory: 4Gi
        task:
          replicas: 1  # Cannot be scaled past 1
          resource_requirements:
            requests:
              cpu: 500m
              memory: 2Gi
            limits:
              cpu: 1000m
              memory: 4Gi
        scheduler:
          replicas: 1  # Must be 1
          resource_requirements:
            requests:
              cpu: 100m
              memory: 512Mi
            limits:
              cpu: 200m
              memory: 1Gi

To spread the web replicas across worker nodes, add a topology spread constraint:

    spec:
      metrics:
        web:
          topology_spread_constraints:
            - maxSkew: 1
              topologyKey: kubernetes.io/hostname
              whenUnsatisfiable: ScheduleAnyway

Metrics service automatic provisioning
When you create an AnsibleAutomationPlatform custom resource with metrics service enabled, the operator automatically provisions:
1. MetricsService custom resource
- Defines the metrics service deployment (web, tasks, and scheduler pods)
- Configures database connection secrets
- Sets resource limits and replicas
2. Database configuration
- Reads customer-provided database secrets (external database scenario) or creates managed database credentials
- Creates Kubernetes Secrets for both database connections:
  - <instance>-automationmetricsservice-postgres-configuration: metrics service database
  - <instance>-automationmetricsservice-awx-postgres-configuration: automation controller read-only credentials
- Database connectivity is verified at pod start time by an init container in the web pod, which polls until manage.py check --database default succeeds. The AWX read-only connection is validated at application runtime, not during operator reconciliation.
3. Service routing
- Creates a Kubernetes Service (<instance>-automationmetricsservice-service) on port 8000, targeting the web pod on port 8080
- Registers the metrics service with the platform gateway (Envoy) at /api/metrics/, making the API accessible through the standard Ansible Automation Platform gateway URL
4. Backup integration
- Backup resources are not created automatically during provisioning. They are created on demand when you trigger a backup by applying an AnsibleAutomationPlatformBackup custom resource. The operator then creates a MetricsServiceBackup CR, which provisions a PersistentVolumeClaim for backup staging and runs a pg_dump of the metrics database.
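The naming pattern in the list above can be turned into a short helper that prints the resources to look for. This is a sketch: it assumes <instance> is the name of the AnsibleAutomationPlatform custom resource (aap in the example CR), and the function name expected_metrics_resources is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the secrets and service the operator is expected to create
# for a given instance name, in type/name form.
# Assumption: <instance> is the AnsibleAutomationPlatform CR name.

expected_metrics_resources() {
  local instance=$1
  printf 'secret/%s-automationmetricsservice-postgres-configuration\n' "$instance"
  printf 'secret/%s-automationmetricsservice-awx-postgres-configuration\n' "$instance"
  printf 'service/%s-automationmetricsservice-service\n' "$instance"
}

expected_metrics_resources aap
```

Because the output uses the type/name form, it could be passed to `oc get`, for example `expected_metrics_resources aap | xargs oc get -n <namespace>`, to confirm that each resource exists.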
Validation
After operator reconciliation completes, verify metrics service provisioning:

    # Check MetricsService CR status
    oc get metricsservice -n <namespace>

    # Verify all 3 pods are running
    oc get pods -n <namespace> | grep automationmetricsservice

Expected output:

    <aap-name>-automationmetricsservice-web-xxxxx        1/1   Running
    <aap-name>-automationmetricsservice-tasks-xxxxx      1/1   Running
    <aap-name>-automationmetricsservice-scheduler-xxxxx  1/1   Running

    # Check web pod init container logs for database readiness
    oc logs <aap-name>-automationmetricsservice-web-xxxxx -c wait-for-db -n <namespace>
    # Should show: "Database is ready"

    # Verify the service exists and is on the correct port
    oc get svc -n <namespace> | grep automationmetricsservice
    # Expected: <aap-name>-automationmetricsservice-service ClusterIP ... 8000/TCP

Network ports
Red Hat Ansible Automation Platform uses several ports to communicate with its services. These ports must be open and not blocked by a firewall for Red Hat Ansible Automation Platform to work.
| Port number | Protocol | Service | Source | Destination |
|---|---|---|---|---|
| 80/443 | HTTP/HTTPS | Object storage | OpenShift Container Platform cluster | External object storage service |
| 80/443 | HTTP/HTTPS | Receptor | Execution node | OpenShift Container Platform ingress |
| 80/443 | HTTP/HTTPS | Receptor | Hop node | OpenShift Container Platform ingress |
| 5432 | TCP | PostgreSQL | OpenShift Container Platform cluster | External database service |
| 5432 | TCP | PostgreSQL | OpenShift Container Platform cluster | External database service ( |
| 5432 | TCP | PostgreSQL | OpenShift Container Platform cluster | External database service ( |
| 6379 | TCP | Redis | OpenShift Container Platform cluster | External Redis service |
| 27199 | TCP | Receptor | OpenShift Container Platform cluster | Execution node |
| 27199 | TCP | Receptor | OpenShift Container Platform cluster | Hop node |
Metrics service pods communicate internally within the OpenShift cluster via the platform gateway. The /api/metrics/ path is routed through the standard Ansible Automation Platform gateway and does not require a separate external ingress. Metrics service requires outbound connectivity on port 5432 to both the metrics_service database and the automationcontroller database (read-only).
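One way to check the outbound rows of the port table is a small TCP probe. The sketch below relies on bash's /dev/tcp redirection and the coreutils timeout command. The hostnames are hypothetical placeholders, and the function names probe_tcp and check_endpoints are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: probe outbound TCP endpoints from the port table.
# Hostnames below are hypothetical placeholders; replace them with the
# addresses of your external services.
endpoints="PostgreSQL db.example.com 5432
Redis redis.example.com 6379
Object-storage s3.example.com 443"

# Returns 0 if a TCP connection to host:port opens within 3 seconds.
probe_tcp() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Reads "service host port" lines and reports each endpoint's status.
check_endpoints() {
  local service host port
  while read -r service host port; do
    if probe_tcp "$host" "$port"; then
      echo "$service $host:$port open"
    else
      echo "$service $host:$port unreachable"
    fi
  done <<< "$1"
}
```

Calling `check_endpoints "$endpoints"` prints one status line per endpoint. Because the source of these connections is the OpenShift cluster, the probe is only meaningful when run from inside the cluster, for example from a shell in a pod in the same namespace. A successful probe shows only that the port accepts connections, not that credentials or TLS settings are correct.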