Operator enterprise topology - Red Hat Ansible Automation Platform 2.7

The Ansible Automation Platform Service on AWS is an example of an OpenShift Operator based enterprise topology.

Included are the tested infrastructure topology, system requirements, network port configurations, and an example custom resource file for installation.

Important:

You can only install a single instance of the Ansible Automation Platform Operator into a single namespace. Installing multiple instances in the same namespace can lead to improper operation for both Operator instances.

Infrastructure topology

The Red Hat tested infrastructure topology for this deployment model:

Operator enterprise topology diagram — Figure 1. Infrastructure topology diagram

Important:

While Redis and PostgreSQL can be installed as part of the operator-based installation process, the topology diagram represents a Red Hat supported topology where both Redis and PostgreSQL are external to Ansible Automation Platform.

This infrastructure topology describes an OpenShift Cluster with 3 primary nodes and 2 worker nodes.

Red Hat tests each OpenShift Worker node with these requirements:

Table 1. OpenShift Worker node requirements
Requirement	Minimum requirement
RAM	16 GB
CPUs	4
Local disk	128 GB
Disk IOPS	3000

Table 2. Infrastructure topology components
Count	Component
1	Automation controller web pod
1	Automation controller task pod
1	Automation hub web pod
1	Automation hub API pod
2	Automation hub content pod
2	Automation hub worker pod
1	Automation hub Redis pod
1	Event-Driven Ansible API pod
2	Event-Driven Ansible activation worker pod
2	Event-Driven Ansible default worker pod
2	Event-Driven Ansible event stream pod
1	Event-Driven Ansible scheduler pod
1	Platform gateway pod
1	Metrics service web pod
1	Metrics service tasks pod
1	Metrics service scheduler pod
2	Mesh ingress pod
N/A	Externally managed database service
N/A	Externally managed Redis
N/A	Externally managed object storage service (for automation hub)

Tested system configurations

Red Hat has tested these configurations to install and run Red Hat Ansible Automation Platform:

Table 3. Tested system configurations
Type	Description
Subscription	Valid Red Hat Ansible Automation Platform subscription
Red Hat OpenShift	Red Hat OpenShift on AWS Hosted Control Planes 4.15.16 2 worker nodes in different availability zones (AZs) at t3.xlarge
Ansible-core	Ansible-core version 2.16 or later
Browser	A currently supported version of Mozilla Firefox or Google Chrome.
AWS RDS PostgreSQL service	engine: "postgres" engine_version: 15" parameter_group_name: "default.postgres15" allocated_storage: 20 max_allocated_storage: 1000 storage_type: "gp2" storage_encrypted: true instance_class: "db.t4g.small" multi_az: true backup_retention_period: 5 database: must have International Components for Unicode (ICU) support databases required: `automationcontroller`, `automationhub`, `automationeda`, `metrics_service` Note: Minimum external database requirements The external database must meet these minimum requirements: 4 vCPUs 16 GB RAM max_connections: 1024 (minimum). You might need more connections when scaling replicas. 200 GB storage on a volume capable of at least 3000 IOPS. Database storage consumption depends on your workload, including job frequency, playbook task count, output verbosity, and the number of managed hosts per job. Start with a 200 GB baseline and monitor actual usage after deployment. Configure automated cleanup jobs to prevent unbounded database growth. These requirements ensure adequate database performance for the enterprise topology workload profile.
Metrics service database	Database: `metrics_service` User: metrics database user with CREATEDB role Storage: 40 GB minimum (plan for 100 GB with data growth) Connections: 100 connections minimum Read-only access to `automationcontroller` database: User: `ms_awx_readonly` with SELECT on all tables in public schema Requires ALTER DEFAULT PRIVILEGES for future tables
AWS Memcached Service	engine: "redis" engine_version: "6.2" auto_minor_version_upgrade: "false" node_type: "cache.t3.micro" parameter_group_name: "default.redis6.x.cluster.on" transit_encryption_enabled: "true" num_node_groups: 2 replicas_per_node_group: 1 automatic_failover_enabled: true
s3 storage	HTTPS only accessible through AWS Role assigned to automation hub SA at runtime by using AWS Pod Identity
IP version	IPv4, IPv6 (single-stack and dual-stack)

Note:

Minimum external database requirements

The external database must meet these minimum requirements:

4 vCPUs
16 GB RAM
max_connections: 1024 (minimum). You might need more connections when scaling replicas.
200 GB storage on a volume capable of at least 3000 IOPS.
Support for 4 separate databases: automationcontroller, automationhub, automationeda, metrics_service
Cross-database permissions: metrics_service database requires ms_awx_readonly user with SELECT privileges on automationcontroller database

Database storage consumption depends on your workload, including job frequency, playbook task count, output verbosity, and the number of managed hosts per job. Start with a 200 GB baseline and monitor actual usage after deployment. Configure automated cleanup jobs to prevent unbounded database growth. These requirements ensure adequate database performance for the enterprise topology workload profile.

Example custom resource file

For example CR files, see the Content from github.com is not included.ocp-b.env-a directory in the test-topologies GitHub repository.

The following example shows an AnsibleAutomationPlatform custom resource configured for enterprise topology with external databases.

apiVersion: aap.ansible.com/v1alpha1
kind: AnsibleAutomationPlatform
metadata:
  name: aap
  namespace: aap
spec:
  controller:
    postgres_configuration_secret: <controller-db-secret>

  hub:
    storage_type: s3
    object_storage_s3_secret: <s3-secret>

  eda:
    automation_server_ssl_verify: "no"

  metrics:
    database:
      database_secret: aap-metrics-postgres-configuration
      externally_managed: true
    ms_awx_readonly_user_secret: aap-metrics-read-token
    ms_awx_readonly_user:
      externally_managed: true

Nonfunctional requirements

Ansible Automation Platform’s performance characteristics and capacity depend on its resource allocation and configuration. With OpenShift, each Ansible Automation Platform component deploys as a pod. You can specify resource requests and limits for each pod.

Use the Ansible Automation Platform custom resource to configure resource allocation for OpenShift installations. Each configurable item has default settings. These settings are the exact configuration used in this reference deployment architecture. This configuration assumes deployment and management by an Enterprise IT organization for production purposes.

By default, each component’s deployments use minimum resource requests but no resource limits. OpenShift only schedules pods with available resource requests. However, pods can consume unlimited RAM or CPU as long as the OpenShift worker node is not under node pressure.

In the Operator enterprise topology, Ansible Automation Platform runs on a Red Hat OpenShift on AWS (ROSA) Hosted Control Plane (HCP) cluster. The cluster has 2 t3.xlarge worker nodes spread across 2 AWS availability zones within a single region. This is not a shared environment so Ansible Automation Platform pods have full access to all compute resources of the ROSA HCP cluster.

The capacity calculation for automation controller task pods comes from the underlying HCP worker node running the pod. It does not have access to the CPU or memory resources of the entire node. This capacity calculation influences how many concurrent jobs automation controller can run.

OpenShift manages storage distinctly from VMs. This impacts how automation hub stores its artifacts. In the Operator enterprise topology, automation hub uses S3 storage. automation hub requires ReadWriteMany type storage, which is not a default storage type in OpenShift.

This topology specifies externally provided Redis, PostgreSQL, and object storage for automation hub. This provides additional scalability and reliability features for the Ansible Automation Platform deployment. These features include specialized backup, restore, and replication services, as well as scalable storage.

Metrics service resource allocation

Note:

Resource allocation can be configured in the AnsibleAutomationPlatform custom resource. If using external databases, configure database secrets before setting resource allocation.

In enterprise topology, metrics service runs as 3 pods with the following resource recommendations:

Table 4. Metrics service resource allocation
Pod	CPU Request	Memory Request	CPU Limit	Memory Limit	Replicas
metrics-web	500m	2 Gi	1000m	4 Gi	1-2
metrics-tasks	500m	2 Gi	1000m	4 Gi	1
metrics-scheduler	500m	2 Gi	1000m	4 Gi	1 (must not scale)

Scaling considerations:

metrics-web pod: Can be scaled to 2 replicas for high availability and load distribution
metrics-tasks pod: Can not be scaled past 1 replica
metrics-scheduler pod: Must remain at 1 replica to prevent duplicate scheduled tasks

Configure pod resource requests and limits in the AnsibleAutomationPlatform CR:

spec:
  metrics:
    disabled: false
    web:
      replicas: 2
      resource_requirements:
        requests:
          cpu: 500m
          memory: 2Gi
        limits:
          cpu: 1000m
          memory: 4Gi
    task:
      replicas: 2
      resource_requirements:
        requests:
          cpu: 500m
          memory: 2Gi
        limits:
          cpu: 1000m
          memory: 4Gi
    scheduler:
      replicas: 1  # Must be 1
      resource_requirements:
        requests:
          cpu: 100m
          memory: 512Mi
        limits:
          cpu: 200m
          memory: 1Gi

Note:

For enterprise deployments, configure pod anti-affinity to spread metrics service pods across different worker nodes:

spec:
  metrics:
    web:
      topology_spread_constraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: PreferNoSchedule

Metrics service automatic provisioning

When you create an AnsibleAutomationPlatform custom resource with metrics service enabled, the operator automatically provisions:

1. MetricsService custom resource

Defines the metrics service deployment (web, tasks, and scheduler pods)
Configures database connection secrets
Sets resource limits and replicas

2. Database configuration

Reads customer-provided database secrets (external database scenario) or creates managed database credentials
Creates Kubernetes Secrets for both database connections:
- <instance>-automationmetricsservice-postgres-configuration - metrics service database
- <instance>-automationmetricsservice-awx-postgres-configuration - automation controller read-only credentials
Database connectivity is verified at pod start time by an init container in the web pod, which polls until manage.py check --database default succeeds. The AWX read-only connection is validated at application runtime, not during operator reconciliation.

3. Service routing

Creates a Kubernetes Service (<instance>-automationmetricsservice-service) on port 8000, targeting the web pod on port 8080
Registers the metrics service with the platform gateway (Envoy) at /api/metrics/, making the API accessible through the standard Ansible Automation Platform gateway URL

4. Backup integration

Backup resources are not created automatically during provisioning. They are created on-demand when you trigger a backup by applying an AnsibleAutomationPlatformBackup custom resource. The operator then creates a MetricsServiceBackup CR, which provisions a PersistentVolumeClaim for backup staging and runs a pg_dump of the metrics database.

Validation

After operator reconciliation completes, verify metrics service provisioning:

# Check MetricsService CR status
oc get metricsservice -n <namespace>

# Verify all 3 pods are running
oc get pods -n <namespace> | grep automationmetricsservice

Expected output:

<aap-name>-automationmetricsservice-web-xxxxx         1/1  Running
<aap-name>-automationmetricsservice-tasks-xxxxx       1/1  Running
<aap-name>-automationmetricsservice-scheduler-xxxxx   1/1  Running

# Check web pod init container logs for database readiness
oc logs <aap-name>-automationmetricsservice-web-xxxxx -c wait-for-db -n <namespace>
# Should show: "Database is ready"

# Verify the service exists and is on the correct port
oc get svc -n <namespace> | grep automationmetricsservice
# Expected: <aap-name>-automationmetricsservice-service   ClusterIP   ...   8000/TCP

Network ports

Red Hat Ansible Automation Platform uses several ports to communicate with its services. These ports must be open and available for Red Hat Ansible Automation Platform to work. Ensure that these ports are available and are not blocked by a firewall.

Table 5. Network ports and protocols
Port number	Protocol	Service	Source	Destination
80/443	HTTP/HTTPS	Object storage	OpenShift Container Platform cluster	External object storage service
80/443	HTTP/HTTPS	Receptor	Execution node	OpenShift Container Platform ingress
80/443	HTTP/HTTPS	Receptor	Hop node	OpenShift Container Platform ingress
5432	TCP	PostgreSQL	OpenShift Container Platform cluster	External database service
5432	TCP	PostgreSQL	OpenShift Container Platform cluster	External database service (`metrics_service` database)
5432	TCP	PostgreSQL	OpenShift Container Platform cluster	External database service (`automationcontroller` database - read-only for metrics service)
6379	TCP	Redis	OpenShift Container Platform cluster	External Redis service
27199	TCP	Receptor	OpenShift Container Platform cluster	Execution node
27199	TCP	Receptor	OpenShift Container Platform cluster	Hop node

Note:

Metrics service pods communicate internally within the OpenShift cluster via the platform gateway. The /api/metrics/ path is routed through the standard Ansible Automation Platform gateway and does not require a separate external ingress. Metrics service requires outbound connectivity on port 5432 to both the metrics_service database and the automationcontroller database (read-only).