Autoscaling for RGW in ODF via HPA using KEDA

Installing KEDA operator

Currently, the community version of the KEDA operator is available in OperatorHub and can be installed directly. The downstream version of this operator will be available in the ODF 4.10.x release. After installing the KEDA operator, you need to create a KEDA controller. If the KEDA controller is created successfully, the following pods will be running in the keda namespace:
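Creating the controller amounts to applying a KedaController custom resource. A minimal sketch is shown below (the operator expects this resource to be named keda and created in the keda namespace; the spec field shown is a typical default):

```yaml
apiVersion: keda.sh/v1alpha1
kind: KedaController
metadata:
  name: keda
  namespace: keda
spec:
  watchNamespace: ""   # empty string: watch ScaledObjects in all namespaces
```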

oc get pods -n keda
NAME                                      READY   STATUS    RESTARTS   AGE
keda-metrics-apiserver-77c78b85c4-wjwht   1/1     Running   0          2m26s
keda-olm-operator-866fdbdb6f-vtt58        1/1     Running   0          11m
keda-operator-f76d844d7-szw6k             1/1     Running   0          2m26s

NOTE: KEDA operator version should be at least 2.6.

Configuring KEDA with Thanos Query Service

For autoscaling, KEDA supports several scalers, and Prometheus is one of them. In OpenShift, Prometheus metrics can be fetched from the Thanos Query service, which requires clients to authenticate with a bearer token. KEDA therefore connects to the Thanos Query service using bearer token authentication.

  1. Define a role with the required read permissions using the following:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: thanos-metrics-reader
      namespace: openshift-storage
    rules:
    - apiGroups:
      - ""
      resources:
      - pods
      verbs:
      - get
    - apiGroups:
      - metrics.k8s.io
      resources:
      - pods
      - nodes
      verbs:
      - get
      - list
      - watch
    
  2. Bind a Service Account (new or existing) in the openshift-storage namespace to this role using the following command:

oc adm policy add-role-to-user thanos-metrics-reader -z <SERVICE_ACCOUNT> --role-namespace=openshift-storage -n openshift-storage

  3. Fetch the name of the Service Account's secret, which contains the token used for authentication:

SECRET=$(oc get secret -n openshift-storage | grep <SERVICE_ACCOUNT>-token | head -n 1 | awk '{print $1}')
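The pipeline above can be illustrated offline. This sketch simulates typical `oc get secret` output for a hypothetical Service Account named thanos-keda (the secret names and ages are made up) and extracts the token secret's name the same way:

```shell
# Simulated `oc get secret -n openshift-storage` output:
secrets='NAME                        TYPE                                  DATA   AGE
thanos-keda-token-abc12     kubernetes.io/service-account-token   4      5m
thanos-keda-dockercfg-xyz   kubernetes.io/dockercfg               1      5m'

# Same filter as the real command: keep the first "<sa>-token" secret
# and print only its name (the first column).
SECRET=$(printf '%s\n' "$secrets" | grep 'thanos-keda-token' | head -n 1 | awk '{print $1}')
echo "$SECRET"   # thanos-keda-token-abc12
```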

  4. Now, create a TriggerAuthentication resource for bearer authentication using the secret name:

    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: keda-trigger-auth-prometheus
      namespace: openshift-storage
    spec:
      secretTargetRef:
      - parameter: bearerToken
        name: <SECRET>
        key: token
      - parameter: ca
        name: <SECRET>
        key: ca.crt
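
Before creating the ScaledObject, the token can optionally be sanity-checked by querying the Thanos service directly. This is a sketch: `<SERVICE_ACCOUNT>` is a placeholder, and `oc sa get-token` is assumed to be available in this OpenShift version (it was deprecated in later releases in favor of `oc create token`):

```shell
# Fetch the Service Account token and run a query manually against the
# tenancy-aware Thanos endpoint (the namespace parameter is required there).
TOKEN=$(oc sa get-token <SERVICE_ACCOUNT> -n openshift-storage)
curl -k -H "Authorization: Bearer $TOKEN" \
  "https://thanos-querier.openshift-monitoring.svc.cluster.local:9092/api/v1/query?query=ceph_rgw_put&namespace=openshift-storage"
```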
    

Creating ScaledObject for autoscaling RGW

The last step is to create a ScaledObject for RGW. Autoscaling via the HPA feature of Kubernetes triggers scaling based on custom metrics; in this use case, the Prometheus metrics related to RGW exported by the Ceph manager are used. The sample below uses the ceph_rgw_put metric and scales up to 5 pods:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rgw-scale-namespace
  namespace: openshift-storage
spec:
  scaleTargetRef:
    kind: Deployment
    name: rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092
        metricName: ceph_rgw_put_collector
        query: |
          sum(rate(ceph_rgw_put[2m]))
        threshold: "50"
        authModes: "bearer"
        namespace: openshift-storage
      authenticationRef:
        name: keda-trigger-auth-prometheus
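
Under the hood, KEDA feeds the query result to an HPA, which computes the desired replica count as roughly ceil(currentReplicas * metricValue / threshold), clamped between minReplicaCount and maxReplicaCount. A small sketch of that arithmetic with the threshold of 50 from the trigger above (the metric value 120 and current replica count 1 are assumed example inputs):

```shell
# HPA scaling arithmetic: desired = ceil(current * metric / threshold),
# clamped to [minReplicaCount, maxReplicaCount]. Values are illustrative.
awk 'BEGIN {
  current = 1; metric = 120; threshold = 50; min = 1; max = 5
  desired = current * metric / threshold                    # 2.4
  if (desired > int(desired)) desired = int(desired) + 1    # ceil -> 3
  if (desired < min) desired = min
  if (desired > max) desired = max
  print desired
}'
```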

NOTE: Ensure that all the resources created in the steps above (the role, role binding, TriggerAuthentication, and ScaledObject) are in the openshift-storage namespace.
