Error "x509: certificate signed by unknown authority" when logging in to Prometheus/Grafana/Jaeger after replacing the ingress certificate and/or API certificate

Solution Verified - Updated

Environment

  • Red Hat OpenShift
    • 4
  • Red Hat Service Mesh
    • 1.0
    • 1.1

Issue

  • The Ingress and/or API certificate has been replaced, and the Prometheus, Grafana and Jaeger UIs now return an error 500 after credentials are entered.

Resolution

  • Scale down the two operators, which would otherwise revert the changes made in the following steps:

    $ oc scale --replicas 0 -n openshift-operators deployment/istio-operator
    $ oc scale --replicas 0 -n openshift-operators deployment/jaeger-operator
    
  • Create a ConfigMap in istio-system and label it so that the cluster fills it with the trusted CA bundle:

    $ oc create configmap trusted-ca-bundle -n istio-system
    $ oc label configmap/trusted-ca-bundle -n istio-system "config.openshift.io/inject-trusted-cabundle=true"
    
  • Edit the Prometheus deployment to mount the ConfigMap in both containers:

    $ oc patch deployment/prometheus -n istio-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"prometheus-proxy","volumeMounts":[{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}]},{"name":"prometheus","volumeMounts":[{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}]}],"volumes":[{"configMap":{"defaultMode":420,"items":[{"key":"ca-bundle.crt","path":"tls-ca-bundle.pem"}],"name":"trusted-ca-bundle","optional":true},"name":"trusted-ca-bundle"}]}}}}'
    
  • Edit the Grafana deployment in the same way:

    $ oc patch deployment/grafana -n istio-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"grafana-proxy","volumeMounts":[{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}]},{"name":"grafana","volumeMounts":[{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}]}],"volumes":[{"configMap":{"defaultMode":420,"items":[{"key":"ca-bundle.crt","path":"tls-ca-bundle.pem"}],"name":"trusted-ca-bundle","optional":true},"name":"trusted-ca-bundle"}]}}}}'
    
  • Edit the Jaeger deployment in the same way:

    $ oc patch deployment/jaeger -n istio-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"oauth-proxy","volumeMounts":[{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}]},{"name":"jaeger","volumeMounts":[{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}]}],"volumes":[{"configMap":{"defaultMode":420,"items":[{"key":"ca-bundle.crt","path":"tls-ca-bundle.pem"}],"name":"trusted-ca-bundle","optional":true},"name":"trusted-ca-bundle"}]}}}}'
    
  • After the last deployment finishes rolling out, check all the routes and log in using valid OpenShift credentials.
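
The three patch commands above differ only in the deployment and container names, so the patch body can be generated instead of pasted. The following is a minimal sketch, not part of the original procedure: `build_ca_patch` is a hypothetical helper name, and the `oc` calls are left as comments because they need a logged-in cluster session.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the documented procedure):
# prints the same merge patch used above for a given pair of container names.
build_ca_patch() {
    proxy_container="$1"   # for example: grafana-proxy
    app_container="$2"     # for example: grafana
    # Mount point and volume definition copied from the patches above.
    mount='{"mountPath":"/etc/pki/ca-trust/extracted/pem/","name":"trusted-ca-bundle","readOnly":true}'
    volume='{"configMap":{"defaultMode":420,"items":[{"key":"ca-bundle.crt","path":"tls-ca-bundle.pem"}],"name":"trusted-ca-bundle","optional":true},"name":"trusted-ca-bundle"}'
    printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","volumeMounts":[%s]},{"name":"%s","volumeMounts":[%s]}],"volumes":[%s]}}}}\n' \
        "$proxy_container" "$mount" "$app_container" "$mount" "$volume"
}

# Usage against a cluster (commented out so the sketch can be sourced safely):
#   oc patch deployment/grafana -n istio-system -p "$(build_ca_patch grafana-proxy grafana)"
#   oc rollout status deployment/grafana -n istio-system
```

Running `oc rollout status` after each patch is one way to confirm that the deployment has finished before testing the route.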

Note: In some recent scenarios, mainly on version 4.3.5, the operator replicas are managed by the Operator Lifecycle Manager (OLM) through each operator's ClusterServiceVersion (CSV). In that case, scale down the replicas in the CSV definition of each operator instead.
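
Scaling down through the CSV could look roughly like the sketch below. It assumes the operator's deployment is the first entry under `spec.install.spec.deployments` in the CSV, which should be checked before patching. `csv_replicas_patch` is a hypothetical helper name, and `<csv-name>` is a placeholder because CSV names depend on the installed operator version.

```shell
#!/bin/sh
# Hypothetical helper (an assumption): prints a JSON patch that sets the
# replica count of the Nth deployment listed in a ClusterServiceVersion.
# The "add" operation sets the value whether or not the field already exists.
csv_replicas_patch() {
    index="${1:-0}"      # position of the deployment in the CSV (assumed 0)
    replicas="${2:-0}"   # desired replica count
    printf '[{"op":"add","path":"/spec/install/spec/deployments/%s/spec/replicas","value":%s}]\n' \
        "$index" "$replicas"
}

# Usage against a cluster (commented out; list the CSVs first to find the names):
#   oc get csv -n openshift-operators
#   oc patch csv <csv-name> -n openshift-operators --type json -p "$(csv_replicas_patch 0 0)"
```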

Root Cause

  • The deployed pods are not aware of the new CA included in the cluster-wide proxy configuration. Unless the CA that issued the Ingress/API certificate is a well-known CA already included in the default CA bundle of the container images, the connection will fail.
  • Until a new operator version includes these mounts by default, the documented procedure can be used to inject the CA bundle.
  • IMPORTANT: This solution places your deployment in a limited support scenario, because two operators are disabled in order to persist the changes. As a result, neither automatic updates nor self-healing will occur for the stack.
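
One rough way to check whether the CA bundle has reached the ConfigMap or a pod is to count the certificates in it. The sketch below assumes that a non-zero count means the injection populated the bundle; it does not prove that the specific issuing CA is present. `count_pem_certs` is a hypothetical helper name and `<grafana-pod>` is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (an assumption): counts the PEM certificates read
# from standard input. Prints 0 when the input contains none.
count_pem_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' || true
}

# Usage against a cluster (commented out; pod name is a placeholder):
#   oc get configmap trusted-ca-bundle -n istio-system \
#       -o jsonpath='{.data.ca-bundle\.crt}' | count_pem_certs
#   oc exec -n istio-system <grafana-pod> -c grafana-proxy -- \
#       cat /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem | count_pem_certs
```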

Diagnostic Steps

  • The login fails after entering the credentials

  • In all three proxy containers the error can be seen after a failed login:

      $ oc logs -n istio-system grafana-76d6c94b9f-2trwh -c grafana-proxy
      2020/03/11 11:36:36 oauthproxy.go:645: error redeeming code (client:10.128.2.26:59268): Post https://oauth-openshift.apps.cluster.domain.tld/oauth/token: x509: certificate signed by unknown authority
      2020/03/11 11:36:36 oauthproxy.go:438: ErrorPage 500 Internal Error Internal Error
    
      $ oc logs -n istio-system prometheus-c7d8d58d4-8fl8k -c prometheus-proxy
      2020/03/11 11:39:11 oauthproxy.go:645: error redeeming code (client:10.128.2.26:32938): Post https://oauth-openshift.apps.cluster.domain.tld/oauth/token: x509: certificate signed by unknown authority
      2020/03/11 11:39:11 oauthproxy.go:438: ErrorPage 500 Internal Error Internal Error
    
      $ oc logs -n istio-system jaeger-b95c8f846-8t2qt -c oauth-proxy
      2020/03/11 11:40:45 oauthproxy.go:645: error redeeming code (client:10.128.2.26:50712): Post https://oauth-openshift.apps.cluster.domain.tld/oauth/token: x509: certificate signed by unknown authority
      2020/03/11 11:40:45 oauthproxy.go:438: ErrorPage 500 Internal Error Internal Error
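
The same check can be run without looking up pod names by addressing the deployments directly and filtering for the error string. This is a minimal sketch: `x509_errors` is a hypothetical helper name, and the deployment and container names are taken from the patch commands in the Resolution section.

```shell
#!/bin/sh
# Hypothetical helper (an assumption): keeps only the log lines that report
# the untrusted-CA failure shown above. Reads standard input.
x509_errors() {
    grep -F 'x509: certificate signed by unknown authority'
}

# Usage against a cluster (commented out so the sketch can be sourced safely):
#   oc logs -n istio-system deployment/grafana    -c grafana-proxy    | x509_errors
#   oc logs -n istio-system deployment/prometheus -c prometheus-proxy | x509_errors
#   oc logs -n istio-system deployment/jaeger     -c oauth-proxy      | x509_errors
```

No output from a proxy container after a login attempt suggests that the proxy now trusts the issuing CA.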
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.