About the OpenShift 4 kubeconfig file for system:admin
Environment
- Red Hat OpenShift Container Platform 4
Issue
- Where is the system:admin kubeconfig file?
- How do you recover when an Identity Provider is misconfigured, oauth is down, and users are unable to log in?
- All masters were made unschedulable. Now all users are unable to log in because the oauth pods cannot be scheduled.
- How can an RHOCP cluster be accessed if the ingress controllers are down and oauth is not accessible?
- The oauth pod is unresponsive or not reachable.
- Normal users get Unable to connect to the server: EOF at oauth.
Resolution
There are several methods to get a special kubeconfig for user system:admin (which has cluster-admin privileges) that uses client certificates instead of oauth tokens.
Admin kubeconfig generated by the installer
The system:admin user kubeconfig is available in the installation directory on the bastion host from where the installation was performed:
$INSTALL_DIR/auth/kubeconfig
It can be used to operate the cluster as follows:
oc --kubeconfig=$INSTALL_DIR/auth/kubeconfig get nodes
Or exporting the path to the KUBECONFIG env var:
export KUBECONFIG=$INSTALL_DIR/auth/kubeconfig
oc whoami
oc get nodes
Important: Don't execute oc login when using this kubeconfig file, as it will switch you from the special system:admin user (which uses client certificates) to token-based authentication with the credentials specified.
When using the kubeconfig from the installation directory, always ensure that the current context is set properly. Use the following command and verify that the current context is admin (which represents the kubeconfig file). Below is example output with the admin kubeconfig in use:
$ oc config get-contexts
CURRENT   NAME                                         CLUSTER                       AUTHINFO                            NAMESPACE
*         admin                                        sharedocp415                  admin
          default/api-example-redhat-com:6443/admin    api-example-redhat-com:6443   admin/api-example-redhat-com:6443   default
Additionally, check it with the whoami command and ensure the output shows system:admin as follows:
$ oc whoami
system:admin
If the current context is not pointing to the admin kubeconfig as mentioned above, switch back using the following command:
$ oc config use-context admin
If this admin kubeconfig gets damaged but the installation directory is still available, it can be recovered from .openshift_install_state.json by following this solution.
Alternatively, the admin kubeconfig can be regenerated using the solution How to regenerate the admin kubeconfig file in OpenShift 4.
Admin kubeconfigs at control planes
If the files in the installation directory are no longer available, the kubeconfig file can be retrieved from any of the master nodes by following this procedure (available for OpenShift 4.6 or higher):
The directory /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/ in all master nodes contains different system:admin kubeconfigs, which can be used to access different API endpoints:
- lb-ext.kubeconfig: It points to the external API load balancer (api.<cluster-domain>).
- lb-int.kubeconfig: It points to the internal API load balancer (api-int.<cluster-domain>).
- localhost.kubeconfig: It points to localhost. This one is useful in case of problems with the load balancers, for example.
- localhost-recovery.kubeconfig: It points to localhost but sends the localhost-recovery SNI name, which in turn causes the special server certificates used by the auto-recovery logic to be used. This one may be useful if the kube-apiserver serving certificates are expired and the auto-recovery logic has failed.
To access these config files, ssh to any master node, become root, and use scp to copy them to another host.
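For example, one of these kubeconfigs can be copied out and used from another host. A minimal sketch (the master hostname is a placeholder; the files are root-owned, hence the sudo):

```shell
# Master hostname is an example; pick a real one from "oc get nodes"
ssh core@master-0.example.com \
  'sudo cat /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/lb-ext.kubeconfig' \
  > lb-ext.kubeconfig

# Use it like any other kubeconfig
oc --kubeconfig=lb-ext.kubeconfig whoami
```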
In addition, it's also possible to extract these kubeconfigs, provided you can still authenticate to the cluster:
oc -n openshift-kube-apiserver extract secret/node-kubeconfigs
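For instance, the secret can be extracted to a local directory and one of the kubeconfigs used directly (the target directory here is an arbitrary choice):

```shell
# Extract all node kubeconfigs from the secret into a local directory
mkdir -p /tmp/node-kubeconfigs
oc -n openshift-kube-apiserver extract secret/node-kubeconfigs --to=/tmp/node-kubeconfigs

# The lb-ext variant points at the external API load balancer
oc --kubeconfig=/tmp/node-kubeconfigs/lb-ext.kubeconfig whoami
```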
Note: the localhost kubeconfig files will not work if used on hosts other than the master nodes.
Provide your own client CA
It is possible to add custom client CAs to the kube-apiserver, so that it will accept any client certificate signed by them.
The additional user CAs can be configured as per this procedure. Note that it must be run when the cluster is healthy.
Once done, create and sign a client certificate for O=system:masters,CN=system:admin and it will be accepted by the kube-apiserver.
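As a sketch, such a certificate could be created with openssl as follows. The throwaway CA generated here stands in for the custom client CA configured above, and all file names are examples:

```shell
# Throwaway CA for illustration only; in practice, use the custom client CA
# that was added to the kube-apiserver in the previous step
openssl genrsa -out custom-ca.key 2048
openssl req -x509 -new -key custom-ca.key -subj "/CN=custom-ca" -days 1 -out custom-ca.crt

# Client key and CSR with the subject that maps to cluster-admin
openssl genrsa -out admin.key 2048
openssl req -new -key admin.key -subj "/O=system:masters/CN=system:admin" -out admin.csr

# Sign the CSR with the client CA
openssl x509 -req -in admin.csr -CA custom-ca.crt -CAkey custom-ca.key \
  -CAcreateserial -days 30 -out admin.crt
```

The resulting pair can then be embedded into a kubeconfig with oc config set-credentials --client-certificate=admin.crt --client-key=admin.key --embed-certs=true.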
Root Cause
The system:admin user is assigned the cluster-admin cluster role by the default policy. It uses TLS client certificate authentication, so it works when the oauth pods are down, inaccessible, or experiencing problems.
As described in this solution, there are several ways to have a kubeconfig for the system:admin user that uses TLS client certificate.
Regarding the installer-generated admin kubeconfig, it is strongly advised to back it up along with the entire $INSTALL_DIR. This kubeconfig can be a last resort to access the OpenShift cluster as an admin, and has the advantage of a very long validity (10 years), as well as not requiring any preparation step to be usable (it is usable right from install).
In addition, in OpenShift 4.6 and higher, the Cluster Kube-Apiserver Operator creates certificate-based system:admin kubeconfigs and delivers them to all master nodes, so that the masters always have valid cluster-admin kubeconfigs for the different API endpoints. This mimics the old RHOCP3 behavior of having system:admin kubeconfigs for root users at the masters, but with a bit more granularity. The certificates used by these kubeconfig files are created with 120 days validity and refreshed every 30 days, so that the certificates remain valid for at least 60 days even if one of the refreshes fails.
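The validity window of the client certificate embedded in one of these kubeconfigs can be checked by decoding it with openssl. A sketch, run as root on a master node (the kubeconfig path is an example; any of the node kubeconfigs works):

```shell
# Extract the embedded client certificate and print its subject and validity dates
oc config view --raw \
  --kubeconfig=/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/lb-ext.kubeconfig \
  -o jsonpath='{.users[0].user.client-certificate-data}' \
  | base64 -d | openssl x509 -noout -subject -dates
```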
Diagnostic Steps
Just for the curious, the admin kubeconfigs at the masters are created as follows:
- openshift-kube-apiserver-operator creates and maintains a node-system-admin-signer internal CA (at secret node-system-admin-signer of the openshift-kube-apiserver-operator namespace).
- This CA is added to the acceptable client certificate CAs (at configmap client-ca of the openshift-kube-apiserver namespace).
- openshift-kube-apiserver-operator uses this CA to create and maintain (as per the indicated rotation period) a system:admin client certificate (at secret node-system-admin-client of the openshift-kube-apiserver-operator namespace).
- Then, openshift-kube-apiserver-operator creates and maintains a secret with the admin kubeconfigs (the node-kubeconfigs secret of the openshift-kube-apiserver namespace, referred to in the extract command above).
- Finally, the contents of this secret are downloaded as static pod resources when the kube-apiserver static pods are rolled out. This is what finally places them in their /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/node-kubeconfigs/ location.
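The objects mentioned in the steps above can be inspected read-only to confirm they exist (do not modify them; all names come from the description above):

```shell
# Signer CA and the client certificate derived from it
oc -n openshift-kube-apiserver-operator get secret node-system-admin-signer node-system-admin-client

# Accepted client CAs and the generated kubeconfigs
oc -n openshift-kube-apiserver get configmap client-ca
oc -n openshift-kube-apiserver get secret node-kubeconfigs
```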
Also for the curious, the way the acceptable client CA bundle is built can be seen in the code of cluster-kube-apiserver-operator, at the ManageClientCABundle method of the target config controller. The comments there describe what each CA is.
And a last curiosity about the installer admin kubeconfig: the CA it uses is stored at admin-kubeconfig-client-ca, and its private key is never stored in the cluster (only in the .openshift_install_state.json file of the installation directory). If strictly needed, it can be replaced as per this solution, but this procedure should not be followed unless there is no alternative.
Note that all these processes are automated and these descriptions are provided only for your knowledge (and curiosity). They are deep internals of the cluster operators and are subject to heavy change. Hence, manually modifying these secrets or configmaps, or any other similar interference, may result in unsupported scenarios.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.