How to update SSH keys after installation in OpenShift 4?
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.x
Issue
- How to update SSH keys for master or worker machines?
- How to configure SSH keys post-installation if the cluster was installed without SSH keys?
Resolution
To update the SSH keys of the cluster, it is necessary to modify (or create) the proper MachineConfig objects on the cluster. By default, those are 99-worker-ssh for workers and 99-master-ssh for masters, and they look similar to this:
- Example MachineConfig (with 2 SSH Keys added) for workers:
# cat 99-worker-ssh.yml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-ssh
spec:
  config:
    ignition:
      version: 3.2.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - ssh-rsa XXXXXXX.....
        - ssh-rsa YYYYYYY.....
  extensions: null
  fips: false
  kernelArguments: null
  kernelType: ""
  osImageURL: ""
- Example MachineConfig (with 2 SSH Keys added) for master nodes:
# cat 99-master-ssh.yml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-ssh
spec:
  config:
    ignition:
      version: 3.2.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - ssh-rsa XXXXXXX.....
        - ssh-rsa YYYYYYY.....
  extensions: null
  fips: false
  kernelArguments: null
  kernelType: ""
  osImageURL: ""
Notes:
- The sshAuthorizedKeys array must contain all the valid SSH public keys, one per element (be careful with YAML syntax). This means that, to add an additional SSH key and retain the current one, both the old and the new SSH keys must be listed in the new MachineConfig.
- Do not update the name field of the user. The only user currently supported is core, as shown in the example configs above.
- As with any MCO change, the Machine Config Operator will apply the change, which implies:
  - Up to OpenShift Container Platform 4.6, all the nodes will drain and reboot one by one (as per the maxUnavailable setting on the MachineConfigPool).
  - Starting from OpenShift Container Platform 4.7, this change does not trigger a drain and reboot, because a change in the Machine Config Operator improved the update process to avoid it.
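The rule that the array must list every key, old and new, can be sketched as a small shell helper. This is only an illustration: render_ssh_mc is a hypothetical function name, the Ignition version is copied from the examples above and may need to match your cluster, and the defaulted fields (extensions, fips, and so on) are left out on the assumption that they are optional.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of oc): print an SSH MachineConfig for one role.
# Usage: render_ssh_mc <role> <public-key>...
render_ssh_mc() {
  local role="$1"
  shift
  # Header mirrors the example objects; the Ignition version is copied from them.
  cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: ${role}
  name: 99-${role}-ssh
spec:
  config:
    ignition:
      version: 3.2.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
EOF
  # Emit one list element per key, so every key passed in is retained.
  local key
  for key in "$@"; do
    printf '        - %s\n' "$key"
  done
}
```

For example, with hypothetical key files old_key.pub and new_key.pub, render_ssh_mc worker "$(cat old_key.pub)" "$(cat new_key.pub)" > 99-worker-ssh.yml would produce a file listing both keys. Review the result before applying it, since a key comment containing YAML-special sequences would need quoting.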
Steps:
- If the cluster was installed with SSH keys, download the current SSH MachineConfig objects:
  - For workers:
    oc get mc 99-worker-ssh -o yaml > 99-worker-ssh.yml
  - For masters:
    oc get mc 99-master-ssh -o yaml > 99-master-ssh.yml
- Edit the downloaded file(s) with the desired keys, as in the examples. If the cluster wasn't installed with SSH keys, create the files as in the examples.
- Once the changes are done, apply the files:
  - For workers:
    oc apply -f 99-worker-ssh.yml
  - For masters:
    oc apply -f 99-master-ssh.yml
- The Machine Config Operator should start applying the changes shortly. Observe which MachineConfigPools are being updated with oc get mcp and check more details on which node is being updated by following this solution.
- For more insight into how the change is being applied on a single node, check the machine config daemon logs:
    oc -n openshift-machine-config-operator logs -c machine-config-daemon $(oc -n openshift-machine-config-operator get pod -l k8s-app=machine-config-daemon --field-selector spec.nodeName=${NODE} -o name) -f
  Here ${NODE} must be replaced with the name of the node. Omit -f if the logs do not need to be followed.
  If the update was successfully applied, logs similar to the lines below are expected:
    I0111 19:59:07.360110 7993 update.go:258] SSH Keys reconcilable
    ...
    I0111 19:59:07.371253 7993 update.go:569] Writing SSHKeys at "/home/core/.ssh"
    ...
    I0111 19:59:07.372208 7993 update.go:613] machine-config-daemon initiating reboot: Node will reboot into config worker-96b48815fa067f651fa50541ea6a9b5d
- Once done, if your cluster version is lower than 4.12, check if you are impacted by this known issue.
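As a complement to watching oc get mcp by hand, the wait for a pool to finish can be sketched with oc wait. The function name wait_for_pool and the 30m default timeout are assumptions made for illustration. Note also that a pool may still report Updated immediately after oc apply, before the operator has picked up the change, so it is worth confirming with oc get mcp that the update has started.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: block until a MachineConfigPool reports the Updated condition.
# Usage: wait_for_pool <pool> [timeout]
wait_for_pool() {
  local pool="$1"
  local timeout="${2:-30m}"   # assumed default; adjust to the size of the pool
  oc wait "mcp/${pool}" --for=condition=Updated --timeout="${timeout}"
}
```

For example, wait_for_pool worker 45m returns once the worker pool reports Updated, or fails when the timeout expires.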
Root Cause
- By default, RHCOS contains a single user named core (derived in spirit from CoreOS Container Linux) with optional SSH keys specified at install time.
- If SSH keys are specified at install time, they are propagated to some default MachineConfig objects that ensure they are configured on the cluster machines.
- If the right MachineConfig object is edited (or created), the SSH keys are updated for all members of the MachineConfigPool to which the MachineConfig belongs (e.g. all worker machines, all master machines...).
- For simplicity, this solution works with the default MachineConfig objects that contain the SSH keys for workers and masters as generated by the installer. It is possible to use custom names instead, but doing so is more complex and has no advantage.
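To see which keys a default MachineConfig object currently carries, a sketch along these lines may help. The function name list_ssh_keys is hypothetical, and the jsonpath expression assumes that core is the first (and only) entry under users, as in the examples in this solution.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print, one per line, the keys set in a role's SSH MachineConfig.
# Usage: list_ssh_keys <role>
list_ssh_keys() {
  local role="$1"
  # users[0] is assumed to be the core user, the only one supported.
  oc get mc "99-${role}-ssh" \
    -o jsonpath='{range .spec.config.passwd.users[0].sshAuthorizedKeys[*]}{@}{"\n"}{end}'
}
```

Running list_ssh_keys worker and list_ssh_keys master before and after the change gives a quick view of what each pool is configured with.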
Diagnostic Steps
When attempting to connect to a node using SSH, none of the available keys allows the connection, or it is already known that the key was lost.