How to update SSH keys after installation in OpenShift 4?


Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.x

Issue

  • How to update SSH keys for master or worker machines?
  • How to configure SSH keys post-installation if the cluster was installed without SSH keys?

Resolution

To update the SSH keys of the cluster, it is necessary to modify (or create) the proper MachineConfig objects on the cluster. By default, those are 99-worker-ssh for workers and 99-master-ssh for masters, and they look similar to this:

  • Example MachineConfig (with 2 SSH Keys added) for workers:
# cat 99-worker-ssh.yml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-ssh
spec:
  config:
    ignition:
      version: 3.2.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - ssh-rsa XXXXXXX.....
        - ssh-rsa YYYYYYY.....
  extensions: null
  fips: false
  kernelArguments: null
  kernelType: ""
  osImageURL: ""
  • Example MachineConfig (with 2 SSH Keys added) for master nodes:
# cat 99-master-ssh.yml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-ssh
spec:
  config:
    ignition:
      version: 3.2.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - ssh-rsa XXXXXXX.....
        - ssh-rsa YYYYYYY.....
  extensions: null
  fips: false
  kernelArguments: null
  kernelType: ""
  osImageURL: ""
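
The worker and master examples above differ only in the role label and the object name, so the file can also be generated rather than edited by hand. The following is a minimal sketch under that assumption. render_ssh_mc is a hypothetical helper name, not part of oc. It assumes each public key file holds one key per line and that key comments contain no YAML-special sequences such as " #" or ": ". It also leaves out the empty fields (extensions, fips, kernelArguments, kernelType, osImageURL), assuming the cluster fills in their defaults, and it reuses the Ignition version from the examples, which may need adjusting for your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a MachineConfig shaped like the examples above.
# Usage (file names are placeholders):
#   render_ssh_mc worker old_key.pub new_key.pub > 99-worker-ssh.yml
render_ssh_mc() {
  local role="$1"; shift
  cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: ${role}
  name: 99-${role}-ssh
spec:
  config:
    ignition:
      version: 3.2.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
EOF
  local file line
  for file in "$@"; do
    # Emit one list element per non-empty, non-comment line of each key file.
    while IFS= read -r line || [ -n "$line" ]; do
      case "$line" in
        ''|'#'*) continue ;;
      esac
      printf '        - %s\n' "$line"
    done < "$file"
  done
}
```

Passing both the old and the new public key files keeps the existing key valid, because the generated array replaces the previous one as a whole.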

Notes:

  • The sshAuthorizedKeys array must contain every valid SSH public key, one per element (be careful with the YAML syntax). To add a new SSH key while keeping the current one, include both the old and the new key in the new MachineConfig.
  • Do not change the name field of the user. The only user currently supported is core, as shown in the example configs above.
  • Note that, as with any MCO change, the Machine Config Operator applies the change to every node in the affected MachineConfigPool.

Steps:

  • If the cluster was installed with SSH keys, download the current SSH MachineConfig objects:

    • For workers: oc get mc 99-worker-ssh -o yaml > 99-worker-ssh.yml
    • For masters: oc get mc 99-master-ssh -o yaml > 99-master-ssh.yml
  • Edit the downloaded file(s) so that they contain the desired keys, as in the examples. If the cluster was installed without SSH keys, create the files from scratch following the examples.

  • Once changes are done, apply the files:

    • For workers:
    oc apply -f 99-worker-ssh.yml
    
    • For masters:
    oc apply -f 99-master-ssh.yml
    
  • The Machine Config Operator should start applying the changes shortly. Run oc get mcp to see which MachineConfigPools are being updated, and check which node is being updated by following this solution.

  • For more insight into how the change is applied on a single node, check the machine-config-daemon logs:

    oc -n openshift-machine-config-operator logs -c machine-config-daemon $(oc -n openshift-machine-config-operator get pod -l k8s-app=machine-config-daemon --field-selector spec.nodeName="${NODE}" -o name) -f
    

    Replace ${NODE} with the name of the node. Omit -f if the logs do not need to be followed.
    If the update was applied successfully, log lines similar to the following are expected:

    I0111 19:59:07.360110    7993 update.go:258] SSH Keys reconcilable
    ...
    I0111 19:59:07.371253    7993 update.go:569] Writing SSHKeys at "/home/core/.ssh"
    ...
    I0111 19:59:07.372208    7993 update.go:613] machine-config-daemon initiating reboot: Node will reboot into config worker-96b48815fa067f651fa50541ea6a9b5d
    
  • Once done, if your cluster version is lower than 4.12, check if you are impacted by this known issue.
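
The download, apply, and observe steps above can be strung together roughly as follows. This is only a sketch: apply_ssh_mc is a hypothetical function name, the .orig backup suffix and the 30-minute timeout are arbitrary choices, and waiting on the Updated condition immediately after oc apply assumes the pool has already noticed the change. If the pool still reports its previous state, the wait may return early, so confirm the result with oc get mcp.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual steps: back up, apply, wait.
# Usage: apply_ssh_mc ROLE FILE   (for example: apply_ssh_mc worker 99-worker-ssh.yml)
apply_ssh_mc() {
  local role="$1" file="$2"
  local backup="${file}.orig"

  # Save the object currently on the cluster before overwriting it.
  if ! oc get mc "99-${role}-ssh" -o yaml > "$backup" 2>/dev/null; then
    # No existing object (or it could not be read): discard the empty file.
    rm -f "$backup"
  fi

  # Create or update the MachineConfig from the edited file.
  oc apply -f "$file" || return 1

  # Block until the pool reports the Updated condition or the timeout expires.
  oc wait "mcp/${role}" --for=condition=Updated --timeout=30m
}
```

Keeping the previous object in a separate file makes it possible to re-apply the old keys if the new ones turn out to be wrong.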

Root Cause

  • By default, RHCOS contains a single user named core (derived in spirit from CoreOS Container Linux) with optional SSH keys specified at install time.

  • If SSH keys are specified at install time, they are propagated to some default MachineConfig objects that ensure they are configured in the cluster machines.

  • If the right MachineConfig object is edited (or created), the SSH keys are updated for all members of the MachineConfigPool to which the MachineConfig belongs (e.g. all worker machines, all master machines...).

  • For simplicity, this solution works with the default MachineConfig objects that contain the SSH keys for workers and masters as generated by the installer. Using custom names instead is possible, but it is more complex and offers no advantage.

Diagnostic Steps

An attempt to connect to a node over SSH fails because none of the available keys is accepted, or the key is already known to have been lost.
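
To tell whether a local public key is one of the keys the cluster is configured with, the key can be compared against the MachineConfig. The sketch below is one possible way to do that, not an official procedure. list_mc_keys and mc_has_key are hypothetical helper names, the JSONPath expression assumes the object has the layout shown in the examples of this solution, and only the key type and key body are compared, so a differing comment does not matter and lines carrying options before the key type are not handled.

```shell
#!/usr/bin/env bash
# Print the authorized keys stored in the 99-<role>-ssh MachineConfig, one per line.
list_mc_keys() {
  oc get mc "99-$1-ssh" \
    -o jsonpath='{range .spec.config.passwd.users[?(@.name=="core")].sshAuthorizedKeys[*]}{@}{"\n"}{end}'
}

# Succeed if the first key in PUBKEY_FILE is configured for ROLE.
# Usage: mc_has_key ROLE PUBKEY_FILE
mc_has_key() {
  local role="$1" pubfile="$2" wanted
  # Keep only "<type> <base64 body>" and drop the trailing comment.
  wanted=$(awk 'NR == 1 { print $1, $2 }' "$pubfile")
  list_mc_keys "$role" |
    awk -v w="$wanted" '($1 " " $2) == w { found = 1 } END { exit !found }'
}
```

For example, mc_has_key worker ~/.ssh/id_ed25519.pub (the path is a placeholder) returns a non-zero exit status when the key is not configured for the worker pool, which points to updating the MachineConfig as described in the Resolution.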

