Login using ssh not working after RHOCP 4.13 upgrade
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.12 (preparing to upgrade to 4.13+)
- 4.13
- 4.14
Issue
- After upgrading from OpenShift 4.12 to 4.13, it was not possible to ssh into the nodes.
- Accessing a node using SSH fails with the following message after the OpenShift upgrade:
  $ ssh -i sshkey core@10.0.0.1
  core@10.0.0.1: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Resolution
There are different reasons that could lead to this same issue after upgrading to OpenShift 4.13:
- If the file `/etc/ssh/sshd_config` was customized, ensure that the following line is present at the beginning of the file:
  Include /etc/ssh/sshd_config.d/*.conf
  Better still, add the customization to a new file within `/etc/ssh/sshd_config.d/` and revert the customization in `/etc/ssh/sshd_config`.
- If the issue appeared after applying compliance remediations, it has been reported to Red Hat Engineering, is tracked in bug OCPBUGS-18331, and is already fixed in the OpenShift Compliance Operator by errata RHBA-2024:1830.
  - A workaround for older versions of the operator is to create an additional `MachineConfig` resource that overwrites the `sshd` configuration generated by the OpenShift Compliance Operator and adds the following line:
    ```
    Include /etc/ssh/sshd_config.d/*.conf
    ```
  - Check How to add Include statement in the /etc/ssh/sshd_config file.
  > **Note:** The created `MachineConfig` should be removed after upgrading to a version of the operator that includes the fix.
- If the `sshd` configuration is not intentionally altered and is correct, make sure the issue is not caused by the use of an RSA key, as explained in the solution Failing ssh access to nodes using RSA key after RHOCP 4 upgrade.
Recommendation for preparing the upgrade from 4.12 to 4.13+
If the `/etc/ssh/sshd_config` file was already modified, before upgrading the 4.12 cluster:
- Create an additional `MachineConfig` resource populating the mandatory directory and include file:
  - Create the `/etc/ssh/sshd_config.d` directory:
    $ mkdir /etc/ssh/sshd_config.d
  - Create one empty `*.conf` file to be found by the glob:
    $ touch /etc/ssh/sshd_config.d/empty_include.conf
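The directory and empty include file described above could be delivered through a `MachineConfig` along the following lines. This is only a sketch under stated assumptions: the function name `render_sshd_include_mc`, the resource name `99-<role>-sshd-config-d`, the Ignition version `3.2.0`, and the file mode are illustrative choices, not values taken from this solution. It relies on Ignition creating missing parent directories when it writes a file, so a single file entry should cover both the directory and the empty `*.conf`.

```shell
# Sketch: print a MachineConfig that ships an empty drop-in file under
# /etc/ssh/sshd_config.d for the given pool role (default: worker).
# The resource name, Ignition version and mode are assumptions.
render_sshd_include_mc() {
    role="${1:-worker}"
    cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-${role}-sshd-config-d
  labels:
    machineconfiguration.openshift.io/role: ${role}
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        # Writing this file also creates the parent directory if missing.
        - path: /etc/ssh/sshd_config.d/empty_include.conf
          mode: 420          # decimal for octal 0644
          overwrite: true
          contents:
            source: "data:,"  # empty file content
EOF
}

# Example (review the output before applying it to a cluster):
#   render_sshd_include_mc master > 99-master-sshd-config-d.yaml
#   oc apply -f 99-master-sshd-config-d.yaml
```

One resource per machine config pool would be needed, and applying it is expected to trigger the usual Machine Config Operator rollout on the nodes of that pool.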
Root Cause
In OpenShift 4.13 the location for ssh keys changed, as reported in the release notes. By default, the following sshd configuration is present for retrieving users' keys starting with that version:
AuthorizedKeysCommand /usr/libexec/ssh-key-dir %u
In OpenShift 4.12 the /etc/ssh/sshd_config.d directory is absent. Adding the Include line described above while that directory is missing leads to a Segmentation fault when connecting to the ssh service. An empty .conf file is mandatory, as the Segmentation fault would still occur without one.
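To check whether the `AuthorizedKeysCommand` setting mentioned above is actually in effect on a node, one option is to inspect the effective configuration that `sshd -T` prints. The helper below is a minimal sketch: the name `uses_ssh_key_dir` is hypothetical, and it assumes the dump is piped to it on standard input with one keyword per line.

```shell
# Sketch: succeed if the sshd configuration dump on stdin sets
# AuthorizedKeysCommand to /usr/libexec/ssh-key-dir.
# Keyword matching is case-insensitive, hence grep -i.
uses_ssh_key_dir() {
    grep -Eiq '^[[:space:]]*authorizedkeyscommand[[:space:]]+/usr/libexec/ssh-key-dir([[:space:]]|$)'
}

# Example (assumes access to the node through oc debug):
#   oc debug -q node/<node> -- chroot /host sshd -T | uses_ssh_key_dir \
#       && echo "ssh-key-dir lookup active" \
#       || echo "ssh-key-dir lookup NOT active"
```

If the check fails on a 4.13 node, a plausible explanation is that the drop-in files are not being read because the `Include` line is missing from `/etc/ssh/sshd_config`.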
Diagnostic Steps
Access the nodes using oc debug and confirm the line Include /etc/ssh/sshd_config.d/*.conf is not present in the sshd configuration:
$ for NODE in $(oc get nodes -o custom-columns=:metadata.name); do echo ---- $NODE ----; oc debug -q node/${NODE} -- chroot /host /bin/bash -c "grep Include /etc/ssh/sshd_config"; echo; done
---- master-0 ----
---- master-1 ----
---- master-2 ----
[...]
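Because a plain `grep Include` also matches commented-out lines, a slightly stricter check may help when reviewing the results. The following is a sketch only: the function name `has_sshd_include` is hypothetical, and it works on a local copy of the file (for example one saved from `oc debug -q node/<node> -- chroot /host cat /etc/ssh/sshd_config`).

```shell
# Sketch: succeed only if the given sshd_config file contains an active
# (not commented-out) Include line for the sshd_config.d drop-in directory.
# sshd keywords are case-insensitive, hence grep -i.
has_sshd_include() {
    grep -Eiq '^[[:space:]]*Include[[:space:]]+/etc/ssh/sshd_config\.d/\*\.conf[[:space:]]*$' "$1"
}

# Example:
#   oc debug -q node/master-0 -- chroot /host cat /etc/ssh/sshd_config > sshd_config.master-0
#   has_sshd_include sshd_config.master-0 || echo "Include line missing on master-0"
```

The check only confirms that the line exists; it does not verify that the line sits at the beginning of the file.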