Performance impact observed on AMD Zen-based systems after Red Hat Enterprise Linux upgrade due to the Speculative Return Stack Overflow (SRSO, aka INCEPTION) CVE-2023-20569 vulnerability fix

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 9
  • AMD Zen microarchitecture, generations 1-4

Issue

Upgrading from an older RHEL 8.x or 9.x kernel that lacks the SRSO (CVE-2023-20569, aka INCEPTION) fix to a newer RHEL kernel that includes it may result in a performance impact.

The impact has been observed on AMD Zen generations 1-4, that is, all processors in families 0x17 and 0x19. Older processors have not been investigated.
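
To check whether a system falls into the affected processor families, the vendor and family fields of /proc/cpuinfo can be inspected. The following is a minimal sketch, not a Red Hat tool: the function name is hypothetical, and it assumes the usual x86 /proc/cpuinfo layout in which "cpu family" is printed in decimal (0x17 = 23, 0x19 = 25).

```python
def is_family_17h_or_19h(cpuinfo_text: str) -> bool:
    """Return True if the first CPU entry is an AMD family 0x17 or 0x19 part."""
    # Entries for each logical CPU are separated by a blank line;
    # the first entry is enough for a vendor/family check.
    first_entry = cpuinfo_text.split("\n\n", 1)[0]
    fields = {}
    for line in first_entry.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    if fields.get("vendor_id") != "AuthenticAMD":
        return False
    try:
        family = int(fields.get("cpu family", ""))
    except ValueError:
        return False
    return family in (0x17, 0x19)
```

On a live system, pass the contents of /proc/cpuinfo to the function, for example `is_family_17h_or_19h(open("/proc/cpuinfo").read())`.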

Resolution

Red Hat Enterprise Linux systems that boot an updated kernel require no additional configuration: the fixes are applied by default. If the fixes must be disabled, boot the kernel with the following kernel command-line option:

spec_rstack_overflow=off
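
To confirm whether the running kernel was actually booted with this option, the command line recorded in /proc/cmdline can be inspected. The sketch below is illustrative only; the function name is hypothetical, and it assumes that if the option appears more than once, the last occurrence is the one that takes effect.

```python
def srso_cmdline_setting(cmdline: str):
    """Return the value given to spec_rstack_overflow=, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("spec_rstack_overflow="):
            # Assumption: a later occurrence overrides an earlier one.
            value = token.split("=", 1)[1]
    return value
```

On a live system, call it as `srso_cmdline_setting(open("/proc/cmdline").read())`. A result of `"off"` means the option above was passed at boot; `None` means the kernel default is in use.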

For more details, please see:
Red Hat announcement
Upstream Kernel Documentation (docs.kernel.org)

Root Cause

Red Hat Enterprise Linux follows the upstream kernel in mitigating the SRSO (CVE-2023-20569) security vulnerability by default. Customers are strongly advised to weigh the repercussions of disabling the fixes against their internal security policies before doing so.

The performance impact of the SRSO fixes mainly affects workloads that spend a substantial amount of time in kernel space. Red Hat has measured significant performance drops or increased CPU utilization, especially for network workloads with small message sizes. File system performance also dropped in our testing. On the other hand, applications that spend all their time in user space, such as HPC workloads, showed no effect. The impact will vary with the specific workload and scenario.

It is very hard to predict or quantify the performance impact of the SRSO fixes in advance. The more time an application spends in kernel space, the more significant the impact.
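
As a rough way to see how much time a system spends in kernel space, two snapshots of the aggregate "cpu" line in /proc/stat can be compared. The following is only a sketch under stated assumptions: the function names are hypothetical, the figure is system-wide rather than per application, and counting the system, irq, and softirq fields as "kernel time" is a simplification chosen for illustration. It does not predict the size of the slowdown.

```python
def cpu_counters(stat_text: str):
    """Return the first eight counters of the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            # Order: user, nice, system, idle, iowait, irq, softirq, steal
            return [int(x) for x in parts[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def kernel_time_share(before: str, after: str) -> float:
    """Fraction of CPU time between two snapshots spent in kernel space."""
    delta = [b - a for a, b in zip(cpu_counters(before), cpu_counters(after))]
    total = sum(delta)
    if total == 0:
        return 0.0
    kernel = delta[2] + delta[5] + delta[6]  # system + irq + softirq
    return kernel / total
```

To use it, read /proc/stat once before and once after running the workload and pass both strings to `kernel_time_share`. A higher share suggests the workload is more exposed to the effect described above.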

Diagnostic Steps

Mitigation introduced in:
RHEL 8: kernel-4.18.0-513.11.1.el8_9
RHEL 9: kernel-5.14.0-362.13.1.el9_3
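
A running kernel release (as printed by `uname -r`) can be compared against the builds listed above. The sketch below is a simplified numeric comparison, not the RPM version-comparison algorithm; the names are hypothetical, and it does not account for fixes that may have been backported to other update streams. Treat a `False` result as a hint, and confirm using the file described in the following step.

```python
import re

# Builds listed above as the ones that introduced the mitigation.
LISTED_BUILDS = {8: "4.18.0-513.11.1.el8_9", 9: "5.14.0-362.13.1.el9_3"}


def _numeric_key(release: str):
    """Turn '4.18.0-513.11.1.el8_9...' into (4, 18, 0, 513, 11, 1)."""
    return tuple(int(n) for n in re.findall(r"\d+", release.split(".el")[0]))


def at_or_after_listed_build(release: str):
    """True/False for RHEL 8 and 9 release strings, None for anything else."""
    match = re.search(r"\.el(\d+)", release)
    if not match or int(match.group(1)) not in LISTED_BUILDS:
        return None
    listed = LISTED_BUILDS[int(match.group(1))]
    return _numeric_key(release) >= _numeric_key(listed)
```

On a live system, `platform.release()` from the Python standard library returns the same string as `uname -r` and can be passed directly to `at_or_after_listed_build`.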

Examine the following file:

/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow

If, on an AMD Zen[1-4] system, the file reports any of the following:

safe RET
safe RET no microcode
IBPB

and your previous kernel did not have that file, then the mitigation could be the reason for a performance slowdown.
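
The check above can be scripted. The following is a minimal sketch, assuming the file content contains one of the strings listed above somewhere in its text (matched case-insensitively); the function and constant names are hypothetical. A missing file is reported as `None`, which corresponds to a kernel that predates the SRSO fixes.

```python
from pathlib import Path

SRSO_FILE = Path("/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow")

# Lowercased markers covering the three values listed above;
# "safe ret" also matches "safe RET no microcode".
ACTIVE_MARKERS = ("safe ret", "ibpb")


def read_srso_status(path: Path = SRSO_FILE):
    """Return the file content, or None if this kernel does not provide it."""
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


def srso_mitigation_reported(status) -> bool:
    """True if the status text contains one of the listed values."""
    if status is None:
        return False
    lowered = status.lower()
    return any(marker in lowered for marker in ACTIVE_MARKERS)
```

For example, `srso_mitigation_reported(read_srso_status())` returns `True` when the running kernel reports one of the listed values.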

Additional information about the SRSO (aka INCEPTION) CVE:
AMD Product Security Bulletin (www.amd.com)
BLEEPINGCOMPUTER article (www.bleepingcomputer.com)
The AMD Zen1, Zen2, Zen3, and Zen4 Server processors list (en.wikipedia.org)

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.