CPU "model 79" systems hang or panic during boot following an update to the microcode_ctl package

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux (RHEL) 7
  • microcode_ctl-2.1-29.10.el7_5.x86_64
  • microcode_ctl-2.1-29.el7.x86_64
  • microcode_ctl-2.1-22.5.el7_4.x86_64
  • microcode_ctl-2.1-22.2.el7.x86_64
  • microcode_ctl-2.1-22.el7.x86_64
  • microcode_ctl-2.1-16.13.el7_3.x86_64
  • microcode_ctl-2.1-16.5.el7_3.x86_64
  • microcode_ctl-2.1-16.4.el7_3.x86_64
  • microcode_ctl-2.1-16.3.el7_3.x86_64
  • microcode_ctl-2.1-16.1.el7_3.x86_64
  • microcode_ctl-2.1-16.el7.x86_64
  • microcode_ctl-2.1-12.el7_2.2.x86_64
  • Red Hat Enterprise Linux (RHEL) 6
  • microcode_ctl-1.17-33.3.el6_10
  • microcode_ctl-1.17-25.4.el6_9
  • microcode_ctl-1.17-25.2.el6_9
  • microcode_ctl-1.17-25.el6
  • microcode_ctl-1.17-20.8.el6_7
  • microcode_ctl-1.17-19.8.el6_6
  • microcode_ctl-1.17-17.10.el6_5
  • microcode_ctl-1.17-16.8.el6_4
  • Intel model 79 CPUs. The following models have been reported to experience the issue (this list is not exhaustive):

    Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz
    Intel(R) Xeon(R) CPU E5-2643 v4 @ 3.40GHz
    Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
    Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.50GHz
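
To check whether a system has a CPU of this type, the family and model fields in /proc/cpuinfo can be inspected. The following is a minimal sketch, assuming the usual "cpu family" and "model" field names; the function name is_model79 and its optional file argument are illustrative and are not part of any package.

```shell
#!/bin/sh
# Return 0 if the cpuinfo-style file describes an Intel family 6, model 79 CPU.
# Defaults to /proc/cpuinfo; the optional argument is here for testing only.
is_model79() {
    awk -F':' '
        $1 ~ /^vendor_id[ \t]*$/  { gsub(/[ \t]/, "", $2); vendor = $2 }
        $1 ~ /^cpu family[ \t]*$/ { family = $2 + 0 }
        $1 ~ /^model[ \t]*$/      { model = $2 + 0 }
        END { exit !(vendor == "GenuineIntel" && family == 6 && model == 79) }
    ' "${1:-/proc/cpuinfo}"
}

# Example use on a live system:
#   if is_model79; then echo "family 6, model 79 CPU detected"; fi
```

The "model name" line is deliberately not matched, since only the numeric "model" field identifies the CPU for microcode purposes.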
    

Issue

  • After applying patches for CVE-2017-5754, CVE-2017-5753, and CVE-2017-5715, the system powers off during boot.

  • After upgrading to microcode_ctl-1.17-25.2, the system powers off during boot.

  • After the update, the system hangs during boot with the following messages on the console:

    microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001f
    platform microcode: firmware: requesting intel-ucode/06-4f-01
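
The sig value in the first message and the file name in the second refer to the same CPU. The following is a minimal sketch of the decoding, assuming the standard x86 CPUID signature layout (stepping in bits 3:0, model in bits 7:4, family in bits 11:8, extended model in bits 19:16, extended family in bits 27:20); the function name sig_to_ucode_name is illustrative.

```shell
#!/bin/sh
# Decode a CPUID signature into the family-model-stepping name
# used for files under intel-ucode/.
sig_to_ucode_name() {
    sig=$(( $1 ))
    stepping=$(( sig & 0xf ))
    model=$(( (sig >> 4) & 0xf ))
    family=$(( (sig >> 8) & 0xf ))
    ext_model=$(( (sig >> 16) & 0xf ))
    ext_family=$(( (sig >> 20) & 0xff ))
    # The extended model applies to family 6 and family 15 signatures.
    if [ "$family" -eq 6 ] || [ "$family" -eq 15 ]; then
        model=$(( (ext_model << 4) + model ))
    fi
    # The extended family applies to family 15 signatures only.
    if [ "$family" -eq 15 ]; then
        family=$(( family + ext_family ))
    fi
    printf '%02x-%02x-%02x\n' "$family" "$model" "$stepping"
}

sig_to_ucode_name 0x406f1   # prints 06-4f-01
```

Model 0x4f is decimal 79, which is why these systems are referred to as "model 79".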
    

Resolution

Required steps

Update the microcode_ctl package to a revision in the applicable release stream that is newer than the instances shown in the Environment section above.

Note: The above guidance avoids the system instability referred to in this article. However, it does so because updated microcode_ctl revisions no longer load the applicable microcode for model 79 CPUs. Should end users require the microcode to be loaded, and the target systems have been shown not to be susceptible to the same instability, please see the "Additional Information" section below.
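
To decide whether an update is needed, the installed build can be compared against the builds listed in the Environment section. The following is a minimal sketch that performs an exact match only. It does not detect older, unlisted builds, and the function name is_listed_build is illustrative.

```shell
#!/bin/sh
# Return 0 if the given package string is one of the builds listed in the
# Environment section of this article.
is_listed_build() {
    nvr="${1%.x86_64}"   # rpm -q usually appends the architecture
    grep -Fxq -- "$nvr" <<'EOF'
microcode_ctl-2.1-29.10.el7_5
microcode_ctl-2.1-29.el7
microcode_ctl-2.1-22.5.el7_4
microcode_ctl-2.1-22.2.el7
microcode_ctl-2.1-22.el7
microcode_ctl-2.1-16.13.el7_3
microcode_ctl-2.1-16.5.el7_3
microcode_ctl-2.1-16.4.el7_3
microcode_ctl-2.1-16.3.el7_3
microcode_ctl-2.1-16.1.el7_3
microcode_ctl-2.1-16.el7
microcode_ctl-2.1-12.el7_2.2
microcode_ctl-1.17-33.3.el6_10
microcode_ctl-1.17-25.4.el6_9
microcode_ctl-1.17-25.2.el6_9
microcode_ctl-1.17-25.el6
microcode_ctl-1.17-20.8.el6_7
microcode_ctl-1.17-19.8.el6_6
microcode_ctl-1.17-17.10.el6_5
microcode_ctl-1.17-16.8.el6_4
EOF
}

# Example use on a live system:
#   if is_listed_build "$(rpm -q microcode_ctl)"; then
#       yum update microcode_ctl
#   fi
```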


Additional Information

Because certain CPU models become unstable during a live microcode update, the microcode_ctl package no longer attempts to update the microcode for certain model 79 CPUs. It is possible to enable the microcode update as described in the readme provided with the microcode_ctl package, reproduced below:

/usr/share/microcode_ctl/ucode_with_caveats/intel-06-4f-01/readme

Intel Broadwell-EP/EX (BDX-ML B/M/R0, family 6, model 79, stepping 1) has issues
with microcode update that may lead to a system hang; while some changes
to the Linux kernel have been made in an attempt to address these issues,
they were not eliminated, so a possibility of unstable system behaviour
after a microcode update performed on a running system is still present even
on kernels that contain the aforementioned changes.  As a result, microcode update
for this CPU model has been disabled by default.

For reference, kernel versions for the respective RHEL minor versions
that contain the aforementioned changes, are listed below:
 * Upstream/RHEL 8: kernel-4.17.0 or newer;
 * RHEL 7.6 onwards: kernel-3.10.0-894 or newer;
 * RHEL 7.5.z: kernel-3.10.0-862.6.1 or newer;
 * RHEL 7.4.z: kernel-3.10.0-693.35.1 or newer;
 * RHEL 7.3.z: kernel-3.10.0-514.52.1 or newer;
 * RHEL 7.2.z: kernel-3.10.0-327.70.1 or newer.

Please contact your system vendor for a BIOS/firmware update that contains
the latest microcode version. For information regarding microcode versions
required for mitigating specific side-channel cache attacks, please refer
to the following knowledge base articles:
 * CVE-2017-5715 ("Spectre"):
   https://access.redhat.com/articles/3436091
 * CVE-2018-3639 ("Speculative Store Bypass"):
   https://access.redhat.com/articles/3540901
 * CVE-2018-3620, CVE-2018-3646 ("L1 Terminal Fault Attack"):
   https://access.redhat.com/articles/3562741
 * CVE-2018-12130, CVE-2018-12126, CVE-2018-12127, and CVE-2019-11091
   ("Microarchitectural Data Sampling"):
   https://access.redhat.com/articles/4138151

The information regarding enforcing microcode load is provided below.

For enforcing addition of this microcode to the firmware directory
for a specific kernel, where it is available for a late microcode update,
please create a file "force-late-intel-06-4f-01" inside
/lib/firmware/<kernel_version> directory and run
"/usr/libexec/microcode_ctl/update_ucode":

    touch /lib/firmware/3.10.0-862.9.1/force-late-intel-06-4f-01
    /usr/libexec/microcode_ctl/update_ucode

After that, it is possible to perform a late microcode update by executing
"/usr/libexec/microcode_ctl/reload_microcode" or by writing value "1" to
"/sys/devices/system/cpu/microcode/reload" directly.

For enforcing addition of this microcode to firmware directories for all
kernels, please create a file
"/etc/microcode_ctl/ucode_with_caveats/force-late-intel-06-4f-01"
and run "/usr/libexec/microcode_ctl/update_ucode":

    touch /etc/microcode_ctl/ucode_with_caveats/force-late-intel-06-4f-01
    /usr/libexec/microcode_ctl/update_ucode

For enforcing early load of this microcode for a specific kernel, please
create a file "force-early-intel-06-4f-01" inside
"/lib/firmware/<kernel_version>" directory and run
"dracut -f --kver <kernel_version>":

    touch /lib/firmware/3.10.0-862.9.1/force-early-intel-06-4f-01
    dracut -f --kver 3.10.0-862.9.1

For enforcing early load of this microcode for all kernels, please
create a file "/etc/microcode_ctl/ucode_with_caveats/force-early-intel-06-4f-01"
and run "dracut -f --regenerate-all":

    touch /etc/microcode_ctl/ucode_with_caveats/force-early-intel-06-4f-01
    dracut -f --regenerate-all

If you want to avoid removal of the microcode file during cleanup performed by
/usr/libexec/microcode_ctl/update_ucode, please remove the corresponding readme
file (/lib/firmware/<kernel_version>/readme-intel-06-4f-01).


Please refer to /usr/share/doc/microcode_ctl/README.caveats for additional
information.
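
After forcing a late or early load as described above, it can be useful to confirm that the running microcode revision actually changed. The following is a minimal sketch, assuming the kernel reports the revision in the "microcode" field of /proc/cpuinfo (as RHEL 7 kernels do); the function name ucode_revision and its optional file argument are illustrative.

```shell
#!/bin/sh
# Print the microcode revision of the first CPU listed in a cpuinfo-style file.
# Defaults to /proc/cpuinfo; the optional argument is here for testing only.
ucode_revision() {
    awk -F':' '
        $1 ~ /^microcode[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2; exit }
    ' "${1:-/proc/cpuinfo}"
}

# Example use around a late load on a live system (run as root):
#   before=$(ucode_revision)
#   /usr/libexec/microcode_ctl/reload_microcode
#   after=$(ucode_revision)
#   echo "microcode revision: $before -> $after"
```

If the two values are identical, the available microcode was either not newer than the running revision or was not made available to the loader.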

If enabling the microcode update process results in a system reset or hang, the system firmware must be updated before subsequent microcode_ctl updates can succeed.

Additionally, if a hang is encountered on a RHEL 7.2 or newer system, the kernel command line parameter dis_ucode_ldr can be passed to skip the microcode update process entirely. This parameter is only available on RHEL 7.
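
One way to make the parameter persistent on RHEL 7 is with grubby. The following is a minimal sketch. The grubby invocations are shown as comments because they modify the boot configuration and must be run as root, and the helper name has_dis_ucode_ldr is illustrative.

```shell
#!/bin/sh
# Add the parameter to every installed kernel entry (run as root):
#   grubby --update-kernel=ALL --args="dis_ucode_ldr"
# Remove it again once the firmware or package has been updated:
#   grubby --update-kernel=ALL --remove-args="dis_ucode_ldr"

# Return 0 if the kernel command line contains dis_ucode_ldr.
# Defaults to /proc/cmdline; the optional argument is here for testing only.
has_dis_ucode_ldr() {
    case " $(cat "${1:-/proc/cmdline}") " in
        *" dis_ucode_ldr "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use after rebooting:
#   if has_dis_ucode_ldr; then echo "microcode loader is disabled"; fi
```

For a one-time boot, the parameter can instead be appended to the kernel line from the boot loader menu.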

Please see the following articles for further details:

  • Is CPU microcode available to address CVE-2017-5715 via the microcode_ctl package?
  • Is CPU microcode available to address CVE-2018-3620 and CVE-2018-3646 via the microcode_ctl package?
  • Is CPU microcode available to address CVE-2018-3639 via the microcode_ctl package?

Root Cause

During an update of the microcode_ctl package, the provided udev rules cause the 06-4f-01 microcode to be loaded automatically on model 79 CPUs. If the udev mechanism does not result in an immediate microcode load, the load operation can take place during a subsequent boot.

In either case, for model 79 CPUs, system instability can result from microcode load operations. Of note, the microcode is loaded during each boot; however, it is only applied if the microcode available within /lib/firmware/ for the installed CPU is newer than the revision loaded during the hardware initialization phase of boot. Updating the system firmware to a revision that includes updated microcode takes effect regardless of the software installed on the system, and is recommended as a more permanent solution.
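
The "only applied if newer" rule amounts to a numeric comparison of two revisions. The following is a minimal sketch of that comparison, not the kernel's actual implementation. The function name ucode_would_apply is illustrative, 0xb00001f is the revision from the console message in the Issue section, and 0xb000021 is a hypothetical newer revision used only as sample input.

```shell
#!/bin/sh
# Return 0 if the revision available on disk (second argument) is newer
# than the revision already running on the CPU (first argument).
ucode_would_apply() {
    running=$(( $1 ))
    available=$(( $2 ))
    [ "$available" -gt "$running" ]
}

# Firmware loaded 0xb00001f at power-on; the file on disk is newer.
if ucode_would_apply 0xb00001f 0xb000021; then
    echo "microcode from /lib/firmware would be applied"
fi

# The firmware already provides the same revision, so nothing is applied.
if ! ucode_would_apply 0xb000021 0xb000021; then
    echo "no update would be applied"
fi
```

This is why a BIOS/firmware update that already carries the newer revision avoids the OS-level update on these systems.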

Please contact your hardware vendor to determine whether more recent BIOS/firmware updates are recommended, as additional improvements may be available.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.