Hrtimers may expire early when a leap second is inserted

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 7 (RHEL 7)
  • Red Hat Enterprise Linux for Real Time (RHEL-RT)
  • Red Hat Enterprise MRG Realtime (MRG-RT)
  • Red Hat Enterprise Virtualization Hypervisor 3.6 (RHEV-H)
  • Red Hat Virtualization Hypervisor 4 (RHV-H)
  • System clock keeping time in UTC or localtime (both are potentially affected)

Issue

  • Hrtimers may expire early when a leap second is inserted. What happens next depends on what the affected hrtimer was doing: the application may crash, the server's CPUs may spike to 100% usage, or the entire server may crash.
  • After the leap second is inserted, CPU usage goes to 100% and stays there.

Resolution

  • If the issue has already occurred, it is necessary to restart the affected process(es), or reboot the server, to resume normal CPU usage.
  • Update to one of the following kernels to prevent this issue from occurring:
    • 7.2: kernel-3.10.0-327.41.4.el7
    • 7.3: kernel-3.10.0-514.2.2.el7
    • RHEL-RT: kernel-rt-3.10.0-514.2.2.rt56.424.el7
    • MRG-RT: kernel-rt-3.10.0-327.rt56.199.el6rt
  • RHEV-H and RHV-H customers should upgrade to one of the following to prevent this issue from occurring:
    • RHEV-H: rhev-hypervisor7-7.2-20161220.0.el6ev
    • RHV-H: redhat-virtualization-host-4.0-20161220.0.el7_3
  • The following kpatches have also been released to address this issue. To obtain kmod, so that the kernels may be patched, open a Support Case referencing this article and request kmod. The full instructions for using kmod are found in "Is live kernel patching (kpatch) supported in RHEL 7?" and "A Guide to kpatch on Red Hat Enterprise Linux 7.2 and Later".
    • 7.3.0: kernel-3.10.0-514.el7
    • 7.2.8: kernel-3.10.0-327.41.3.el7
    • 7.2.7: kernel-3.10.0-327.36.1.el7
    • 7.2.6: kernel-3.10.0-327.28.2.el7
    • 7.2.5: kernel-3.10.0-327.22.2.el7
    • async: kernel-3.10.0-327.28.3.el7
    • async: kernel-3.10.0-327.36.2.el7
    • async: kernel-3.10.0-327.36.3.el7

Note: kpatch is still restricted to Premium Entitlements for reactive requests (bug fixes). However, Standard Entitlements can use kpatch for certain high-profile issues that we try to address proactively, such as this leap second issue.
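
To see where a running kernel stands relative to the fixed builds listed above, the release string reported by uname -r can be compared against them. The following is a minimal sketch, not a supported tool: the function name kernel_has_fix() is hypothetical, and the sketch only knows the 3.10.0-327 (7.2) and 3.10.0-514 (7.3) streams. It returns -1 for anything else, including kernel-rt builds, and it cannot tell whether a kpatch has been applied to an otherwise unfixed kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Classify a `uname -r` style string against the fixed builds
 * 3.10.0-327.41.4 (7.2) and 3.10.0-514.2.2 (7.3).
 * Returns 1 if at or above the fixed build of its stream, 0 if below,
 * and -1 if the stream is not one this sketch knows about.
 */
static int kernel_has_fix(const char *release)
{
    int a, b, c, build, s1 = 0, s2 = 0;
    int min1, min2;

    /* kernel-rt uses a different numbering scheme; do not guess. */
    if (strstr(release, ".rt") != NULL)
        return -1;

    /* s1 and s2 keep their 0 defaults when the string has no z-stream part. */
    int n = sscanf(release, "%d.%d.%d-%d.%d.%d", &a, &b, &c, &build, &s1, &s2);
    if (n < 4 || a != 3 || b != 10 || c != 0)
        return -1;

    if (build == 327) {
        min1 = 41; min2 = 4;
    } else if (build == 514) {
        min1 = 2; min2 = 2;
    } else {
        return -1;
    }

    return (s1 > min1 || (s1 == min1 && s2 >= min2)) ? 1 : 0;
}
```

For example, kernel_has_fix("3.10.0-327.36.3.el7.x86_64") returns 0, while kernel_has_fix("3.10.0-514.2.2.el7.x86_64") returns 1.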

Workarounds

This issue is triggered by the insertion of a leap second and the kernel's handling of that event. Instead of fixing the kernel handler for the event (by installing a fixed kernel or using kpatch), any of the following workarounds may be considered. Any single item addresses the issue on its own:

  • Because the issue occurs when kernel code steps the clock through the leap second, this stepping can be handed over to userspace software, for example by using chrony with the leapsecmode step configuration.
  • The ntpd or chrony client may stay running and receive the leap second information from its sources, but be instructed not to have the kernel insert the leap second immediately. Instead, these daemons may be configured to slowly step or slew the time, introducing the leap second over a longer period. When using slew, the system's time will differ from the official time until the slew completes.
  • The upstream NTP server from which the affected system receives its time signal can be instructed not to hand down the leap second information, but instead to introduce the leap second slowly over a longer time. Currently only the smear functionality in chrony supports this option.
  • The system could also be configured to ignore the leap second by stopping daemons while the upstream servers make the announcement. This behavior is accomplished by running without ntpd, or another daemon, active on the server. If this workaround is chosen it is recommended to run with the latest tzdata package installed, and configure the system to use a right/* timezone. This option is not recommended for servers that require in-sync communication, or for applications that expect UTC time.
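
For the slew-based workarounds above, it can help to estimate how long the system's time stays offset from official time. The following is a rough sketch of that arithmetic only: the function name slew_duration_seconds() is hypothetical, and the slew rate is an input that must be taken from the actual daemon configuration. 500 ppm is used below as the commonly cited maximum slew rate for ntpd; treat it as an assumption to verify against the installed version.

```c
/*
 * Estimate how long it takes to slew away an offset at a constant rate.
 * offset_s  - offset to absorb, in seconds (1.0 for one leap second)
 * rate_ppm  - slew rate in parts per million (microseconds per second)
 * Returns the duration in seconds, or -1.0 for a non-positive rate.
 */
static double slew_duration_seconds(double offset_s, double rate_ppm)
{
    if (rate_ppm <= 0.0)
        return -1.0;
    /* At rate_ppm, the clock gains or loses rate_ppm microseconds per second. */
    return offset_s * 1e6 / rate_ppm;
}
```

With these inputs, absorbing one leap second at 500 ppm takes 2000 seconds (a little over 33 minutes), and at 1000 ppm it takes 1000 seconds. During that window, timestamps on the slewing system disagree with systems that stepped the clock.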

Root Cause

When an hrtimer expires during the leap second insertion, there is approximately a 1 in 500 chance that the hrtimer re-arms itself for the second in which the leap second will be inserted. When that happens, it may fall into a loop where the hrtimer expires and re-arms itself indefinitely. This behavior typically results in CPUs being fully consumed by the hrtimers rapidly firing and expiring, but the effects can vary widely because they depend entirely on what the particular hrtimer was doing. The likelihood of hitting this issue is much higher for a multi-threaded application.

This behavior occurred because the time was copied in update_wall_time() before the bookkeeping ran. That additional bookkeeping is not entirely read-only, so the copied values could be overwritten afterwards, leaving the time's state potentially inconsistent.

The following shows the faulty ordering from RHEL 7.2's update_wall_time():

memcpy(real_tk, tk, sizeof(*tk));                      <-- time copied
timekeeping_update(real_tk, clock_set);                <-- additional bookkeeping occurring, overwriting previous values
write_seqcount_end(&timekeeper_seq);
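
The ordering hazard can be illustrated outside the kernel with a toy model. The sketch below is not kernel code and does not reproduce the bug: the struct, its fields, and all function names are hypothetical stand-ins. It only shows that when a copy is taken first and non-read-only bookkeeping then runs on one side, the two copies diverge, whereas finishing the bookkeeping before the copy keeps them identical. The "copy last" variant is an assumption about what a corrected ordering looks like; it is not taken from this article.

```c
#include <string.h>

/* Toy stand-in for a timekeeper; field names are hypothetical. */
struct toy_tk {
    long long wall_sec;     /* accumulated wall-clock seconds */
    long long derived_sec;  /* value maintained by bookkeeping */
};

/* Bookkeeping that is not read-only: it writes a derived field. */
static void toy_bookkeeping(struct toy_tk *tk)
{
    tk->derived_sec = tk->wall_sec + 1;
}

/* Ordering as in the excerpt above: copy first, then bookkeeping on the copy. */
static void update_copy_first(struct toy_tk *real, struct toy_tk *shadow)
{
    shadow->wall_sec += 1;
    memcpy(real, shadow, sizeof(*shadow));
    toy_bookkeeping(real);      /* shadow never sees this write */
}

/* Assumed alternative: finish bookkeeping on the shadow, then copy last. */
static void update_copy_last(struct toy_tk *real, struct toy_tk *shadow)
{
    shadow->wall_sec += 1;
    toy_bookkeeping(shadow);
    memcpy(real, shadow, sizeof(*shadow));
}

/* Field-by-field comparison of the two copies. */
static int copies_agree(const struct toy_tk *a, const struct toy_tk *b)
{
    return a->wall_sec == b->wall_sec && a->derived_sec == b->derived_sec;
}
```

Starting from two zeroed structs, one call to update_copy_first() leaves copies_agree() false because only the real copy received the bookkeeping write, while one call to update_copy_last() leaves it true.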

Diagnostic Steps

There are multiple ways this issue may be noticed; the key is finding hrtimer calls in the backtrace. The Leap Second Issue Detector may be used to determine if a system may encounter this issue; however, there is no guarantee a vulnerable system will experience this particular issue.
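
Separately from the Leap Second Issue Detector, it may be useful to know whether the running kernel is currently armed to insert a leap second, since that is the precondition for this issue. The following minimal sketch queries the kernel's NTP state with a read-only adjtimex(2) call. The helper names are hypothetical, and the result says only whether an insertion is pending, not whether the kernel contains the fix.

```c
#include <string.h>
#include <sys/timex.h>

/* Pure helper: does a timex status word have the leap-insert flag set? */
static int status_has_pending_insert(int status)
{
    return (status & STA_INS) != 0;
}

/*
 * Returns 1 if the kernel is armed to insert a leap second,
 * 0 if it is not, and -1 if adjtimex() failed.
 */
static int leap_insert_pending(void)
{
    struct timex tx;

    memset(&tx, 0, sizeof(tx));     /* modes == 0 makes this a read-only query */
    int state = adjtimex(&tx);
    if (state == -1)
        return -1;

    /* TIME_INS is the clock state; STA_INS is the matching status flag. */
    return state == TIME_INS || status_has_pending_insert(tx.status);
}
```

A result of 1 shortly before the end of a leap second day indicates that the kernel has been told about the insertion, typically by ntpd or chronyd.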

The following methods may be used to determine whether this issue was encountered:


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.