RHSA-2015:1272 Moderate: kernel security, bug fix, and enhancement update

Updated 10 Sept 2015

The kernel packages contain the Linux kernel, the core of any Linux operating system. The kernel handles the basic functions of the operating system: memory allocation, process allocation, device input and output, etc.

This update fixes the following bugs:

Certain versions of the gcc compiler or the code for running virtual machines could, under specific circumstances, generate excessively large stack frames, which could in combination with other operations cause the stack to overflow and the system to terminate unexpectedly. This update expands the kernel stack size to 16 KB to account for higher requirements of some functions, thus preventing the crashes from occurring. (BZ#1045190)
Due to invalid zoning configurations, the user experienced multiple remote port (rport) disconnects. The race condition in the rport deletion code causing this bug has been fixed, and the node no longer panics when in recovery. (BZ#1102902)
This update introduces a set of patches with a new VLAN model to conform to upstream standards. In addition, this set of patches fixes other issues such as transmission of Internet Control Message Protocol (ICMP) fragments. (BZ#1135347)
When forwarding a packet, the TCPOPTSTRIP iptables target was using the tcp_hdr() function, which is not always suitable in that path. Consequently, TCPOPTSTRIP was looking at the wrong place in the packet and not matching the options, thus not stripping any. To fix this bug, instead of using tcp_hdr(), TCPOPTSTRIP now uses the TCP header itself to locate the option space, and the options are now stripped properly. (BZ#1135650)
Prior to this update, a freeze or a thaw could race with each other trying to set or clear the PF_FREEZING and PF_FROZEN flags lockless. Consequently, the PF_FROZEN flag in particular could be false-positive, which could trigger BUG_ON in cgroup freezer paths. With this fix, PF_FREEZING is no longer set lockless, and no race condition thus occurs in this situation. (BZ#1144478)
Previously, the kernel initialized the floating-point unit (FPU) state for the signal handler too early, right after the current state was saved for the sigreturn() function. As a consequence, a task could lose its FPU context if the signal delivery failed. The fix ensures that the drop_init_fpu() function is only called when the signal is delivered successfully, and FPU context is no longer lost in the described situation. (BZ#1196262)
Due to a malfunction in the EHCI driver stream scheduling logic, stream data transferred from kernel to user space failed under some circumstances. In this case, the cheese application failed trying to start a stream with the following error:

libv4l2: error turning on stream: No space left on device

This update adds a check in the driver for the specific case that caused the bug, and user space tools can now start streams without errors. (BZ#1145805)

On older systems without the QCI instruction, all possible domains are probed via TAPQ instruction. Prior to this update, a specification exception could occur when this instruction was called for probing values greater than 16; for example, during the execution of the "insmod" command or the reset of the AP bus on machines without the QCI instruction (z10, z196, z114). zEC12 and newer systems were not affected. Consequently, loading the z90crypt kernel module caused a panic. Now, the domain checking function has been corrected to limit the allowed range if no QCI information is available. As a result, users are able to successfully load and perform cryptographic functions with the z90crypt device driver. (BZ#1172137)
The kernel source code contained two definitions of the cpu_logical_map() function, which maps logical CPU numbers to physical CPU addresses. When translating the logical CPU number to the corresponding physical CPU number, the kernel used the second definition of cpu_logical_map(), which always used a one-to-one mapping of logical to physical CPU addresses. This mapping was, however, wrong after a reboot, especially if the target CPU was in the "stopped" state. Consequently, the system became unresponsive or showed unexpected latencies. With this update, the second definition of cpu_logical_map() has been removed. As a result, the kernel now correctly translates the CPU number to its physical address, and no unexpected latencies occur in this scenario. (BZ#1180061)
Due to a code simplification in the flow_cache_flush() function, the operating system became very slow when creating or deleting IPsec tunnels. This update makes sure that flow_cache_flush() no longer interrupts every core but just the ones that have cache entries, and thus no longer slows down the system performance. (BZ#1191559)
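The selective flush described above can be pictured with a small model. This is only a sketch: the per-CPU entry counts and the function name cpus_to_flush() are hypothetical and do not correspond to kernel data structures.

```python
def cpus_to_flush(cache_entries_per_cpu):
    """Return the indices of CPUs whose flow cache holds at least one entry.

    Hypothetical model: the list index is the CPU number and the value is
    the number of cached flow entries on that CPU.
    """
    return [cpu for cpu, entries in enumerate(cache_entries_per_cpu) if entries > 0]


# Four CPUs, only two of which have anything cached.
entries = [0, 12, 0, 3]
print(cpus_to_flush(entries))  # only CPUs 1 and 3 would be interrupted
```

In this model an idle system with empty caches interrupts no CPU at all, which is the property that keeps tunnel creation and deletion from slowing every core.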
When one thread tried to clear the PF_USED_MATH flag while at the same time another attempted to flip the PF_SPREAD_PAGE/PF_SPREAD_SLAB flags, a race condition occurred. As a consequence, the system terminated unexpectedly at a later point in time on return to the user space boundary. This fix makes PF_SPREAD_PAGE and PF_SPREAD_SLAB flags atomic, and the system thus no longer crashes in this scenario. (BZ#1045310)
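To illustrate why two non-atomic updates to one flags word can lose a change, the following sketch replays a single bad interleaving by hand. The bit values are made up for illustration and are not the kernel's actual constants; the "atomic" variant simply models each update as one indivisible step.

```python
# Illustrative bit values only; not the kernel's actual flag constants.
PF_USED_MATH = 0x1
PF_SPREAD_PAGE = 0x2


def racy_interleaving(flags):
    """Replay one bad interleaving of two non-atomic read-modify-write updates."""
    a_seen = flags                   # thread A reads the word
    b_seen = flags                   # thread B reads the same old word
    flags = a_seen & ~PF_USED_MATH   # A writes back with PF_USED_MATH cleared
    flags = b_seen | PF_SPREAD_PAGE  # B writes back its stale copy: A's clear is lost
    return flags


def atomic_interleaving(flags):
    """Each update reads and writes the current word as one indivisible step."""
    flags &= ~PF_USED_MATH
    flags |= PF_SPREAD_PAGE
    return flags


start = PF_USED_MATH
racy = racy_interleaving(start)
safe = atomic_interleaving(start)
print(f"racy={racy:#x} safe={safe:#x}")  # racy=0x3 safe=0x2
```

In the racy result PF_USED_MATH is still set even though one thread cleared it, which is the kind of stale state that can surface later as an unexpected failure.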
Previously, the hardware always provided complement of the IP pseudo checksum. However, the TCP stack expected the whole packet checksum without pseudo checksum if the CHECKSUM_COMPLETE variable was set. Consequently, the enic driver checksum returned an error when working with Open vSwitch (OVS). With this update, the hardware verifies IP and TCP/UDP header checksum but does not provide payload checksum, and uses CHECKSUM_UNNECESSARY. As a result, enic checksum errors are no longer returned with OVS. (BZ#1115505)
When KVM previously took a page fault with interrupts disabled and the page fault handler attempted to take a lock, Kernel Shared Memory (KSM) sent an inter-processor interrupt (IPI) while taking the same lock. As a consequence, KSM waited for the IPI to be processed while KVM waited for KSM to release the lock before processing the IPI, which led to a deadlock scenario. To fix this bug, operations that can page fault while interrupts are disabled are avoided, and KVM and KSM no longer bring each other to a deadlock. (BZ#1116398)
Previously, the bridge device did not propagate VLAN information to its ports and Generic Receive Offload (GRO) information to devices that sit on top. This resulted in lower receive performance of VLANs over bridge devices because GRO was not enabled. An attempt to resolve this problem was made with BZ 858198 by introducing a patch that allows VLANs to be registered with the participating bridge ports and adds GRO to the bridge device feature set; however, that attempt introduced a number of regressions, which broke the vast majority of stacked setups involving bridge devices and VLANs. This update reverts the patch provided by BZ 858198 and removes support for this capability. (BZ#1121991)
The gfs2_convert utility could previously introduce incorrect values for the ondisk inode di_goal_meta field. As a consequence, the gfs2 kernel returned the EBADSLT error on such inodes and did not allow creation of new files in directories or new blocks in regular files. The fix allows gfs2 to set a sensible goal value if a corrupt one is encountered and to proceed with normal operations. As a result, gfs2 no longer returns EBADSLT, implicitly fixes any corrupt goal values, and no longer disrupts normal operations. (BZ#1130684)
Previously, attempts to set up very low transmit interrupt latency on ixgbe adapters did not work correctly: transmit interrupts could not be set lower than the default of eight buffered tx frames, even though the minimum possible is one frame. Depending on several factors, this could lead to Transmission Control Protocol (TCP) transmit delays. This update removes the minimum of eight buffered frames before transmit when Large Receive Offload (LRO) is disabled and tx-usec is set to zero, and allows a single frame to trigger a transmit. As a result, transmit delays in this mode are minimized. (BZ#1132267)
If an application has closed a TCP socket and that TCP connection was stalled on zero-window probes, it is now always aborted after both maximum backoff and retransmit timeout have been reached. (BZ#1215924)
Previously, running the clock_gettime() function quickly in a loop could result in a jump back in time. As a consequence, programs could behave unexpectedly when they assumed that clock_gettime() returned equal or increasing times in subsequent calls. With this update, if the time delta between calls is negative, the clock is not updated, and subsequent calls to clock_gettime() are guaranteed to return a time greater than or equal to the previous call. (BZ#1140024)
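The "do not update on a negative delta" rule can be sketched in user space as follows. Both helper names are hypothetical. clamp_monotonic() models the rule on a list of raw readings, and count_backward_steps() is the kind of tight loop that would expose the bug. On Linux, Python's time.monotonic() is normally backed by clock_gettime(CLOCK_MONOTONIC), but that is a property of the interpreter, not something this advisory states.

```python
import time


def clamp_monotonic(raw_readings):
    """Return readings where a negative delta leaves the clock unchanged."""
    clamped = []
    last = None
    for t in raw_readings:
        if last is None or t >= last:
            last = t          # non-negative delta: accept the new time
        clamped.append(last)  # negative delta: the previous value is repeated
    return clamped


def count_backward_steps(clock=time.monotonic, samples=100_000):
    """Call the clock in a tight loop and count readings that go backwards."""
    backward = 0
    prev = clock()
    for _ in range(samples):
        now = clock()
        if now < prev:
            backward += 1
        prev = now
    return backward


print(clamp_monotonic([1.0, 2.0, 1.5, 3.0]))  # [1.0, 2.0, 2.0, 3.0]
print(count_backward_steps(samples=10_000))
```

A nonzero count from the loop would indicate the jump-back behavior described above.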
Due to a regression, when large reads which partially extended beyond the end of the underlying device were done, the raw driver returned the EIO error code instead of returning a short read covering the valid part of the device. The underlying source code has been patched, and the raw driver now returns a short read for the remainder of the device. (BZ#1142314)
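The expected short-read behavior can be demonstrated on an ordinary temporary file, used here as a stand-in because a raw device needs special setup. The helper read_past_end() is hypothetical. os.pread() is the standard positional read available on Unix-like systems.

```python
import os
import tempfile


def read_past_end(size, offset, length):
    """Read `length` bytes at `offset` from a `size`-byte file; return what was read."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * size)
        f.flush()
        # A request that runs past the end should return only the bytes that exist.
        return os.pread(f.fileno(), length, offset)


data = read_past_end(size=10, offset=4, length=16)
print(len(data))  # 6
```

A caller that treats any result shorter than the request as an error would mishandle this case, so code reading near the end of a device should loop on the returned length instead.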
Prior to this update, cgroup blocked new threads from joining the target threadgroup during cgroup migration, which led to a race against the exec() and exit() functions, and a consequent kernel panic. This bug has been fixed by extending threadgroup locking so that it covers all operations which can alter the threadgroup: fork(), exit(), and exec(), and cgroup migration no longer causes the kernel to panic. (BZ#1169225)
The hrtimer_start() function previously attempted to reinsert a timer which was already defined. As a consequence, the timer node pointed to itself and the rb_insert_color() function entered an infinite loop. This update prevents the hrtimer_enqueue_reprogram() function from racing and makes sure the timer state in remove_hrtimer() is preserved, thus fixing the bug. (BZ#1136958)
NFS root exports with multiple security flavors could previously cause an NFS client to choose the wrong security while establishing new opens or recovering from an error. Consequently, the NFS client entered a loop, and appeared to be unresponsive. With this update, clients no longer combine the SETCLIENTID_CONFIRM and PUTROOTFH operations, as the required security for PUTROOTFH may be different than that required for SETCLIENTID_CONFIRM. Now, the NFS client no longer experiences a soft lockup. (BZ#1143013)
When using 4th Generation Intel Core or Intel Xeon v3 Processor perf counters, such as perf or perftop, spurious Nonmaskable Interrupts (NMIs) were previously received, even under moderate load, filling the logs with NMI messages. In addition, if kdump was configured, a kernel panic occurred and a kernel dump was saved. The perf code has been fixed, and the system no longer panics in this scenario. (BZ#1145027)
Due to a race condition flaw between the sock_queue_err_skb() function and sk_forward_alloc handling in the socket error queue (MSG_ERRQUEUE), the kernel could occasionally, for example when using the Precision Time Protocol (PTP), incorrectly track allocated memory for the error queue, in which case a traceback could occur in the system log. With this update, the race condition flaw has been fixed, and memory integrity is now kept and no such traceback appears in the log. (BZ#1148257)
Previously, the NVMe driver incorrectly set the QUEUE_FLAG_STACKABLE flag indicating it supports request stacking. As a consequence, NVMe failed to load and the operating system terminated unexpectedly. This update deletes the stackable flag, and NVMe now loads successfully when used with the Device Mapper kernel component. (BZ#1155715)
Previously, some KVM hosts were intermittently having problems allocating interrupts to guests after they booted. As a consequence, configuring guests could fail when CIAA/D registers were being accessed. This update removes the read/write operations to the CIAA/D registers and makes use of standard kernel functions for accessing the PCI config space. In addition, the thixgbevf_check_for_bad_vf() function has been moved into the watchdog subtask, which reduces the frequency of the checks. Now, KVM hosts allocate interrupts to guests as expected. (BZ#1156061)
Prior to this update, the tcp_collapse() function could get into an infinite loop when copying socket buffers, which made the operating system unresponsive. This update provides a patch to fix tcp_collapse(), and the operating system no longer hangs in the described situation. (BZ#1156289)
When it was time to shut down the OpenIB module, unbalanced joins and leaves of multicast groups resulted in dangling references. Consequently, the module shutdown sequence became unresponsive waiting for the unbalanced joins and leaves to balance out. This update corrects the unbalanced joins and leaves and also resolves several race conditions through locking improvements, which prevent the module from joining the same group twice or failing to leave a group once joined. As a result, the module now shuts down properly and no longer hangs the system on reboot. (BZ#1159925)
When the system recognized an Internet Group Management Protocol (IGMP) or a Multicast Listener Discovery (MLD) query, the kernel assumed the other one was also present, which is not always true. As a consequence, the system could potentially suffer from multicast packet loss if there was just either an IGMP or an MLD querier. To fix this bug, the querier has been split into two distinct timers, one for handling each of the protocols. Now, snooping deactivates more selectively according to the protocol and avoids packet drops. (BZ#1167003)
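The difference between one shared querier timer and one timer per protocol can be modeled as below. The class names, the `interval` parameter, and the "igmp"/"mld" keys are all hypothetical; this is a sketch of the idea, not the bridge code.

```python
class SharedQuerierState:
    """Old model: a query for either protocol refreshes one shared timer."""

    def __init__(self, interval):
        self.interval = interval
        self.expires = 0.0

    def on_query(self, proto, now):
        self.expires = now + self.interval

    def querier_present(self, proto, now):
        return now < self.expires


class SplitQuerierState:
    """New model: each protocol has its own timer."""

    def __init__(self, interval):
        self.interval = interval
        self.expires = {"igmp": 0.0, "mld": 0.0}

    def on_query(self, proto, now):
        self.expires[proto] = now + self.interval

    def querier_present(self, proto, now):
        return now < self.expires[proto]


shared = SharedQuerierState(interval=255.0)
split = SplitQuerierState(interval=255.0)
for state in (shared, split):
    state.on_query("igmp", now=0.0)  # only an IGMP querier exists on this network

print(shared.querier_present("mld", now=1.0))  # True: wrongly assumes an MLD querier
print(split.querier_present("mld", now=1.0))   # False: MLD is handled separately
```

With the split state, seeing only IGMP queries no longer implies that an MLD querier exists, so each protocol's snooping decision can be made independently.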
The USB core uses the "hcpriv" member of the USB request block to determine whether a USB Request Block (URB) is active, but the ehci-hcd driver was not setting this correctly when it queued isochronous URBs. This in combination with a defect in the snd-usb-audio driver could cause URBs to be reused without waiting for them to complete. Consequently, list corruption followed by system freeze or a kernel crash occurred. To fix this problem, the ehci-hcd driver code has been updated to properly set the "hcpriv" variable for isochronous URBs, and the snd-usb-audio driver has been updated to synchronize pending stop operations on an endpoint before continuing with preparing the PCM (Pulse Code Modulation) stream. As a result, list corruption followed by system freeze or a crash no longer occurs. (BZ#1167059)
For private futexes (fast userspace mutexes), the get_futex_key_refs() function previously completed without a memory barrier. Consequently, a race condition with a thread waiting on a futex on another CPU occurred. An upstream patch set has been backported, which resolves the bug by explicitly adding a memory barrier. (BZ#1167405)
When a virtual SCSI disk was added to a virtual Red Hat Enterprise Linux guest running on Microsoft Hyper-V from Windows Server 2008R2 or earlier, it appeared as eight different /dev/sd* devices inside the guest, which could lead to a failure of some operations with this disk through any of these devices. A kernel patch has been applied to set proper storvsc driver limits based on the host's Hyper-V version, and added virtual SCSI disks now appear as a single /dev/sd* device within the guest. As a result, operations with the disk succeed without issues. (BZ#1174168)
If the user attempted to apply a firmware update when running the tg3 module provided with Red Hat Enterprise Linux 6.6 kernels, the update always failed. This could leave the Network Interface Controller (NIC) hardware in an unusable state and could even prevent the entire system from booting. The provided patch increases the timeout value for the nvram command execution to a sufficient value and also switches from the "delay" to "sleep" mechanism to avoid soft-lockups. As a result, firmware can now be updated successfully. (BZ#1176230)
When using the fibre channel driver, a race condition in the scsi_remove_target() function occurred during rport deletion. Consequently, the kernel terminated unexpectedly when dereferencing starget->state which contained an invalid address. To fix this bug, upstream reference counting infrastructure has been reverted, and the system thus no longer crashes. (BZ#1168072)
Prior to this update, very high latency was observed for small synchronous writes (O_DSYNC) within a KVM guest. For example, KVM using the virtio-blk block device only achieved 30% of the bare metal performance for synchronous writes to a local disk controller with battery-backed write-back cache. This update introduces a new module parameter for the KVM module which assists with latency-bound workloads. When using this parameter on the host, guests can achieve an improved rate of synchronous writes. (BZ#1185250)
The perf packages have been rebased to align with upstream version 3.16. This update also includes all fixes and enhancements from version 3.13, 3.14, and 3.15. There are a number of modified and added parameters for various perf subcommands, as well as a large number of background enhancements. (BZ#1188336)
Prior to this update, the GFS2 file system's "Splice Read" operation, which is used for functions such as sendfile(), was not properly allocating a required multi-block reservation structure in memory. As a consequence, when the GFS2 block allocator was called to assign blocks of data, it tried to dereference the structure, which resulted in a kernel panic. Now, GFS2's "Splice read" operation has been changed so that it properly allocates the necessary reservation structure in memory prior to calling the block allocator. As a result, sendfile() now works properly for GFS2. (BZ#1193559)
A coding error in the e100 driver update caused improper initialization for certain Physical Layer Interfaces (PHYs) of the OSI model. As a consequence, while the driver was authenticating a device and setting it up for use during initialization, the device sometimes exhibited RX errors, slow throughput, and so on, mostly with long Unshielded Twisted Pair (UTP) cabling. This update fixes the coding error, and devices now work with long UTP cables as expected. (BZ#1156417)
Red Hat Enterprise Linux 6.6 incorporated a patch to fix a readahead() failure condition where the max_sane_readahead() function returned zero (0) on a CPU whose NUMA node had no local memory (BZ#862177).
Parallel file-extending direct I/O writes could previously race to update the size of the file. If they executed out of order, the file size could move backwards and push a previously completed write beyond the end of the file, causing it to be lost. With this update, file size updates always execute in appropriate order, thus fixing this bug. (BZ#1198440)
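The effect of out-of-order size updates can be shown with a toy model in which each completing write reports its end offset. Both functions are hypothetical. The second one uses a "size only grows" rule as a simple stand-in that gives the same end result in this model; the advisory itself only says that the kernel now executes the updates in the appropriate order.

```python
def final_size_naive(completions):
    """Each completing write stores its own end offset, whatever the order."""
    size = 0
    for end_offset in completions:
        size = end_offset
    return size


def final_size_ordered(completions):
    """The size only ever grows, so a late smaller update cannot shrink the file."""
    size = 0
    for end_offset in completions:
        size = max(size, end_offset)
    return size


# Two extending writes end at 4096 and 8192, but complete in reverse order.
out_of_order = [8192, 4096]
print(final_size_naive(out_of_order))    # 4096: the write ending at 8192 is cut off
print(final_size_ordered(out_of_order))  # 8192
```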
With certain SMI VGA cards, when copying to or from the VGA memory, the VGA card on a 64-bit system caused console corruption. A workaround avoiding 64-bit transactions has been implemented, which prevents the console from corruption in the described scenario. (BZ#1132826)
Prior to this update, the "--queue-balance" option did not distribute traffic over multiple queues as the option ignored a request to balance among the given range and only used the first queue number given. As a consequence, the kernel traffic was limited to one queue. The underlying source code has been patched, and the kernel traffic is now balanced within the given range. (BZ#1210697)
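Balancing over a range of queue numbers can be sketched as hashing each flow into the inclusive range. This is an assumption-laden model: the flow identifiers, the use of CRC32 as the hash, and both function names are illustrative and are not taken from the kernel implementation.

```python
import zlib


def pick_queue(flow_id, first, last):
    """Map a flow (bytes) to a queue number in the inclusive range [first, last]."""
    span = last - first + 1
    return first + zlib.crc32(flow_id) % span


def pick_queue_buggy(flow_id, first, last):
    """Model of the reported behavior: the range is ignored, only `first` is used."""
    return first


# Hypothetical flow identifiers, balanced over queues 0 to 3.
flows = [f"10.0.0.{i}:{1000 + i}".encode() for i in range(200)]
used = {pick_queue(f, 0, 3) for f in flows}
used_buggy = {pick_queue_buggy(f, 0, 3) for f in flows}
print(sorted(used), sorted(used_buggy))
```

Hashing on a per-flow identifier keeps all packets of one flow on the same queue while still spreading different flows across the range.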
Due to a bug in the sysv semaphore code, trying to use a semaphore array while it was being initialized or removed could lead to a kernel oops. This update fixes the race condition causing this bug, and a kernel oops no longer appears when a semaphore is being used while the semaphore array it is in is being initialized or removed. (BZ#1165277)
Due to a race condition, deleting a cgroup while pages belonging to that group were being swapped in could trigger a kernel crash. This update fixes the race condition, and deleting a cgroup is now safe even under heavy swapping. (BZ#1168185)
When the load rose and run queues were busy due to the effects of the enqueue_entity() function, tasks with large sched_entity.vruntime values could previously be prevented from using the CPU time. A patch eliminating the entity_key() function in the sched_fair.c latency value has been backported from upstream, and all tasks are now provided with fair CPU runtime. (BZ#1124603)
Due to bad memory or memory corruption, an isolated BUG_ON(mm->nr_ptes) was sometimes reported, indicating that not all the page tables allocated could be found and freed when the exit_mmap() function cleared the user address space. As a consequence, a kernel panic occurred. To fix this bug, the BUG_ON() function has been replaced by WARN_ON(), which prevents the kernel from panicking in the aforementioned situation. (BZ#1168780)

In addition, this update adds the following enhancement:

Functions that grab locks should not be called in critical tracepoint hooks. For this reason, the provided patch exports the tracing clock functions so that they may be used in tracepoint hooks. (BZ#1212502)

Users of kernel are advised to upgrade to these updated packages, which fix these bugs and add this enhancement. The system must be rebooted for this update to take effect.

[Updated 10 September 2015]
Due to a bug in the driver core, a device driver's request for a deferred probe was ignored. As a consequence, the device was left with no driver attached as the driver core failed to reprobe the device. A set of upstream patches has been backported to fix this bug, and the device is now probed successfully. (BZ#1149614)

Product(s)

Red Hat Enterprise Linux

Components

kernel

Article Type

General