When to avoid the kernel.hung_task_panic parameter?
Environment
- Red Hat Enterprise Linux 5.5 (kernel-2.6.18-194) or above
- Red Hat Enterprise Linux 6, 7, 8, 9
- D state (uninterruptible sleep) processes
Issue
- When should the 'kernel.hung_task_panic' parameter not be used?
Resolution
The Detect Hung Task kernel thread (khungtaskd) was added in the Red Hat Enterprise Linux 5.5 kernel (2.6.18-194). This functionality is also present in Red Hat Enterprise Linux 6 and later.
The khungtaskd thread detects tasks that have been stuck in state D for longer than the period set by the kernel.hung_task_timeout_secs sysctl parameter (default 120 seconds). When it finds one, it logs a message along with a trace, which go to /var/log/messages by default. This lets administrators see whether there may be a problem with these processes. Note, however, that any process coded to sleep indefinitely in state D ('TASK_UNINTERRUPTIBLE') triggers the khungtaskd detection even when nothing is wrong with the process, because the sleep is intentional.
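As a rough illustration of the parameters mentioned above, the following Python sketch reads the current hung task settings from procfs. It assumes the sysctl values are exposed under /proc/sys/kernel, which is the usual location; the function name read_hung_task_settings and its sysctl_dir argument are hypothetical names chosen for this example.

```python
from pathlib import Path

# Usual procfs location of the kernel.* sysctl parameters (assumption).
SYSCTL_DIR = Path("/proc/sys/kernel")


def read_hung_task_settings(sysctl_dir=SYSCTL_DIR):
    """Return the hung task sysctl values as ints, or None where unavailable."""
    settings = {}
    for name in ("hung_task_timeout_secs", "hung_task_panic"):
        try:
            settings[name] = int((Path(sysctl_dir) / name).read_text().strip())
        except (OSError, ValueError):
            # File missing or unreadable, e.g. hung task detection not built in.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_hung_task_settings().items():
        print(f"kernel.{name} = {value}")
```

The directory is passed as an argument only so the function can be exercised against a test directory; on a live system the default should be sufficient.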
When coding applications against the Linux kernel, having a process sleep indefinitely in state D ('TASK_UNINTERRUPTIBLE') is not recommended. The key word is 'indefinitely': doing so goes against the philosophy and workings of the Linux kernel. The khungtaskd kernel thread was developed to detect any process that stays in state D for an extended period of time, regardless of how the process was coded.
This does not mean that 'TASK_UNINTERRUPTIBLE' should be avoided altogether when coding against the Linux kernel. There are several use cases where a process needs to sleep in 'TASK_UNINTERRUPTIBLE' while it waits for other key processes to complete. Sleeping indefinitely in this state, however, should be avoided where possible.
If this cannot be avoided, enabling the kernel.hung_task_panic parameter is not recommended. With the parameter enabled, the kernel panics when khungtaskd detects a hung task, so an intentional long sleep in state D would induce a kernel panic based on a false positive. Processes coded according to the conventions of other Unix variants may not be compatible with the use of kernel.hung_task_panic. For this reason, involving the vendor or developer of the process in question is recommended, to determine whether remaining in state D ('TASK_UNINTERRUPTIBLE') is expected behavior for that process.
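To identify which processes to raise with a vendor or developer, it can help to list the processes currently in state D. The Python sketch below does this by parsing /proc/<pid>/stat. It is a one-shot snapshot under stated assumptions: it looks only at the main thread of each process, and it does not measure how long a process has been in state D, so it is not a substitute for the khungtaskd messages. The helper names parse_stat_state and list_d_state_processes are hypothetical.

```python
from pathlib import Path


def parse_stat_state(stat_text):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def list_d_state_processes(proc_dir="/proc"):
    """Return sorted (pid, comm) pairs for processes currently in state D."""
    found = []
    for entry in Path(proc_dir).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            pid, comm, state = parse_stat_state((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or stat was unreadable, during the scan
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in list_d_state_processes():
        print(f"{pid}\t{comm}")
```

A process that appears in this list repeatedly over several minutes, and that the vendor confirms is designed to wait in state D, is the kind of case where kernel.hung_task_panic would panic the system on a false positive.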