The IO time out values set in IBM lin_tape driver causes hung task panic
Environment
- Red Hat Enterprise Linux 6, 7, 8, 9
- IBM lin_tape driver
Issue
-
A server using the IBM lin_tape driver experienced a crash due to triggering the hung task (i.e. khungtaskd) panic condition.
INFO: task dsmserv:3563 blocked for more than 500 seconds. Not tainted 2.6.32-504.16.2.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. dsmserv D 0000000000000000 0 3563 5207 0x00000080 ffff8809503c77e8 0000000000000082 0000000000011220 ffff8809503c7788 ffff8809503c77a0 ffff881074fddc70 ffff8809503c7758 ffffffff81126b95 ffff8809503c77e8 ffffffff81126d33 ffff880ce081f068 ffff8809503c7fd8 Call Trace: [<ffffffff81126b95>] ? mempool_alloc_slab+0x15/0x20 [<ffffffff81126d33>] ? mempool_alloc+0x63/0x140 [<ffffffff81380069>] ? scsi_get_command+0x89/0xc0 [<ffffffff8152ad25>] schedule_timeout+0x215/0x2e0 [<ffffffff8128e1e7>] ? kobject_put+0x27/0x60 [<ffffffff8152a9a3>] wait_for_common+0x123/0x180 [<ffffffff81064bc0>] ? default_wake_function+0x0/0x20 [<ffffffff8152aabd>] wait_for_completion+0x1d/0x20 [<ffffffff8127791c>] blk_execute_rq+0x8c/0xf0 [<ffffffff812714d0>] ? freed_request+0x50/0x80 [<ffffffff812720e6>] ? blk_get_request+0x46/0xa0 [<ffffffffa0514d98>] lin_tape_blk_rq+0x138/0x590 [lin_tape] [<ffffffffa051523a>] tape_execute_scsi_command+0x4a/0xc0 [lin_tape] [<ffffffffa0514192>] ? tape_log_cdb+0x32/0x110 [lin_tape] [<ffffffffa051a65f>] tape_send_scsi_io+0x1ff/0x4a0 [lin_tape] [<ffffffff810b2a3d>] ? get_futex_key+0x18d/0x2d0 [<ffffffffa051a97b>] tape_send_scsi_cmd+0x7b/0x2d0 [lin_tape] [<ffffffff810b231a>] ? futex_wait_queue_me+0xba/0xf0 [<ffffffffa051fd8f>] tape_set_pos+0x23f/0x520 [lin_tape] [<ffffffffa0503ae6>] stiocsetpos+0x16/0x20 [lin_tape] [<ffffffffa0501214>] drvioc_exe+0xd4/0x150 [lin_tape] [<ffffffffa050e5fd>] lin_tape_drive_ioctl+0x50d/0x1990 [lin_tape] [<ffffffffa0503ad0>] ? stiocsetpos+0x0/0x20 [lin_tape] [<ffffffffa04f8339>] lin_tape_ioctl_drive+0x169/0x260 [lin_tape] [<ffffffffa04f84cd>] lin_tape_ioctl+0x9d/0x100 [lin_tape] [<ffffffff811a3a9c>] vfs_ioctl+0x7c/0xa0 [<ffffffff811a3be4>] do_vfs_ioctl+0x84/0x580 [<ffffffff81194123>] ? sys_newfstat+0x33/0x40 [<ffffffff811a4161>] sys_ioctl+0x81/0xa0 [<ffffffff810e5b6e>] ? __audit_syscall_exit+0x25e/0x290 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Kernel panic - not syncing: hung_task: blocked tasks Pid: 179, comm: khungtaskd Not tainted 2.6.32-504.16.2.el6.x86_64 #1 Call Trace: [<ffffffff8152971c>] ? panic+0xa7/0x16f [<ffffffff810ea046>] ? watchdog+0x246/0x250 [<ffffffff810e9e00>] ? watchdog+0x0/0x250 [<ffffffff8109e71e>] ? kthread+0x9e/0xc0 [<ffffffff8100c20a>] ? child_rip+0xa/0x20 [<ffffffff8109e680>] ? kthread+0x0/0xc0 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Resolution
-
Due to the large IO time out values set by the IBM lin_tape driver it is recommended to disable
kernel.hung_task_panic -
Edit the
/etc/sysctl.conffile and modify thekernel.hung_task_panicvalue to zerokernel.hung_task_panic = 0
Root Cause
I/O requests time out values from IBM lin_tape driver can be set to value higher than kernel.hung_task_timeout_secs, which can trigger the hung task panic condition when the kernel.hung_task_panic is enabled.
Diagnostic Steps
-
Obtain the vmcore
crash> ps -m | grep UN [0 00:00:00.006] [UN] PID: 4474 TASK: ffff880a4cf1e040 CPU: 5 COMMAND: "dsmserv" [0 00:00:00.025] [UN] PID: 11084 TASK: ffff880100e35520 CPU: 5 COMMAND: "dsmserv" [0 00:00:00.962] [UN] PID: 6693 TASK: ffff880079cb1520 CPU: 6 COMMAND: "dsmserv" [0 00:08:31.626] [UN] PID: 3563 TASK: ffff880ce081eab0 CPU: 0 COMMAND: "dsmserv" crash> bt 3563 PID: 3563 TASK: ffff880ce081eab0 CPU: 0 COMMAND: "dsmserv" #0 [ffff8809503c7718] schedule at ffffffff81529dd0 #1 [ffff8809503c77f0] schedule_timeout at ffffffff8152ad25 #2 [ffff8809503c78a0] wait_for_common at ffffffff8152a9a3 #3 [ffff8809503c7930] wait_for_completion at ffffffff8152aabd #4 [ffff8809503c7940] blk_execute_rq at ffffffff8127791c #5 [ffff8809503c79f0] lin_tape_blk_rq at ffffffffa0514d98 [lin_tape] #6 [ffff8809503c7a90] tape_execute_scsi_command at ffffffffa051523a [lin_tape] #7 [ffff8809503c7af0] tape_send_scsi_io at ffffffffa051a65f [lin_tape] #8 [ffff8809503c7bf0] tape_send_scsi_cmd at ffffffffa051a97b [lin_tape] #9 [ffff8809503c7c50] tape_set_pos at ffffffffa051fd8f [lin_tape] #10 [ffff8809503c7cc0] stiocsetpos at ffffffffa0503ae6 [lin_tape] #11 [ffff8809503c7cd0] drvioc_exe at ffffffffa0501214 [lin_tape] #12 [ffff8809503c7d40] lin_tape_drive_ioctl at ffffffffa050e5fd [lin_tape] #13 [ffff8809503c7dd0] lin_tape_ioctl_drive at ffffffffa04f8339 [lin_tape] #14 [ffff8809503c7e20] lin_tape_ioctl at ffffffffa04f84cd [lin_tape] #15 [ffff8809503c7e60] vfs_ioctl at ffffffff811a3a9c #16 [ffff8809503c7ea0] do_vfs_ioctl at ffffffff811a3be4 #17 [ffff8809503c7f30] sys_ioctl at ffffffff811a4161 #18 [ffff8809503c7f80] system_call_fastpath at ffffffff8100b072 RIP: 00000036dbce08c7 RSP: 00007ffdcd804bc0 RFLAGS: 00010202 RAX: 0000000000000010 RBX: ffffffff8100b072 RCX: 0000000000000afd RDX: 00007ffdcd805c10 RSI: ffffffffc04c7a34 RDI: 0000000000000057 RBP: 00000000c04c7a34 R8: 0000000000000000 R9: 00007fff88062028 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d9c4e2 R13: 00007ffdc2794a38 R14: 00007ffdcd805c10 R15: 00007ffdc2794a38 ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b crash> -
The hung task (khungtaskd) got triggered as dsmserv was in
TASK_UNINTERRUPTIBLEstate for 8.5 minutes which was waiting for I/O request to finish. Check what was the time out of the request being processed:crash> whatis blk_execute_rq int blk_execute_rq(struct request_queue *, struct gendisk *, struct request *, int); -
The request is passed as 3rd parameter in
RDXcrash> struct -o request.timeout struct request { [0x138] unsigned int timeout; } -
The time out is on offset
0x138instruct request -
Reference to stuct request copied from
R12toRDXcrash> dis -r ffffffffa0514d98 | tail -10 0xffffffffa0514d65 <lin_tape_blk_rq+261>: mov %r12,%rdx 0xffffffffa0514d68 <lin_tape_blk_rq+264>: mov 0x38(%r12),%rdi 0xffffffffa0514d6d <lin_tape_blk_rq+269>: xor %esi,%esi 0xffffffffa0514d6f <lin_tape_blk_rq+271>: mov $0x1,%ecx 0xffffffffa0514d74 <lin_tape_blk_rq+276>: mov %rax,0x118(%r12) 0xffffffffa0514d7c <lin_tape_blk_rq+284>: mov 0x10(%rbp),%eax 0xffffffffa0514d7f <lin_tape_blk_rq+287>: movl $0x0,0x13c(%r12) 0xffffffffa0514d8b <lin_tape_blk_rq+299>: mov %eax,0x138(%r12) 0xffffffffa0514d93 <lin_tape_blk_rq+307>: callq 0xffffffff81277890 <blk_execute_rq> crash> dis -rl blk_execute_rq+20 /usr/src/debug/kernel-2.6.32-504.16.2.el6/linux-2.6.32-504.16.2.el6.x86_64/block/blk-exec.c: 93 0xffffffff81277890 <blk_execute_rq>: push %rbp 0xffffffff81277891 <blk_execute_rq+1>: mov %rsp,%rbp 0xffffffff81277894 <blk_execute_rq+4>: push %r12 0xffffffff81277896 <blk_execute_rq+6>: push %rbx 0xffffffff81277897 <blk_execute_rq+7>: sub $0x90,%rsp 0xffffffff8127789e <blk_execute_rq+14>: nopl 0x0(%rax,%rax,1) 0xffffffff812778a3 <blk_execute_rq+19>: mov %rdx,%rbx crash> bt -f 3563 #4 [ffff8809503c7940] blk_execute_rq at ffffffff8127791c ffff8809503c7948: 0000000000000000 0000000000010001 ffff8809503c7958: ffff8809503c78e0 ffff8809503c78e0 ffff8809503c7968: ffff8809503c7988 ffffffff812714d0 ffff8809503c7978: ffff880870a584c0 ffff88086ead6d28 ffff8809503c7988: ffff8809503c79b8 ffff88086ead6d28 ffff8809503c7998: 0000000000000000 00000000000000d0 ffff8809503c79a8: 0000000000000000 0000000000000000 ffff8809503c79b8: ffff8809503c79e8 ffffffff812720e6 ffff8809503c79c8: ffff880870a584c0 0000000076ac13f1 ffff8809503c79d8: 0000000000000000 ffff88086eb83320 ^ struct request ffff8809503c79e8: ffff8809503c7a88 ffffffffa0514d98 crash> struct request.timeout ffff88086eb83320 timeout = 1440000 crash>
The time out is set to 1440000ms (24 minutes) by IBM lin_tape driver. As the time out is set higher than default kernel.hung_task_panic threshold (120 seconds), waiting for the I/O resulted in triggering the crash.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.