How to use the SysRq Facility to collect information from a RHEL system


Environment

  • Red Hat Enterprise Linux

Issue

  • How to use the SysRq Facility to collect information from a system which is frozen or unresponsive?
  • How to manually force a crash in system hang conditions?
  • If the system goes into a hung state, can a vmcore or core dump still be captured?
  • How do I enable SysRq to force a kernel panic?

Resolution

What is the "Magic" SysRq key?

When enabled, it is a special key combination that allows the user to force a system’s kernel to respond to specific commands, even when it is mostly unresponsive or appears to be frozen.

Notes:

  • The SysRq Facility is meant for debugging and troubleshooting, and is not recommended for general system monitoring.
  • Before using the SysRq Facility, please consult with your vendors, as third-party applications may be impacted.
  • When the SysRq Facility is enabled, any user with access to the physical console will gain extra abilities. Therefore, it is recommended to disable the facility when not troubleshooting a problem or to ensure that physical console access is properly secured.

How do I enable and disable the SysRq key?

  • For security reasons, Red Hat Enterprise Linux disables the SysRq key by default. To enable it, enter the following command:
    # echo 1 > /proc/sys/kernel/sysrq

  • To disable it, enter the following command:
    # echo 0 > /proc/sys/kernel/sysrq

  • To enable it permanently, set the value of kernel.sysrq in the /etc/sysctl.conf file to 1, as shown in the example:
    # grep sysrq /etc/sysctl.conf
    kernel.sysrq = 1

  • To apply the setting from /etc/sysctl.conf immediately, without waiting for a reboot, run:
    # sysctl -p
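
To confirm which state the facility is in before or after changing it, the current value can be read back from the same proc file. The following is a minimal sketch: the function name sysrq_status and its optional path argument are illustrative assumptions, not part of any Red Hat tool. It treats 0 as disabled, 1 as fully enabled, and any other value as a bitmask that enables only a subset of SysRq functions.

```shell
# Report the SysRq state by reading the sysctl proc file.
# sysrq_status is a hypothetical helper; the optional argument exists only so
# the function can be pointed at an ordinary file when experimenting.
sysrq_status() {
    status_file="${1:-/proc/sys/kernel/sysrq}"
    status_value="$(cat "$status_file")" || return 1
    case "$status_value" in
        0) echo "disabled" ;;
        1) echo "enabled (all functions)" ;;
        *) echo "partially enabled (bitmask $status_value)" ;;
    esac
}
```

Running sysrq_status with no argument reads /proc/sys/kernel/sysrq on the live system.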

How do I trigger a SysRq event?
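
One way to trigger an event, and the one used in the examples in this article, is to write the command letter to /proc/sysrq-trigger as root. The sketch below wraps that write in a small function that rejects anything other than a single lowercase letter or digit. The name sysrq_trigger and its optional second argument are illustrative assumptions, not part of any RHEL tool.

```shell
# Write one SysRq command character to the trigger file.
# sysrq_trigger is a hypothetical helper; the second argument exists only so
# the function can be exercised against an ordinary file instead of the real
# /proc/sysrq-trigger. Writing to the real file requires root.
sysrq_trigger() {
    trigger_key="$1"
    trigger_file="${2:-/proc/sysrq-trigger}"
    case "$trigger_key" in
        [[:lower:][:digit:]]) ;;   # exactly one lowercase letter or digit
        *)
            echo "sysrq_trigger: expected a single lowercase letter or digit" >&2
            return 2
            ;;
    esac
    printf '%s\n' "$trigger_key" > "$trigger_file"
}
```

For example, sysrq_trigger h requests the help message, which is equivalent to running echo h > /proc/sysrq-trigger.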

When I trigger a SysRq event that generates an output, where does it go?

When a SysRq command is triggered, the kernel prints the information to the kernel ring buffer and to the system console. The ring buffer can be viewed with the dmesg command, and the output is usually also logged to the /var/log/messages and /var/log/dmesg files.

For example:

    # echo 1 > /proc/sys/kernel/sysrq

    # echo h > /proc/sysrq-trigger
    # dmesg | tail -n1
    [  171.748959] sysrq: HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) force-fb(v) show-blocked-tasks(w) dump-ftrace-buffer(z)

    # echo p > /proc/sysrq-trigger
    # tail -n8 /var/log/messages
    Jul 31 17:38:35 rhel9 kernel: sysrq: Show Regs
    Jul 31 17:38:35 rhel9 kernel: CPU#3: active:     0000000000000000
    Jul 31 17:38:35 rhel9 kernel: CPU#3:   gen-PMC0 ctrl:  0000000000000000
    Jul 31 17:38:35 rhel9 kernel: CPU#3:   gen-PMC0 count: 0000000000000000
    Jul 31 17:38:35 rhel9 kernel: CPU#3:   gen-PMC0 left:  0000000000000000
    Jul 31 17:38:35 rhel9 kernel: CPU#3:   gen-PMC1 ctrl:  0000000000000000
    Jul 31 17:38:35 rhel9 kernel: CPU#3:   gen-PMC1 count: 0000000000000000
    Jul 31 17:38:35 rhel9 kernel: CPU#3:   gen-PMC1 left:  0000000000000000

Unfortunately, when dealing with machines that are extremely unresponsive, the syslogd service is often unable to log these events. In these situations, provisioning a serial console is often recommended for collecting the data.

  • Note: When setting up a serial console, make sure the console log level (printk) is set high enough for the SysRq output to reach the console.
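
The current console log level can be inspected through /proc/sys/kernel/printk, whose first field is the console log level; only kernel messages with a priority number lower than that value are printed to the console. The following is a rough sketch, with console_loglevel as an illustrative helper name and an optional path argument added purely for experimentation.

```shell
# Print the current console log level.
# console_loglevel is a hypothetical helper; the optional argument lets the
# function read an ordinary file instead of /proc/sys/kernel/printk.
console_loglevel() {
    printk_file="${1:-/proc/sys/kernel/printk}"
    # The file holds four whitespace-separated integers; only the first
    # (the console log level) is used here, the rest are discarded.
    read -r printk_level printk_rest < "$printk_file" || return 1
    echo "$printk_level"
}
```

The help output shown earlier lists loglevel(0-9) among the SysRq commands, so the console log level can also be changed through the SysRq interface itself.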

What sort of SysRq events can be triggered?

There are several SysRq events that can be triggered once the facility is enabled. These vary somewhat between kernel versions, but there are a few that are commonly used:

  • h - print the help message showing all available options
  • m - dump information about memory allocation
  • t - dump thread state information
  • p - dump current CPU registers and flags
  • c - intentionally crash the system (useful for forcing a disk dump or netdump)
  • s - immediately sync all mounted filesystems
  • u - immediately remount all filesystems read-only
  • b - immediately reboot the machine
  • o - immediately power off the machine (if configured and supported)
  • f - invoke the Out Of Memory (OOM) killer
  • w - dump tasks that are in the uninterruptible sleep (“blocked”) state [introduced in kernel 2.6.32]
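
Because some of the keys listed above only print information while others sync, remount, reboot, power off, kill, or crash the system, a data collection script may want to guard against sending the wrong letter. The sketch below is one possible approach under stated assumptions: sysrq_class and sysrq_collect are hypothetical helper names, the "report" and "action" labels are this example's own grouping of the list above, and the trigger path is passed explicitly so the functions can be tried against an ordinary file.

```shell
# Classify the commonly used SysRq keys from the list above.
# Prints "report" for keys that only print information, "action" for keys
# that act on the system, and returns 1 for keys not covered by the list.
sysrq_class() {
    case "$1" in
        h|m|t|p|w) echo "report" ;;
        s|u|b|o|f|c) echo "action" ;;
        *) return 1 ;;
    esac
}

# Send only report-type keys to the given trigger file, skipping the rest.
# Usage (as root): sysrq_collect /proc/sysrq-trigger m t w
sysrq_collect() {
    collect_file="$1"
    shift
    for collect_key in "$@"; do
        if [ "$(sysrq_class "$collect_key")" != "report" ]; then
            echo "sysrq_collect: skipping '$collect_key'" >&2
            continue
        fi
        printf '%s\n' "$collect_key" > "$collect_file"
    done
}
```

After running sysrq_collect against the real trigger file, the output can be read with dmesg or from the serial console, as described earlier.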
