How to trigger SysRq/NMI through a DELL DRAC in order to troubleshoot a hung system?
Environment
- Red Hat Enterprise Linux 5
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux 8
- Red Hat Enterprise Linux 9
Issue
- Interacting with a Dell system through SysRq/NMI signals sent via DRAC is required to perform several troubleshooting tasks on a hung system.
Resolution
Trigger an NMI from the iDRAC remote console
- Make sure that the system is configured to panic when it receives an NMI: How can I configure my system to crash when NMI switch is pushed?
- Send an NMI from the iDRAC web interface, using the NMI button, as described in the Dell user's guide for your controller:
  iDRAC 7/8: iDRAC 8/7 v2.30.30.30 User’s Guide
  iDRAC 9: Integrated Dell Remote Access Controller 9 User's Guide
  Remember that this will immediately panic the system if the NMI tunables are set.
The above hyperlinks are not managed by Red Hat.
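Before relying on the NMI button, it can help to confirm the NMI tunables on the affected server while it is still responsive. The following is a minimal sketch, not the linked article's procedure: it assumes the standard kernel sysctl files `unknown_nmi_panic`, `panic_on_unrecovered_nmi` and `panic_on_io_nmi` under `/proc/sys/kernel`, and `PROC_SYS_KERNEL` is a hypothetical override added only so the function can be tried against a test directory. Which of these tunables need to be set is covered by the linked article.

```shell
#!/bin/sh
# Sketch: print the current value of the NMI-related panic tunables.
# PROC_SYS_KERNEL is a hypothetical override for testing; leave it unset
# on a real system so that /proc/sys/kernel is read.

check_nmi_tunables() {
    dir="${PROC_SYS_KERNEL:-/proc/sys/kernel}"
    for name in unknown_nmi_panic panic_on_unrecovered_nmi panic_on_io_nmi; do
        if [ -r "$dir/$name" ]; then
            value=$(cat "$dir/$name")
            echo "kernel.$name = $value"
        else
            # Not every kernel version provides every tunable.
            echo "kernel.$name is not available"
        fi
    done
}

check_nmi_tunables
```

A value of 1 means the corresponding panic behaviour is switched on; a value of 0 means an NMI of that type will not panic the system.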
Trigger an NMI from another RHEL system using serial over LAN
- Enable Serial Over LAN in the iDRAC if it is not already enabled.
- Install the ipmitool package on a separate RHEL system that can reach the IP address of the DRAC remote management controller.
# yum -y install ipmitool
- On that separate RHEL system, execute the following ipmitool command to send the NMI signal to the affected server over the iDRAC's IPMI LAN interface (-I lanplus). This will immediately panic the system if the NMI tunables are set:
# ipmitool -H test.example.com -I lanplus -U <user> -P <password> chassis power diag
- Replace test.example.com with the iDRAC hostname or IP address, and <user> and <password> with the iDRAC credentials.
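Passing the password with -P leaves it visible in the shell history and the process list. As a hedged alternative, ipmitool's -E option reads the password from the IPMI_PASSWORD environment variable instead. The sketch below wraps the same `chassis power diag` command; the function name `send_nmi` and the `DRY_RUN` switch are hypothetical names introduced for this example, not part of ipmitool.

```shell
#!/bin/sh
# Sketch: send a diagnostic interrupt (NMI) through the iDRAC without
# putting the password on the command line.
# send_nmi and DRY_RUN are hypothetical names chosen for this example.

send_nmi() {
    host="$1"
    user="$2"
    # -E tells ipmitool to take the password from $IPMI_PASSWORD.
    set -- ipmitool -H "$host" -I lanplus -U "$user" -E chassis power diag
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be executed.
        echo "$*"
    else
        "$@"
    fi
}

# Dry run: prints the command instead of contacting the iDRAC.
DRY_RUN=1
send_nmi test.example.com '<user>'
# prints: ipmitool -H test.example.com -I lanplus -U <user> -E chassis power diag
```

To actually send the NMI, export IPMI_PASSWORD in the current shell, set DRY_RUN=0 and call send_nmi with the real iDRAC address and user. As with the plain command, this panics the affected system if the NMI tunables are set.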
Trigger a SysRq from another RHEL system using serial over LAN
- Enable Serial Over LAN in the iDRAC if it is not already enabled.
- Install the ipmitool package on a separate RHEL system as shown in the above section.
- From this separate RHEL system, execute the following ipmitool command to open a serial over LAN session to the affected server through its management controller:
# ipmitool -H test.example.com -I lanplus -U <user> -P <password> sol activate
- Press the following keys in quick succession to send a serial break:
[shift]+[~] then [shift]+[b]
- The following output should be visible:
~B [send break]
- Press the 'm' key to print the current memory information to the console. This is a harmless test and does not crash the system:
SysRq : Show Memory
Node 0 DMA per-cpu:
cpu 0 hot: high 0, batch 1 used:0
cpu 0 cold: high 0, batch 1 used:0
cpu 1 hot: high 0, batch 1 used:0
cpu 1 cold: high 0, batch 1 used:0
cpu 2 hot: high 0, batch 1 used:0
cpu 2 cold: high 0, batch 1 used:0
cpu 3 hot: high 0, batch 1 used:0
cpu 3 cold: high 0, batch 1 used:0
cpu 4 hot: high 0, batch 1 used:0
cpu 4 cold: high 0, batch 1 used:0
...
- This output confirms that ipmitool is successfully delivering SysRq signals through the DRAC remote management system of the affected server. To crash the server, first repeat the keystrokes that send the break:
[shift]+[~] then [shift]+[b]
- CAUTION: The following step causes a kernel panic on the affected system, which should be the intention. Press 'c' to send a "force crash" SysRq signal.
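If the break is sent but no SysRq output appears, one possible cause is that the kernel.sysrq setting on the affected server does not permit the requested function. The following sketch interprets a kernel.sysrq value. It assumes the bitmask meaning given in the kernel's SysRq documentation, where 0 disables SysRq, 1 enables every function, and the bit with value 8 covers the debugging dump functions that, to my understanding, include 'm' and 'c'. The function name `sysrq_allows_dump` is hypothetical, and the check has to be made while the server is still responsive.

```shell
#!/bin/sh
# Sketch: interpret a kernel.sysrq value for the 'm' and 'c' functions.
# Assumption: bit value 8 enables the debugging dump functions, as
# described in the kernel's SysRq documentation.

sysrq_allows_dump() {
    value="$1"
    case "$value" in
        ''|*[!0-9]*) echo "unknown value"; return 2 ;;
    esac
    if [ "$value" -eq 1 ]; then
        echo "all functions enabled"
    elif [ "$value" -eq 0 ]; then
        echo "disabled"
        return 1
    elif [ $(( value & 8 )) -ne 0 ]; then
        echo "dump functions enabled"
    else
        echo "dump functions not enabled"
        return 1
    fi
}

# Report the setting of the system this script runs on.
current=$(cat /proc/sys/kernel/sysrq 2>/dev/null)
echo "kernel.sysrq=${current:-unreadable}: $(sysrq_allows_dump "$current")"
```

How to change the setting is described in the SysRq article listed under Pre-requisites.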
Pre-requisites
The Dell DRAC has to be configured correctly. See also:
[How do I set up a serial terminal and/or console in Red Hat Enterprise Linux?](https://access.redhat.com/site/node/7212)
[How do I troubleshoot kernel crashes, hangs, or reboots with kdump on Red Hat Enterprise Linux?](https://access.redhat.com/site/solutions/6038)
[How can I use the SysRq facility to collect information from a server which has hung?](https://access.redhat.com/site/solutions/2023)
[How can I configure my system to crash when NMI switch is pushed?](https://access.redhat.com/solutions/125103)
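As a quick spot check of the serial console prerequisite, the console= parameters of the running kernel can be listed on the affected server. This is only a sketch: the function name `list_kernel_consoles` is hypothetical, and which serial device the DRAC redirects depends on the system's BIOS and DRAC settings, so the output has to be compared against that configuration.

```shell
#!/bin/sh
# Sketch: list the console= parameters from a kernel command line.
# With no argument, the running kernel's /proc/cmdline is read.

list_kernel_consoles() {
    # Put every boot parameter on its own line, then keep only console= values.
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^console=//p'
}

list_kernel_consoles
```

If no serial device (ttyS*) is listed, the serial console article above describes how to add one.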
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.