How to properly size and position the crashkernel in RHEL 5?

Environment

  • Red Hat Enterprise Linux 5
  • kdump from kexec-tools to capture vmcore
  • System with large amount of memory

Issue

  • How should the crashkernel= setting be calculated for a system with a large amount of RAM?
  • service kdump start fails with crashkernel reservation failed
  • Booting Red Hat Enterprise Linux 5 with the crashkernel=X@Y parameter enabled for the kdump kernel does not always succeed.

Resolution

Boot the system without any crashkernel setting and examine the System RAM entries in /proc/iomem.

You may use the following command to identify potential areas where the crashkernel reservation can be placed:

# awk '/^ .*|System RAM/{sub(/-/, " ");sub(/ : /, " ");printf("%d bytes to %d bytes (%d M to %d M) - %s %s\n", strtonum("0x"$1), strtonum("0x"$2), strtonum("0x"$1)/1024/1024, strtonum("0x"$2)/1024/1024, $3, $4)}' /proc/iomem

Output will be different for every system. Follow the process described in the Diagnostic Steps section for your system to find the appropriate crashkernel allocation.

Use the parameter crashkernel=size@offset to ensure the crashkernel allocation does not overlap any other memory areas.

Once the setting has been calculated, reboot the system to verify it, and confirm that a vmcore can actually be captured before returning the system to service.

Related articles:

  • How should the crashkernel parameter be configured for using kdump on RHEL6?
  • How should the crashkernel parameter be configured for using kdump on RHEL7?
  • How to use the SysRq facility to collect information from a server which has hung
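The post-reboot check described above can be sketched as a small shell function. This is a minimal sketch, not a supported tool: the function name crashkernel_status is hypothetical, and it assumes that a successful reservation appears in /proc/iomem as an entry labelled "Crash kernel". It does not replace testing an actual vmcore capture.

```shell
# Hypothetical helper: report whether the crashkernel= parameter from the
# kernel command line resulted in a reservation.
# Assumption: a successful reservation is listed in /proc/iomem as "Crash kernel".
# Both file arguments are optional and exist only to make the function testable.
crashkernel_status() {
    cmdline_file=${1:-/proc/cmdline}
    iomem_file=${2:-/proc/iomem}

    # Split the command line into one word per line and pick out crashkernel=.
    param=$(tr ' ' '\n' < "$cmdline_file" | grep '^crashkernel=' | tail -n 1)
    if [ -z "$param" ]; then
        echo "no crashkernel= parameter on the kernel command line"
        return 1
    fi

    if grep -q 'Crash kernel' "$iomem_file"; then
        echo "$param is reserved"
    else
        echo "$param was requested but no reservation is listed"
        return 1
    fi
}
```

Run as root with no arguments (crashkernel_status) to check the live system. /proc/iomem may show zeroed addresses to unprivileged users, but the labels are still listed.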

Root Cause

The crashkernel allocation fails because the kernel is not able to find a suitable memory range for the crashkernel.

This may be due to usage of memory by other memory allocations, such as the kernel itself or other PCI-memory-mapped devices, or due to areas which are reserved by the system BIOS.

The crashkernel parameter is formatted as crashkernel=size@offset. The size parameter defines the amount of memory to be reserved for the kexec kernel. The offset parameter defines the start point of the kexec memory allocation, and cannot overlap any other memory allocations.

For example, crashkernel=512M@16M reserves a block of 512 megabytes for the kexec kernel, starting 16 megabytes from the beginning of physical memory. The reservation in this example therefore spans 16 megabytes to 528 megabytes.
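The arithmetic in this example can be reproduced with a short shell function. This is an illustrative sketch only: the name crash_range is hypothetical, and both arguments are assumed to be whole megabytes. The hexadecimal values are printed so that the range can be compared directly with /proc/iomem.

```shell
# Hypothetical helper: print the byte range covered by
# crashkernel=<size>M@<offset>M. Arguments are whole megabytes.
crash_range() (
    size_mb=$1
    offset_mb=$2
    start=$((offset_mb * 1024 * 1024))
    # The last byte of the reservation is one below offset + size.
    end=$(( (offset_mb + size_mb) * 1024 * 1024 - 1 ))
    printf '%dM@%dM: bytes %d-%d (hex %08x-%08x), ends at %dM\n' \
        "$size_mb" "$offset_mb" "$start" "$end" "$start" "$end" \
        $((offset_mb + size_mb))
)

crash_range 512 16
# prints: 512M@16M: bytes 16777216-553648127 (hex 01000000-20ffffff), ends at 528M
```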

Diagnostic Steps

Review memory allocations

Inspect /proc/iomem before configuring the crashkernel setting:

# cat /proc/iomem
00010000-0009d3ff : System RAM
...
00100000-bdda9fff : System RAM
  00200000-0048e067 : Kernel code
  0048e068-00630167 : Kernel data
...
100000000-1003fffefff : System RAM

The crashkernel needs to be loaded into one of these "System RAM" areas.

Converting the numbers to decimal makes it easier to see which can be used:

# egrep "^ .*|System RAM" /proc/iomem | sed -e 's/ : / /g' -e 's/-/ /g' | awk '{printf("%d bytes to %d bytes (%d M to %d M) - %s %s\n", strtonum("0x"$1), strtonum("0x"$2), strtonum("0x"$1)/1024/1024, strtonum("0x"$2)/1024/1024, $3, $4)}'

65536 bytes to 644095 bytes (0 M to 0 M) - System RAM
1048576 bytes to 3185221631 bytes (1 M to 3037 M) - System RAM
2097152 bytes to 4776039 bytes (2 M to 4 M) - Kernel code
4776040 bytes to 6488423 bytes (4 M to 6 M) - Kernel data
4294967296 bytes to 1100585365503 bytes (4096 M to 1049599 M) - System RAM
...

Based on this output, we can see potential crashkernel allocation areas.
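To narrow the list down, the top-level System RAM regions that are large enough for a given reservation can be filtered out of the /proc/iomem text. The following is a rough sketch: the function name ram_regions_at_least is hypothetical, and the sample input is the output shown earlier in this section. It only compares region lengths and does not account for nested entries such as Kernel code.

```shell
# Hypothetical helper: list top-level "System RAM" regions that are at least
# <size_mb> megabytes long. Reads /proc/iomem-style text on standard input.
ram_regions_at_least() (
    need=$(( $1 * 1024 * 1024 ))
    while IFS= read -r line; do
        case $line in
            [0-9a-f]*" : System RAM") ;;   # unindented System RAM entries only
            *) continue ;;
        esac
        range=${line%% : *}                # e.g. 00100000-bdda9fff
        start=$((0x${range%-*}))
        end=$((0x${range#*-}))
        len=$((end - start + 1))
        if [ "$len" -ge "$need" ]; then
            printf '%d M to %d M (%d M long)\n' \
                $((start / 1048576)) $((end / 1048576)) $((len / 1048576))
        fi
    done
)

# Sample data copied from the /proc/iomem output above.
ram_regions_at_least 768 <<'EOF'
00010000-0009d3ff : System RAM
00100000-bdda9fff : System RAM
  00200000-0048e067 : Kernel code
  0048e068-00630167 : Kernel data
100000000-1003fffefff : System RAM
EOF
# prints:
# 1 M to 3037 M (3036 M long)
# 4096 M to 1049599 M (1045503 M long)
```

On a live system the same function can be fed with ram_regions_at_least 768 < /proc/iomem (as root, so that the addresses are not masked).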

Total system RAM

Inspect the total amount of system memory:

# grep MemTotal /proc/meminfo 
MemTotal:     1055799440 kB

As a general guide, the size (the first part of the crashkernel parameter) should be chosen from the following table:

RAM size    crashkernel size
> 0 GB      128M
> 2 GB      256M
> 6 GB      512M
> 8 GB      768M

For this system with 1,055,799,440 kB (approximately 1 terabyte) of memory, the crashkernel reservation should be 768M.
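The sizing guide above can be expressed as a simple lookup. This is a sketch under stated assumptions: the function name crashkernel_size_for is hypothetical, 1 GB is taken as 1024 * 1024 kB, and the input is the MemTotal value in kB. Because MemTotal is somewhat lower than the installed RAM, results close to a threshold should be checked by hand.

```shell
# Hypothetical helper: map MemTotal (in kB) to the size from the guide above.
# Assumption: 1 GB = 1024 * 1024 kB.
crashkernel_size_for() (
    mem_kb=$1
    gb=$((1024 * 1024))
    if   [ "$mem_kb" -gt $((8 * gb)) ]; then echo 768M
    elif [ "$mem_kb" -gt $((6 * gb)) ]; then echo 512M
    elif [ "$mem_kb" -gt $((2 * gb)) ]; then echo 256M
    else echo 128M
    fi
)

crashkernel_size_for 1055799440
# prints: 768M
```

On a live system the argument can be taken from /proc/meminfo, for example: crashkernel_size_for "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"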

Calculate result

Based on this number, plus what we have observed from /proc/iomem, we can now safely calculate a crashkernel setting.

768M fits within the zone described by:

1048576 bytes to 3185221631 bytes (1 M to 3037 M) - System RAM

However the offset must not overlap these other areas:

2097152 bytes to 4776039 bytes (2 M to 4 M) - Kernel code
4776040 bytes to 6488423 bytes (4 M to 6 M) - Kernel data

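So on this system, the reservation must start above the end of Kernel data and must end at or below 3037M. Kernel data ends at 6488423 bytes, which is slightly past the 6M boundary (the M values above are rounded down), so the lowest usable whole-megabyte offset is 7M. With a 768M reservation, the offset can be at most 2269M, because 2269M + 768M = 3037M.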
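The range of usable offsets can be worked out from the byte values rather than from the rounded-down M values, which matters here because Kernel data ends at 6488423 bytes, slightly past 6M. The following is a minimal sketch; the function name offset_window is hypothetical, and it assumes whole-megabyte offsets and a single kernel area at the bottom of the region.

```shell
# Hypothetical helper: print the lowest and highest whole-megabyte offsets
# at which a reservation of <size_mb> fits.
#   $1 = last byte used by the kernel (end of Kernel data, or Kernel bss if listed)
#   $2 = last byte of the enclosing System RAM region
#   $3 = reservation size in megabytes
offset_window() {
    kernel_end=$1
    ram_end=$2
    size_mb=$3
    mb=1048576
    # First megabyte boundary strictly above kernel_end.
    min_mb=$(( (kernel_end + mb) / mb ))
    # Highest offset for which offset + size still ends inside the region.
    max_mb=$(( (ram_end + 1) / mb - size_mb ))
    if [ "$min_mb" -gt "$max_mb" ]; then
        echo "no room for ${size_mb}M in this region"
        return 1
    fi
    echo "offset window for ${size_mb}M: ${min_mb}M to ${max_mb}M"
}

# Values from the 1 M to 3037 M region of this system.
offset_window 6488423 3185221631 768
# prints: offset window for 768M: 7M to 2269M
```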

More examples

65536 bytes to 646143 bytes (0 M to 0 M) - System RAM
1048576 bytes to 1073729535 bytes (1 M to 1023 M) - System RAM
2097152 bytes to 4790255 bytes (2 M to 4 M) - Kernel code
4790256 bytes to 6510055 bytes (4 M to 6 M) - Kernel data

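In the above example, the crashkernel reservation must start above the end of Kernel data (6510055 bytes, slightly past 6M, so an offset of 7M or higher) and must end at or below 1023M.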

4096 bytes to 645119 bytes (0 M to 0 M) - System RAM
983040 bytes to 1048575 bytes (0 M to 0 M) - System ROM
1048576 bytes to 3667521535 bytes (1 M to 3497 M) - System RAM
16777216 bytes to 22231940 bytes (16 M to 21 M) - Kernel code
22231941 bytes to 29412495 bytes (21 M to 28 M) - Kernel data
30773248 bytes to 33688227 bytes (29 M to 32 M) - Kernel bss
4294967296 bytes to 17689477119 bytes (4096 M to 16869 M) - System RAM

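In the above example, the crashkernel reservation must start above the end of Kernel bss (33688227 bytes, slightly past 32M, so an offset of 33M or higher) and must end at or below 3497M.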
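A candidate setting can also be checked directly against the byte ranges that are already in use. The sketch below uses a hypothetical function name, check_candidate, and takes the busy ranges from the last example as "start end label" lines on standard input. It shows that an offset of 32M still collides with Kernel bss, which ends at 33688227 bytes, while 33M does not. It does not check that the reservation ends inside the System RAM region.

```shell
# Hypothetical helper: test crashkernel=<size>M@<offset>M against busy ranges.
# Standard input: one "start_byte end_byte label" line per busy range.
check_candidate() {
    size_mb=$1
    offset_mb=$2
    c_start=$((offset_mb * 1048576))
    c_end=$(( (offset_mb + size_mb) * 1048576 - 1 ))
    status=0
    while read -r b_start b_end label; do
        # Two inclusive ranges overlap when each starts before the other ends.
        if [ "$c_start" -le "$b_end" ] && [ "$b_start" -le "$c_end" ]; then
            echo "overlaps $label"
            status=1
        fi
    done
    if [ "$status" -eq 0 ]; then
        echo "no overlap"
    fi
    return "$status"
}

# Busy ranges copied from the last example above.
busy='16777216 22231940 Kernel code
22231941 29412495 Kernel data
30773248 33688227 Kernel bss'

printf '%s\n' "$busy" | check_candidate 768 33
# prints: no overlap
printf '%s\n' "$busy" | check_candidate 768 32 || true
# prints: overlaps Kernel bss
```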
