Implications of KASLR on vmcore analysis with 'crash'


Introduction

Starting with RHEL 7.5, RHEL kernels ship with KASLR (Kernel Address Space Layout Randomization) enabled by default. KASLR is a security feature that lets the kernel relocate itself to a random location on each boot, which makes it significantly harder to write exploits that depend on kernel code or data being at known addresses.

As a side effect, debugging tools like crash may encounter some trouble trying to open vmcores from KASLR-enabled kernels.

Understanding KASLR impact on vmcore analysis


Prior to the introduction of KASLR, the kernel would usually be located at well-known physical and virtual addresses. Thanks to this, vmcore analysis tools like `crash` were able, if needed, to locate specific data, such as the `linux_banner`, at specific offsets. Additionally, symbol tables found in the kernel's debugging information packages, which link symbols with virtual addresses, could be used directly to look for the actual data structures contained in a vmcore.

On KASLR-enabled kernels, however, both the physical location of the kernel in the computer's RAM and the virtual address base offset change between boots. This means that data like the `linux_banner` is no longer located at a well-known offset, and the objects referenced by symbols in the kernel's debugging information won't be found at the expected virtual addresses.

The introduction of vmcoreinfo


To overcome the difficulties in core dump analysis introduced by KASLR, a specialized data section was added to the vmcores, named `vmcoreinfo`. This section is written by the `crashkernel` while collecting the dump, and contains, among other information, the kernel's physical base (the physical offset where the kernel was relocated on boot) and the KASLR offset (the difference between the original base virtual address and the base virtual address after the relocation).

This information allows utilities like crash to calculate the offsets needed to find both data located at well-known physical addresses and symbols from the kernel's debugging information.
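The symbol relocation described above amounts to adding the KASLR offset to the address recorded in the debugging information. Below is a minimal sketch of that arithmetic. It assumes bash, whose 64-bit arithmetic wraps around, so kernel addresses with the top bit set still add correctly. `relocate_symbol` is a hypothetical helper, and the link-time address and offset are illustrative values only:

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply a KASLR offset to a link-time symbol address.
# Both arguments are hexadecimal strings without the 0x prefix.
relocate_symbol() {
    local static_addr=$1 kaslr_offset=$2
    # printf %x prints the result as an unsigned 64-bit hexadecimal value.
    printf '%x\n' "$(( 0x$static_addr + 0x$kaslr_offset ))"
}

# Illustrative values (assumed link-time address and KASLR offset).
relocate_symbol ffffffff81000000 16e00000
```

This is only the arithmetic; it is not how crash itself is implemented.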

Determining if crash has found the vmcoreinfo section


When `crash` finds a `vmcoreinfo` section, it prints a message that debugging symbols are being patched to accommodate the fact that the kernel has been relocated. This message looks like this, with different *XXX* and *YYY* values:
 WARNING: kernel relocated [XXXMB]: patching YYY gdb minimal_symbol values

On the other hand, crash may print a message like the following one and/or fail to locate some symbols when it opens a vmcore from a KASLR-enabled kernel that has no vmcoreinfo and either crash predates the introduction of the KASLR offset calculation (see Analyzing VM dumps without vmcoreinfo) or the dump is missing the state of the vCPU registers:

WARNING: cannot determine physical base address: defaulting to 0

Finally, the contents of the vmcoreinfo section can be dumped using the help -D command of crash:

crash> help -D
diskdump_data: 
          filename: vmcore
             flags: c6 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED|LZO_SUPPORTED|SNAPPY_SUPPORTED) 
               dfd: 3
               ofp: 7f0530779400
      machine_type: 62 (EM_X86_64)
(...)
  sub_header_kdump: 24e7ff0 
           phys_base: 3c800000
          dump_level: 31 (0x1f) (DUMP_EXCLUDE_ZERO|DUMP_EXCLUDE_CACHE|DUMP_EXCLUDE_CACHE_PRI|DUMP_EXCLUDE_USER_DATA|DUMP_EXCLUDE_FREE)
               split: 0
           start_pfn: (unused)
             end_pfn: (unused)
   offset_vmcoreinfo: 4936 (0x1348)
     size_vmcoreinfo: 1767 (0x6e7)
                      OSRELEASE=3.10.0-830.el7.x86_64
                      PAGESIZE=4096
                      SYMBOL(init_uts_ns)=ffffffff98a16280
                      SYMBOL(node_online_map)=ffffffff98b439c0
                      SYMBOL(swapper_pg_dir)=ffffffff98a0e000
                      SYMBOL(_stext)=ffffffff97e00000
                      SYMBOL(vmap_area_list)=ffffffff98a91050
                      SYMBOL(mem_section)=ffffffff98fd8580
                      LENGTH(mem_section)=4096
                      SIZE(mem_section)=32
                      OFFSET(mem_section.section_mem_map)=0
                      SIZE(page)=64
                      SIZE(pglist_data)=157056
                      SIZE(zone)=2048
                      SIZE(free_area)=104
                      SIZE(list_head)=16
                      SIZE(nodemask_t)=128
                      OFFSET(page.flags)=0
                      OFFSET(page._count)=28
                      OFFSET(page.mapping)=8
                      OFFSET(page.lru)=32
                      OFFSET(page._mapcount)=24
                      OFFSET(page.private)=48
                      OFFSET(pglist_data.node_zones)=0
                      OFFSET(pglist_data.nr_zones)=156736
                      OFFSET(pglist_data.node_start_pfn)=156744
                      OFFSET(pglist_data.node_spanned_pages)=156760
                      OFFSET(pglist_data.node_id)=156768
                      OFFSET(zone.free_area)=144
                      OFFSET(zone.vm_stat)=1496
                      OFFSET(zone.spanned_pages)=1896
                      OFFSET(free_area.free_list)=0
                      OFFSET(list_head.next)=0
                      OFFSET(list_head.prev)=8
                      OFFSET(vmap_area.va_start)=0
                      OFFSET(vmap_area.list)=48
                      LENGTH(zone.free_area)=11
                      SYMBOL(log_buf)=ffffffff98a436e0
                      SYMBOL(log_buf_len)=ffffffff98a436dc
                      SYMBOL(log_first_idx)=ffffffff98ebd8e8
                      SYMBOL(log_next_idx)=ffffffff98ebd8d8
                      SIZE(log)=16
                      OFFSET(log.ts_nsec)=0
                      OFFSET(log.len)=8
                      OFFSET(log.text_len)=10
                      OFFSET(log.dict_len)=12
                      LENGTH(free_area.free_list)=6
                      NUMBER(NR_FREE_PAGES)=0
                      NUMBER(PG_lru)=5
                      NUMBER(PG_private)=11
                      NUMBER(PG_swapcache)=16
                      NUMBER(PG_slab)=7
                      NUMBER(PG_hwpoison)=23
                      NUMBER(PG_head_mask)=16384
                      NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-128
                      SYMBOL(free_huge_page)=ffffffff97fe0510
                      NUMBER(phys_base)=1015021568
                      SYMBOL(init_level4_pgt)=ffffffff98a0e000
                      SYMBOL(node_data)=ffffffff98b3e6c0
                      LENGTH(node_data)=1024
                      KERNELOFFSET=16e00000
                      NUMBER(KERNEL_IMAGE_SIZE)=1073741824
                      CRASHTIME=1523005087
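
The KASLR-related values in this output can be cross-checked with a few lines of shell. The following is a sketch only, assuming bash and assuming the vmcoreinfo lines have been saved to a file or are fed on standard input. `summarize_vmcoreinfo` is a hypothetical helper, not part of crash:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read vmcoreinfo lines (as printed by "help -D")
# on stdin and report the values relevant to KASLR.
summarize_vmcoreinfo() {
    local line key value kerneloffset="" stext="" phys_base=""
    # "read" strips the leading indentation of each line.
    while read -r line; do
        key=${line%%=*}
        value=${line#*=}
        case $key in
            KERNELOFFSET)        kerneloffset=$value ;;
            'SYMBOL(_stext)')    stext=$value ;;
            'NUMBER(phys_base)') phys_base=$value ;;
        esac
    done
    if [ -z "$kerneloffset" ] || [ -z "$stext" ] || [ -z "$phys_base" ]; then
        echo "missing KERNELOFFSET, SYMBOL(_stext) or NUMBER(phys_base)" >&2
        return 1
    fi
    # NUMBER(phys_base) is decimal; the other two values are hexadecimal.
    printf 'phys_base:        0x%x\n' "$phys_base"
    printf 'KASLR offset:     0x%s\n' "$kerneloffset"
    # Relocated address minus the KASLR offset gives the link-time address.
    printf 'link-time _stext: 0x%x\n' "$(( 0x$stext - 0x$kerneloffset ))"
}

# Values copied from the sample output above.
summarize_vmcoreinfo <<'EOF'
  SYMBOL(_stext)=ffffffff97e00000
  NUMBER(phys_base)=1015021568
  KERNELOFFSET=16e00000
EOF
```

For the sample dump, `NUMBER(phys_base)` converts to 0x3c800000, the same value shown in the `phys_base` field of the sub-header, and subtracting `KERNELOFFSET` from the relocated `_stext` gives 0xffffffff81000000.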

KASLR and VM dumps


As described above, the `vmcoreinfo` data section was introduced to help vmcore analysis tools to locate physical objects in the dump and properly resolve debugging symbols. As this section is written by the crashkernel when a panic is triggered in the system, by default `vmcoreinfo` will not be present in vmcores extracted from VM dumps, because the `crashkernel` is not involved in this scenario.

QEMU's vmcoreinfo device


Upcoming QEMU versions will support a specialized device also named `vmcoreinfo`, which is presented to the VM. If the Guest's kernel has successfully written the section to this device, QEMU will automatically include it in the VM dump. This device is not enabled by default and must be explicitly specified in the list of arguments.

On the other hand, libvirt will also support this device in its domain definition format, but it will not enable it by default either. This implies that either the user or the management software of a layered product must explicitly add it to the domain's definition for the device to be actually present in the VM.
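
Once a QEMU build with this device is available, a quick check could look like the sketch below. It assumes the device is exposed under the name `vmcoreinfo`, as stated above, and that it is listed in the output of `-device help`. `qemu_has_vmcoreinfo` is a hypothetical helper, and the default binary name `qemu-system-x86_64` is an assumption:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a QEMU binary lists a device named
# "vmcoreinfo". The binary name is an assumption and can be overridden.
qemu_has_vmcoreinfo() {
    local qemu_bin=${1:-qemu-system-x86_64}
    # "-device help" prints the device models known to this binary.
    "$qemu_bin" -device help 2>&1 | grep -qw vmcoreinfo
}

# Usage (not executed here):
#   qemu_has_vmcoreinfo && echo "vmcoreinfo device available"
# If it is available, the device would be requested explicitly on the
# command line, for example with "-device vmcoreinfo".
```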

This section will be updated when the aforementioned versions are publicly available.

Analyzing VM dumps without vmcoreinfo


Recently, `crash` has gained the ability to calculate both the physical base and the KASLR offset from a vmcore, even if the vmcoreinfo section is missing. This is done using a technique developed by Takao Indoh (Fujitsu), which was introduced with the following commits:

Please note: These are not yet available in the current downstream version of crash, specifically crash-7.2.0-6.el7.x86_64.

For this technique to work, the VM dump must include the state of the vCPU registers at the moment of taking the dump. The following dump formats are known to work:

  • QEMU netdump/diskdump (both ELF and compressed formats)
    • Example: virsh dump --memory-only <domain_name> <core_file>
  • VMware VMSS (including snapshots with separated vmem files)

And these are known not to work:

  • vmcores extracted from QEMU core dumps
    • Example: gcore [-a] $PID
    • QEMU's core dump doesn't include the state of the vCPU registers.
    • If you want to collect both the QEMU and Guest states, consider running gcore first (without '-a', as only QEMU's internal mappings are needed) and then virsh dump --memory-only. The order matters, because virsh dump will send an IPI to the vCPUs, potentially altering their respective states.
  • VMware VMSS files converted to vmcore using vmss2core
    • vCPU Control Registers (CR) are not included in the conversion
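
The recommended order for collecting both the QEMU and Guest states can be sketched as a small wrapper. This is an illustration under assumptions: `collect_vm_state` is a hypothetical helper, the output file names are arbitrary, and the caller is expected to know the PID of the QEMU process that backs the domain:

```shell
#!/usr/bin/env bash
# Hypothetical helper: collect QEMU's own core first, then the Guest memory.
#   $1 = libvirt domain name, $2 = PID of its QEMU process, $3 = output dir
collect_vm_state() {
    local domain=$1 qemu_pid=$2 out_dir=$3
    # 1. Core of the QEMU process, without -a: only QEMU's internal
    #    mappings are wanted. gcore appends ".<pid>" to the given prefix.
    gcore -o "$out_dir/qemu-core" "$qemu_pid" || return 1
    # 2. Guest memory including vCPU registers. This runs last because it
    #    may alter the state of the vCPUs.
    virsh dump --memory-only "$domain" "$out_dir/$domain.vmcore"
}

# Usage (not executed here; the domain name and PID are placeholders):
#   collect_vm_state guest1 12345 /var/tmp
```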

Disabling KASLR on the Guest


If none of the supported VM dump formats can be used, and a panic can't be generated from inside the VM either, there's still the option of disabling KASLR from inside the Guest to fall back to the pre-7.5 behavior.

This can be done by adding the nokaslr option to the kernel's command line. The recommended way to do this is by editing /etc/sysconfig/grub and regenerating /boot/grub2/grub.cfg:

  • Original /etc/sysconfig/grub (KASLR-enabled)
# cat /etc/sysconfig/grub 
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
  • Modified /etc/sysconfig/grub (KASLR-disabled)
# cat /etc/sysconfig/grub 
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet nokaslr"
GRUB_DISABLE_RECOVERY="true"
  • Regenerating /boot/grub2/grub.cfg
# grub2-mkconfig > /boot/grub2/grub.cfg 
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-830.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-830.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-91d6af6286ab4d3dbc2fde5fe96ac63c
Found initrd image: /boot/initramfs-0-rescue-91d6af6286ab4d3dbc2fde5fe96ac63c.img
done
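
The manual edit shown above can also be scripted. The following is a sketch under assumptions: it requires GNU sed and expects `GRUB_CMDLINE_LINUX` to sit on a single line that ends with a closing double quote. `add_nokaslr` and `kaslr_disabled` are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "nokaslr" to GRUB_CMDLINE_LINUX if missing.
add_nokaslr() {
    local grub_defaults=${1:-/etc/sysconfig/grub}
    # Nothing to do if the option is already present.
    if grep -q '^GRUB_CMDLINE_LINUX=.*nokaslr' "$grub_defaults"; then
        return 0
    fi
    # --follow-symlinks edits the target file if the path is a symlink,
    # instead of replacing the link with a regular file.
    sed -i --follow-symlinks \
        's/^\(GRUB_CMDLINE_LINUX=".*\)"$/\1 nokaslr"/' "$grub_defaults"
}

# Hypothetical helper: after rebooting, check the running kernel's
# command line for the option.
kaslr_disabled() {
    grep -qw nokaslr "${1:-/proc/cmdline}"
}

# Usage (not executed here; run as root):
#   add_nokaslr && grub2-mkconfig > /boot/grub2/grub.cfg
#   (reboot) then: kaslr_disabled && echo "KASLR is disabled"
```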