RHEL7: "leapp upgrade" fails in reboot phase with message "BLS file /boot/loader/entries/XXX.conf already exists" then system reboots automatically

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux (RHEL) 7
    • leapp

Issue

  • After leapp upgrade completes successfully and the system is rebooted to start the upgrade, the system prints the following messages on the console and then reboots automatically:

    upgrade[XXX]: ============================================================
    upgrade[XXX]:                            ERRORS
    upgrade[XXX]: ============================================================
    upgrade[XXX]: <DATE AND TIME> [ERROR] Actor: zipl_convert_to_blscfg
    upgrade[XXX]: Message: zipl-switch-to-blscfg execution failed with non zero exit code.
    upgrade[XXX]: Summary:
    upgrade[XXX]:     Details: Command ['systemd-nspawn', '--register=no', '--quiet', '-D', u'/var/lib/leapp/el8userspace', '--bind=/etc/hosts:/etc/hosts', '--setenv=LEAPP_NO_RHSM=0', '--setenv=LEAPP_EXPERIMENTAL=0', '--setenv=LEAPP_COMMON_TOOLS=:/etc/leapp/repos.d/system_upgrade/el7toel8/tools', '--setenv=LEAPP_COMMON_FILES=:/etc/leapp/repos.d/system_upgrade/el7toel8/files', '--setenv=LEAPP_ENABLE_REPOS=rhel-8-for-s390x-baseos-rpms,rhel-8-for-s390x-appstream-rpms', '--setenv=LEAPP_UNSUPPORTED=0', '--setenv=LEAPP_EXECUTION_ID=...', '--setenv=LEAPP_HOSTNAME=...', '/usr/sbin/zipl-switch-to-blscfg'] failed with exit code 1.
    upgrade[XXX]:     Stderr: Host and machine ids are equal (<MACHINE_ID>): refusing to link journals
    upgrade[XXX]:             BLS file /boot/loader/entries/<MACHINE_ID>-XXX.conf already exists
    upgrade[XXX]:     Stdout:
    upgrade[XXX]: ============================================================
    upgrade[XXX]:                        END OF ERRORS
    upgrade[XXX]: ============================================================
    

Resolution

Follow the procedure in the Diagnostic Steps section to check whether the system matches one of the scenarios below.
If it does, follow the steps for the matching scenario. Otherwise, contact your Red Hat Support representative and mention this Solution.

Scenario 1 - having entries for different machine ids

  1. Collect the machine id of the system

    # cat /etc/machine-id
    7fef08a17f6a400db03b693a0ef30ba0
    
  2. If the system already failed to upgrade during the reboot phase, delete the existing /boot/loader directory

    # rm -fr /boot/loader
    
  3. Edit /etc/zipl.conf to delete the entries not matching the system's machine id

    # vim /etc/zipl.conf
    .... editor opens ....
    
    [defaultboot]
    defaultauto
    prompt=1
    timeout=5
    default=Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0
    target=/boot
    [Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0]
        image=/boot/vmlinuz-0-rescue-7fef08a17f6a400db03b693a0ef30ba0
        parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
        ramdisk=/boot/initramfs-0-rescue-7fef08a17f6a400db03b693a0ef30ba0.img
    [3.10.0-1160.25.1.el7.s390x]
        image=/boot/vmlinuz-3.10.0-1160.25.1.el7.s390x
        parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
        ramdisk=/boot/initramfs-3.10.0-1160.25.1.el7.s390x.img
    [linux-0-rescue-fbf2f10617024e97989bccd4d299ec21]
        image=/boot/vmlinuz-0-rescue-fbf2f10617024e97989bccd4d299ec21
        ramdisk=/boot/initramfs-0-rescue-fbf2f10617024e97989bccd4d299ec21.img
        parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
    

    In the example above, the entire section [linux-0-rescue-fbf2f10617024e97989bccd4d299ec21] has to be deleted since it doesn't match the machine id of the system (7fef08a17f6a400db03b693a0ef30ba0).

  4. If the system already failed to upgrade during the reboot phase, execute the upgrade command again

    This is necessary because the failed reboot phase deleted the temporary zipl entry that boots the system into the upgrade.

    # leapp upgrade ...
    
  5. Reboot the system to complete the upgrade

    # reboot
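
    The manual edit in step 3 can also be scripted. The following is a minimal sketch, not part of leapp or s390utils: the function name zipl_drop_section is hypothetical, and it assumes that section headers start in the first column, as in the examples above. It prints the file with one named section removed and never modifies the input file.

```shell
#!/bin/bash
# Hypothetical helper: print a zipl.conf-style file without one named section.
# Usage: zipl_drop_section SECTION_NAME FILE
zipl_drop_section() {
    awk -v target="[$1]" '
        # A line starting with "[" opens a new section. Skip every line
        # from the target header up to (not including) the next header.
        /^\[/ {
            hdr = $0
            sub(/[ \t]+$/, "", hdr)      # ignore trailing whitespace
            skip = (hdr == target)
        }
        !skip { print }
    ' "$2"
}

# Example (commented out on purpose; review the diff before replacing the file):
#   cp -p /etc/zipl.conf /etc/zipl.conf.bak
#   zipl_drop_section linux-0-rescue-fbf2f10617024e97989bccd4d299ec21 \
#       /etc/zipl.conf.bak > /etc/zipl.conf.new
#   diff -u /etc/zipl.conf.bak /etc/zipl.conf.new
```

    Writing to a separate file and comparing it with diff keeps a backup and makes it easy to confirm that only the intended section disappeared before the result is copied over /etc/zipl.conf.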
    

Scenario 2 - having multiple entries booting the same kernel

  1. If the system already failed to upgrade during the reboot phase, delete the existing /boot/loader directory

    # rm -fr /boot/loader
    
  2. Edit /etc/zipl.conf to delete the additional entries booting the same kernel

    # vim /etc/zipl.conf
    .... editor opens ....
    
    [defaultboot]
    defaultauto
    prompt=1
    timeout=5
    default=3.10.0-1160.36.2.el7.s390x
    target=/boot
    [3.10.0-1160.36.2.el7.s390x]
            image=/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x
            parameters="root=/dev/mapper/rootvg-root crashkernel=auto cio_ignore=all,!condev rd.lvm.lv=rootvg/root rd.dasd=0.0.0100 LANG=en_US.UTF-8"
            ramdisk=/boot/initramfs-3.10.0-1160.36.2.el7.s390x.img
    [3.10.0-1160.36.2.el7.s390x_with_debugging]
            image=/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x
            parameters="root=/dev/mapper/rootvg-root crashkernel=auto cio_ignore=all,!condev rd.lvm.lv=rootvg/root rd.dasd=0.0.0100 LANG=en_US.UTF-8 systemd.log_level=debug systemd.log_target=kmsg"
            ramdisk=/boot/initramfs-3.10.0-1160.36.2.el7.s390x.img
    

    In the example above, the entire section [3.10.0-1160.36.2.el7.s390x_with_debugging] has to be deleted since it boots the same kernel image as section [3.10.0-1160.36.2.el7.s390x].

  3. If the system already failed to upgrade during the reboot phase, execute the upgrade command again

    This is necessary because the failed reboot phase deleted the temporary zipl entry that boots the system into the upgrade.

    # leapp upgrade ...
    
  4. Reboot the system to complete the upgrade

    # reboot
    

Root Cause

  • The /usr/sbin/zipl-switch-to-blscfg utility creates BLS entries (files in /boot/loader/entries) early during the reboot phase.
  • For the rescue kernel image, a /boot/loader/entries/<MACHINE_ID>-0-rescue.conf file is created.
  • If multiple rescue kernel images are found, the utility exits with an error on the second one because /boot/loader/entries/<MACHINE_ID>-0-rescue.conf already exists.

A similar failure occurs when the same kernel is used in multiple entries, for example one entry for booting normally and another for booting with debugging options.

The issue is tracked by BZ 1983051 - zipl-switch-to-blscfg dies with "entry already exists" when having more than one "rescue" entry.
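
The collision described above can be illustrated with a short sketch. It is not the code of zipl-switch-to-blscfg: the function name bls_name_collisions is hypothetical, and the naming rule is an assumption based on the file names shown in this Solution (<MACHINE_ID>-<VERSION>.conf, with every rescue image collapsing to <MACHINE_ID>-0-rescue.conf). It prints each BLS file name that more than one entry would map to.

```shell
#!/bin/bash
# Hypothetical sketch: predict BLS file names that two or more zipl.conf
# entries would share.
# ASSUMPTION: an entry maps to <MACHINE_ID>-<VERSION>.conf, where VERSION is
# the image file name without the "vmlinuz-" prefix, and every rescue image
# maps to <MACHINE_ID>-0-rescue.conf.
# Usage: bls_name_collisions MACHINE_ID FILE
bls_name_collisions() {
    awk -v mid="$1" '
        /^[ \t]*image[ \t]*=/ {
            v = $0
            sub(/^[ \t]*image[ \t]*=[ \t]*/, "", v)   # keep only the path
            sub(/[ \t]+$/, "", v)                     # drop trailing blanks
            sub(/^.*\/vmlinuz-/, "", v)               # keep only the version
            if (v ~ /^0-rescue/) v = "0-rescue"       # assumed rescue rule
            name = mid "-" v ".conf"
            if (++seen[name] == 2) print name         # report each name once
        }
    ' "$2"
}

# Example (commented out): bls_name_collisions "$(cat /etc/machine-id)" /etc/zipl.conf
```

An empty result under these assumptions suggests that no two entries compete for the same file name. Any name printed corresponds to the "already exists" error shown in the Issue section.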

Diagnostic Steps

Scenario 1 - having entries for different machine ids

  1. Collect the machine id of the system

    # cat /etc/machine-id
    7fef08a17f6a400db03b693a0ef30ba0
    
  2. Dump the content of /etc/zipl.conf to verify that there is only one rescue entry

    # cat /etc/zipl.conf
    [defaultboot]
    defaultauto
    prompt=1
    timeout=5
    default=Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0
    target=/boot
    [Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0]
        image=/boot/vmlinuz-0-rescue-7fef08a17f6a400db03b693a0ef30ba0
        parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
        ramdisk=/boot/initramfs-0-rescue-7fef08a17f6a400db03b693a0ef30ba0.img
    [3.10.0-1160.25.1.el7.s390x]
        image=/boot/vmlinuz-3.10.0-1160.25.1.el7.s390x
        parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
        ramdisk=/boot/initramfs-3.10.0-1160.25.1.el7.s390x.img
    [linux-0-rescue-fbf2f10617024e97989bccd4d299ec21]
        image=/boot/vmlinuz-0-rescue-fbf2f10617024e97989bccd4d299ec21
        ramdisk=/boot/initramfs-0-rescue-fbf2f10617024e97989bccd4d299ec21.img
        parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
    

    In the example above, there are:

    • 2 rescue entries
      • section [Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0] with kernel image /boot/vmlinuz-0-rescue-7fef08a17f6a400db03b693a0ef30ba0
      • section [linux-0-rescue-fbf2f10617024e97989bccd4d299ec21] with kernel image /boot/vmlinuz-0-rescue-fbf2f10617024e97989bccd4d299ec21
    • one of the entries does not match the machine id of the system ([linux-0-rescue-fbf2f10617024e97989bccd4d299ec21])

If there is more than one rescue entry, the upgrade will fail during the reboot phase.
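
The check above can be sketched as a small script for systems with long zipl.conf files. This is an illustration only: the function name list_rescue_entries is hypothetical, and it assumes that rescue images are named vmlinuz-0-rescue-<MACHINE_ID>, as in the example above. It prints one line per rescue entry and marks whether the id in the image name matches the given machine id.

```shell
#!/bin/bash
# Hypothetical helper: list rescue entries in a zipl.conf-style file and flag
# those whose image name carries a different machine id.
# ASSUMPTION: rescue images are named vmlinuz-0-rescue-<MACHINE_ID>.
# Usage: list_rescue_entries MACHINE_ID FILE
list_rescue_entries() {
    awk -v mid="$1" '
        /^\[/ { section = $0 }                     # remember current section
        /^[ \t]*image[ \t]*=.*vmlinuz-0-rescue-/ {
            id = $0
            sub(/^.*vmlinuz-0-rescue-/, "", id)    # drop everything before the id
            sub(/[^0-9a-f].*$/, "", id)            # keep the leading hex digits
            # Append "" to force a string comparison of the two ids.
            status = ((id "") == (mid "")) ? "match" : "MISMATCH"
            print status, section
        }
    ' "$2"
}

# Example (commented out): list_rescue_entries "$(cat /etc/machine-id)" /etc/zipl.conf
```

More than one line of output means that there are several rescue entries. Lines marked MISMATCH point at the sections to delete in Scenario 1 of the Resolution section.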


Scenario 2 - having multiple entries booting the same kernel

  1. Execute the following grep command to check whether several entries boot the same kernel with different arguments

    # grep -oP "image=(.*)" /etc/zipl.conf | sort | uniq -c | sort -k1 -nr
          2 image=/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x
          1 image=/boot/vmlinuz-3.10.0-1160.49.1.el7.s390x
    

    In the example above, 2 entries use the same kernel image (/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x). This also causes the upgrade to fail during the reboot phase.
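
    The grep output shows how often an image is used but not which sections use it. The following sketch fills that gap. The function name sections_sharing_image is hypothetical, and it assumes that section headers start in the first column. It prints each image referenced by more than one section, followed by the names of those sections.

```shell
#!/bin/bash
# Hypothetical helper: show which zipl.conf sections reference the same image.
# Usage: sections_sharing_image FILE
sections_sharing_image() {
    awk '
        /^\[/ { section = $0 }                         # remember current section
        /^[ \t]*image[ \t]*=/ {
            img = $0
            sub(/^[ \t]*image[ \t]*=[ \t]*/, "", img)  # keep only the path
            sub(/[ \t]+$/, "", img)                    # drop trailing blanks
            count[img]++
            names[img] = names[img] " " section
        }
        END {
            # Print only images used by two or more sections.
            for (img in count)
                if (count[img] > 1) print img ":" names[img]
        }
    ' "$1"
}

# Example (commented out): sections_sharing_image /etc/zipl.conf
```

    Each output line lists the candidate sections for Scenario 2 of the Resolution section. Only one of the listed sections should remain in /etc/zipl.conf.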

