RHEL7: "leapp upgrade" fails in reboot phase with message "BLS file /boot/loader/entries/XXX.conf already exists" then system reboots automatically

Solution Verified - Updated 13 Jun 2024

Environment

Red Hat Enterprise Linux (RHEL) 7
- leapp

Issue

After leapp upgrade completes successfully and the system is rebooted to start the upgrade, the system prints the following messages on the console and then reboots again automatically:

upgrade[XXX]: ============================================================
upgrade[XXX]:                            ERRORS
upgrade[XXX]: ============================================================
upgrade[XXX]: <DATE AND TIME> [ERROR] Actor: zipl_convert_to_blscfg
upgrade[XXX]: Message: zipl-switch-to-blscfg execution failed with non zero exit code.
upgrade[XXX]: Summary:
upgrade[XXX]:     Details: Command ['systemd-nspawn', '--register=no', '--quiet', '-D', u'/var/lib/leapp/el8userspace', '--bind=/etc/hosts:/etc/hosts', '--setenv=LEAPP_NO_RHSM=0', '--setenv=LEAPP_EXPERIMENTAL=0', '--setenv=LEAPP_COMMON_TOOLS=:/etc/leapp/repos.d/system_upgrade/el7toel8/tools', '--setenv=LEAPP_COMMON_FILES=:/etc/leapp/repos.d/system_upgrade/el7toel8/files', '--setenv=LEAPP_ENABLE_REPOS=rhel-8-for-s390x-baseos-rpms,rhel-8-for-s390x-appstream-rpms', '--setenv=LEAPP_UNSUPPORTED=0', '--setenv=LEAPP_EXECUTION_ID=...', '--setenv=LEAPP_HOSTNAME=...', '/usr/sbin/zipl-switch-to-blscfg'] failed with exit code 1.
upgrade[XXX]:     Stderr: Host and machine ids are equal (<MACHINE_ID>): refusing to link journals
upgrade[XXX]:             BLS file /boot/loader/entries/<MACHINE_ID>-XXX.conf already exists
upgrade[XXX]:     Stdout:
upgrade[XXX]: ============================================================
upgrade[XXX]:                        END OF ERRORS
upgrade[XXX]: ============================================================

Resolution

First follow the procedure in the Diagnostic Steps section.
If the system matches one of the scenarios described there, apply the corresponding steps below; otherwise, contact your Red Hat Support representative and mention this Solution.

Scenario 1 - having entries for different machine ids

Collect the machine id of the system

# cat /etc/machine-id
7fef08a17f6a400db03b693a0ef30ba0

If the system already failed to upgrade during the reboot phase, delete the existing /boot/loader directory
```
# rm -fr /boot/loader
```

Edit /etc/zipl.conf to delete the entries not matching the system's machine id

# vim /etc/zipl.conf
.... editor opens ....

[defaultboot]
defaultauto
prompt=1
timeout=5
default=Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0
target=/boot
[Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0]
    image=/boot/vmlinuz-0-rescue-7fef08a17f6a400db03b693a0ef30ba0
    parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
    ramdisk=/boot/initramfs-0-rescue-7fef08a17f6a400db03b693a0ef30ba0.img
[3.10.0-1160.25.1.el7.s390x]
    image=/boot/vmlinuz-3.10.0-1160.25.1.el7.s390x
    parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
    ramdisk=/boot/initramfs-3.10.0-1160.25.1.el7.s390x.img
[linux-0-rescue-fbf2f10617024e97989bccd4d299ec21]
    image=/boot/vmlinuz-0-rescue-fbf2f10617024e97989bccd4d299ec21
    ramdisk=/boot/initramfs-0-rescue-fbf2f10617024e97989bccd4d299ec21.img
    parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"

In the example above, the entire section [linux-0-rescue-fbf2f10617024e97989bccd4d299ec21] has to be deleted since it doesn't match the machine id of the system (7fef08a17f6a400db03b693a0ef30ba0).
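Before editing the file by hand, it can help to preview the result. The following is a minimal sketch, not part of leapp: `preview_without_foreign_rescue` is a hypothetical helper that prints a zipl configuration to standard output with every section removed whose rescue image does not carry the given machine id. It assumes each section starts with a `[name]` line and that rescue images are named `/boot/vmlinuz-0-rescue-<id>`, as in the example above. It never modifies the file.

```shell
# Hypothetical helper (not shipped with leapp): print the configuration
# without the sections whose rescue image belongs to another machine id.
#   $1 = path to a zipl configuration, $2 = machine id to keep
preview_without_foreign_rescue() {
    awk -v id="$2" '
        # Print the buffered section unless it was marked for removal.
        function emit() { if (!drop) printf "%s", buf; buf = ""; drop = 0 }
        /^\[/ { emit() }                      # a new [section] header starts
        { buf = buf $0 "\n" }                 # buffer every line of the section
        # Mark the section when its rescue image has a different machine id.
        /^[ \t]*image=\/boot\/vmlinuz-0-rescue-/ && index($0, "rescue-" id) == 0 { drop = 1 }
        END { emit() }
    ' "$1"
}

# Example invocation (read-only), shown as a comment:
#   preview_without_foreign_rescue /etc/zipl.conf "$(cat /etc/machine-id)" | diff /etc/zipl.conf -
```

The diff in the example invocation should list only the lines of the section that is about to be deleted. Anything else in the diff suggests the file layout differs from what this sketch assumes, and the edit is better done manually.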

If the system failed to upgrade during the reboot phase, execute the upgrade command again. This is necessary because the reboot phase deleted the temporary zipl entry used to upgrade the system.
```
# leapp upgrade ...
```
Reboot the system to complete the upgrade
```
# reboot
```

Scenario 2 - having multiple entries booting the same kernel

If the system already failed to upgrade during the reboot phase, delete the existing /boot/loader directory
```
# rm -fr /boot/loader
```

Edit /etc/zipl.conf to delete the additional entries running the same kernel

# vim /etc/zipl.conf
.... editor opens ....

[defaultboot]
defaultauto
prompt=1
timeout=5
default=3.10.0-1160.36.2.el7.s390x
target=/boot
[3.10.0-1160.36.2.el7.s390x]
        image=/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x
        parameters="root=/dev/mapper/rootvg-root crashkernel=auto cio_ignore=all,!condev rd.lvm.lv=rootvg/root rd.dasd=0.0.0100 LANG=en_US.UTF-8"
        ramdisk=/boot/initramfs-3.10.0-1160.36.2.el7.s390x.img
[3.10.0-1160.36.2.el7.s390x_with_debugging]
        image=/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x
        parameters="root=/dev/mapper/rootvg-root crashkernel=auto cio_ignore=all,!condev rd.lvm.lv=rootvg/root rd.dasd=0.0.0100 LANG=en_US.UTF-8 systemd.log_level=debug systemd.log_target=kmsg"
        ramdisk=/boot/initramfs-3.10.0-1160.36.2.el7.s390x.img

In the example above, the entire section [3.10.0-1160.36.2.el7.s390x_with_debugging] has to be deleted.

If the system failed to upgrade during the reboot phase, execute the upgrade command again. This is necessary because the reboot phase deleted the temporary zipl entry used to upgrade the system.
```
# leapp upgrade ...
```
Reboot the system to complete the upgrade
```
# reboot
```

Root Cause

The /usr/sbin/zipl-switch-to-blscfg utility creates BLS entries (files in /boot/loader/entries) early during the reboot phase.
For the rescue kernel image, it creates a /boot/loader/entries/<MACHINE_ID>-0-rescue.conf file.
If multiple rescue kernel images are found, the utility exits with an error because /boot/loader/entries/<MACHINE_ID>-0-rescue.conf already exists when it processes the second one.

A similar failure occurs when the same kernel is used in multiple entries, for example one entry for booting normally and another for booting with debugging options.
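Both scenarios come down to two zipl entries mapping to the same BLS file name. The sketch below models this under a simplifying assumption inferred from this article, not taken from the utility's source: each image yields `/boot/loader/entries/<MACHINE_ID>-<version>.conf`, and every rescue image yields the `0-rescue` version regardless of the id embedded in its file name. `predict_bls_collisions` is a hypothetical helper name.

```shell
# Hypothetical helper: list the BLS file names that would be produced more
# than once under the simplified naming model described above.
#   $1 = path to a zipl configuration, $2 = machine id of the system
predict_bls_collisions() {
    local conf="$1"
    local machine_id="$2"
    # Take what follows "vmlinuz-" on each image= line as the version string.
    grep -oP '^\s*image=/boot/vmlinuz-\K\S+' "$conf" |
        # Assumption: all rescue images collapse to the same "0-rescue" name.
        sed -E 's/^0-rescue-.*/0-rescue/' |
        # Build the modelled BLS file name for each entry.
        sed "s|^|/boot/loader/entries/${machine_id}-|; s|\$|.conf|" |
        # Keep only the names that appear more than once.
        sort | uniq -d
}

# Run against the live system only when both files are readable.
if [ -r /etc/zipl.conf ] && [ -r /etc/machine-id ]; then
    predict_bls_collisions /etc/zipl.conf "$(cat /etc/machine-id)"
fi
```

Under this model, an empty result means no two entries share a BLS file name. Any printed path corresponds to a "BLS file ... already exists" failure of the kind shown in the Issue section.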

The issue is tracked by BZ 1983051 - zipl-switch-to-blscfg dies with "entry already exists" when having more than one "rescue" entry.

Diagnostic Steps

Scenario 1 - having entries for different machine ids

Collect the machine id of the system

# cat /etc/machine-id
7fef08a17f6a400db03b693a0ef30ba0

Display the content of /etc/zipl.conf and check whether it contains more than one rescue entry

[defaultboot]
defaultauto
prompt=1
timeout=5
default=Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0
target=/boot
[Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0]
    image=/boot/vmlinuz-0-rescue-7fef08a17f6a400db03b693a0ef30ba0
    parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
    ramdisk=/boot/initramfs-0-rescue-7fef08a17f6a400db03b693a0ef30ba0.img
[3.10.0-1160.25.1.el7.s390x]
    image=/boot/vmlinuz-3.10.0-1160.25.1.el7.s390x
    parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root LANG=en_US.UTF-8 ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"
    ramdisk=/boot/initramfs-3.10.0-1160.25.1.el7.s390x.img
[linux-0-rescue-fbf2f10617024e97989bccd4d299ec21]
    image=/boot/vmlinuz-0-rescue-fbf2f10617024e97989bccd4d299ec21
    ramdisk=/boot/initramfs-0-rescue-fbf2f10617024e97989bccd4d299ec21.img
    parameters="root=/dev/mapper/rootvg-root vmalloc=4096G user_mode=home console=ttyS0 crashkernel=auto rd.lvm.lv=rootvg/root ipv6.disable=1 transparent_hugepage=never vmhalt=LOGOFF vmpoff=LOGOFF"

In the example above, there are 2 rescue entries:
- section [Red_Hat_Enterprise_Linux_Server_7.9_Rescue_7fef08a17f6a400db03b693a0ef30ba0] with kernel image /boot/vmlinuz-0-rescue-7fef08a17f6a400db03b693a0ef30ba0
- section [linux-0-rescue-fbf2f10617024e97989bccd4d299ec21] with kernel image /boot/vmlinuz-0-rescue-fbf2f10617024e97989bccd4d299ec21

One of these entries ([linux-0-rescue-fbf2f10617024e97989bccd4d299ec21]) does not match the machine id of the system.

If there is more than one rescue entry, the issue will occur on reboot.
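The comparison above can be scripted. The following is a minimal sketch, assuming rescue images are named `/boot/vmlinuz-0-rescue-<id>` as in the example; `check_rescue_entries` is a hypothetical helper name, not a leapp command. It prints one line per rescue image and marks those whose id differs from the machine id passed in.

```shell
# Hypothetical helper: report each rescue image referenced in a zipl
# configuration and whether its embedded id matches the given machine id.
#   $1 = path to a zipl configuration, $2 = machine id of the system
check_rescue_entries() {
    local conf="$1"
    local machine_id="$2"
    # Extract the id that follows "vmlinuz-0-rescue-" on each image= line.
    grep -oP '^\s*image=/boot/vmlinuz-0-rescue-\K[0-9a-f]+' "$conf" |
        while read -r id; do
            if [ "$id" = "$machine_id" ]; then
                echo "OK $id"
            else
                echo "MISMATCH $id"
            fi
        done
}

# Run against the live system only when both files are readable.
if [ -r /etc/zipl.conf ] && [ -r /etc/machine-id ]; then
    check_rescue_entries /etc/zipl.conf "$(cat /etc/machine-id)"
fi
```

More than one output line, or any `MISMATCH` line, indicates that the system matches Scenario 1.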

Scenario 2 - having multiple entries booting the same kernel

Run the following grep command to check whether several entries boot the same kernel with different arguments
```
# grep -oP "image=(.*)" /etc/zipl.conf | sort | uniq -c | sort -k1 -nr
      2 image=/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x
      1 image=/boot/vmlinuz-3.10.0-1160.49.1.el7.s390x
```
In the example above, 2 entries use the same kernel (/boot/vmlinuz-3.10.0-1160.36.2.el7.s390x), so this system will also hit the issue on reboot.
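The grep output shows which kernel is shared but not which sections use it. As a minimal sketch, assuming every entry starts with a `[name]` line followed by an `image=` line, the hypothetical helper `list_shared_images` below prints each shared image together with the names of the sections that reference it.

```shell
# Hypothetical helper: for every kernel image referenced by more than one
# section of a zipl configuration, print the image and the section names.
#   $1 = path to a zipl configuration
list_shared_images() {
    awk '
        /^\[/ { section = $0; next }          # remember the current [section] header
        /^[ \t]*image=/ {
            sub(/^[ \t]*image=/, "")          # keep only the image path
            count[$0]++
            sections[$0] = sections[$0] " " section
        }
        END {
            for (img in count)
                if (count[img] > 1)
                    print img ":" sections[img]
        }
    ' "$1"
}

# Run against the live system only when the file is readable.
if [ -r /etc/zipl.conf ]; then
    list_shared_images /etc/zipl.conf
fi
```

Each output line names the sections to review. Which of them to delete remains a manual decision, as described in the Resolution section.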

SBR

Anaconda

Product(s)

Red Hat Enterprise Linux for IBM Z and LinuxONE

Components

leapp

Category

Troubleshoot
