Configure disk access retries when booting a VM on IBM POWER
Environment
- RHEL 8.7 and later
- RHEL 9.1 and later
- lpar VM on IBM POWER
Issue
The GRUB boot loader retries accessing a disk 20 times when disk access fails at boot. This causes problems if you perform a Live Partition Mobility (LPM) migration on a logical partition (lpar) virtual machine (VM) that connects to slow Storage Area Network (SAN) disks. As a consequence, the VM might take a very long time to boot while GRUB works through all 20 retries.
Resolution
You can now configure how many times the GRUB boot loader retries accessing a remote disk when an lpar VM boots on the IBM POWER architecture. Lowering the number of retries can prevent the slow boot.
Use the ofdisk_retries GRUB option:
- Set the number of disk access retries:
  # grub2-editenv /boot/grub2/grubenv set ofdisk_retries=NUMBER
  Replace NUMBER with the maximum number of retries. Red Hat recommends that you decrease the retries to 2 if you encounter the described problem.
- Reboot the VM.
As a result, the lpar boot is no longer slow after LPM on POWER, and the lpar system boots without the failed disks.
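The procedure above could be wrapped in a small shell helper, sketched here under some assumptions. The function name set_ofdisk_retries and the DRY_RUN and GRUBENV variables are hypothetical and not part of GRUB or RHEL. Only the grub2-editenv set command comes from this article; the list subcommand is used to show the stored value. The check that the value is a non-negative integer is also an assumption, because the article does not state the accepted range.

```shell
#!/bin/sh
# Hypothetical helper: the function and variable names are illustrative only.
# Path of the GRUB environment block, as used in this article.
GRUBENV="${GRUBENV:-/boot/grub2/grubenv}"

# Set ofdisk_retries to the given value.
# By default (DRY_RUN=1) only print the command instead of running it.
set_ofdisk_retries() {
    retries=$1

    # Assumption: accept only non-negative integers.
    case "$retries" in
        ''|*[!0-9]*)
            echo "error: retries must be a non-negative integer" >&2
            return 1
            ;;
    esac

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "grub2-editenv $GRUBENV set ofdisk_retries=$retries"
    else
        # Requires root privileges. Store the value, then show what was stored.
        grub2-editenv "$GRUBENV" set "ofdisk_retries=$retries" &&
            grub2-editenv "$GRUBENV" list | grep '^ofdisk_retries='
    fi
}
```

For example, running set_ofdisk_retries 2 prints the command for the value that Red Hat recommends. Running it as root with DRY_RUN=0 applies the setting, after which the VM still needs a reboot.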
Root Cause
Red Hat introduced the Retry on Fail GRUB feature to work around unreliable disk access at boot. However, the number of retries was hard-coded to 20.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.