Storage device names, such as /dev/sdX, are inconsistent between boots

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux (RHEL)

Issue

  • The /dev/sdX name associated with a SAN device changes if a path fails and is then restored.
  • The system had assigned the /dev/sdn device name to a SAN LUN, but after a reboot the name changed to /dev/sdk. Is this expected behavior?
  • I am unable to rename the /dev/dm-XX device on the server.
  • The /dev/sdX names associated with virtual disks change their assigned order across repeated boot cycles.
  • The /dev/sdX name associated with a storage disk changed across reboots.
  • The device names associated with a storage disk are non-persistent across boots. The same name references different physical disks in different boot sessions.
  • On a system with multiple types of HBAs, it is possible for the device file name /dev/sdN to change across reboots. This seems to happen more often when SAN storage is mixed with other types of storage.
  • Using /dev/sdN names is leading to data loss (running mkfs on the wrong drive), information leaks (attaching the wrong drive to virtual machines), and other critical problems.

Resolution

NOTE   While the following focuses on /dev/sdX names, the same assignment of names in "as discovered" order applies to any SCSI device, such as tapes and changers, and to other block storage devices such as /dev/nvme* devices. See, for instance, "How to make block device NVMe assigned names persistent?", "Setting persistent device names for tape devices", "Do virtio disks support persistent names", and other solutions covering persistent names for other types and classes of devices. See Root Cause for additional information.

DM-Multipath

LVM Logical Volumes

  • Logical volume names are persistent because they are coupled with the LVM metadata found on disk, regardless of the sdX name assigned. Use the LV name, for example in /etc/fstab:

      /dev/mapper/vgname-lvname   /                       xfs     defaults        0 0
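
To check whether an existing configuration still relies on non-persistent names, a minimal sketch along the following lines may help. The function name find_unstable_fstab_entries is hypothetical (it is not part of RHEL), and the sketch only recognizes /dev/sdX-style names in the first field of an fstab-format file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a RHEL tool): list fstab entries whose device
# field is a non-persistent /dev/sdX-style name.
# Argument 1: path to an fstab-format file (defaults to /etc/fstab).
find_unstable_fstab_entries() {
    local fstab="${1:-/etc/fstab}"
    # Field 1 is the device, field 2 the mount point. Comment lines never
    # match because their first field starts with '#', not '/dev/sd'.
    awk '$1 ~ /^\/dev\/sd[a-z]+[0-9]*$/ { print $1, $2 }' "$fstab"
}
```

Running find_unstable_fstab_entries with no argument inspects /etc/fstab. Any line it prints is a candidate for replacement with an LV name or another persistent identifier.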
    

Kernel Supplied Persistent Names

  • Red Hat Enterprise Linux automatically creates persistent disk names within /dev/disk/by-id/ that are symbolic links to the appropriate storage device. These symbolic links persist across system reboots, SCSI bus rescans, and path failovers. The persistent names are derived from either device self-identifiers (a World Wide Identifier or serial number, for example) or on-disk data (filesystem UUIDs, for example).
  • The kernel provided /dev/disk/by-id symlinks are persistent when based upon self-identifiers (WWID, serial number, etc.). Where that information is not available, a semi-persistent symlink can be created based upon the hardware path to the device. Such identifiers are persistent across boots unless the physical configuration changes, for example when a storage controller is moved to a different PCI slot. Most storage, however, provides a usable WWID, serial number, or other self-identifier that is permanently and uniquely bound to the disk.
  • If there is a requirement to use an individual sub-path of a SAN device, use its /dev/disk/by-id/ symbolic link, as it is persistent.
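
To see which persistent symlinks currently point at a given device node, a sketch such as the following can be used. The function name persistent_names_for is hypothetical, and the sketch assumes GNU readlink with the -f option, as shipped in coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a RHEL tool): print every symlink in a directory
# (default /dev/disk/by-id) that resolves to the given device node.
# Argument 1: device node, e.g. /dev/sdb
# Argument 2: directory of symlinks (optional)
persistent_names_for() {
    local dev="$1" dir="${2:-/dev/disk/by-id}" target link
    # Canonicalize the device path so it can be compared with link targets.
    target=$(readlink -f -- "$dev") || return 1
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        if [ "$(readlink -f -- "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

For example, persistent_names_for /dev/sdb lists the by-id names that refer to whatever disk is called sdb on the current boot. One of those names, rather than /dev/sdb itself, is what belongs in configuration files.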

User Created Custom Names

Use scsi_mod.scan=sync

Use sd_mod.probe="sync"

  • See "RHEL9.3 :: sd_mod.probe=sync" for additional details
    • Only available in RHEL 9.3 and later RHEL 9 kernels. The feature was not accepted upstream, so it will be dropped in RHEL 10 and later.

    • $ cat /sys/module/sd_mod/parameters/probe

    • $ modinfo sd_mod | grep sync

        parm: probe:async or sync. Setting to 'sync' disables asynchronous device number assignments (sda, sdb, ...). (string)
      
    • Unlike the methods above, this method cannot guarantee that device names stay consistent across boots, but it can help.

    • Starting in RHEL 9.3, you can add sd_mod.probe="sync" to the kernel boot line to change LUN discovery across a SCSI host bus from parallel (async mode, the default) to serial (sync mode). This can make device names in small, simple storage configurations more consistent across boots, at the expense of a slower boot time.

    • This feature was added in the RHEL 9.3 kernel, kernel-5.14.0-362.8.1.el9_3 and later, via RHSA-2023:6583 - Security Advisory.

    • This feature will not help in larger storage configurations, such as SAN, where multiple SCSI host buses are present. Each bus is still discovered independently of the other buses, as was the case in RHEL 8, 7, 6, etc.
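
To confirm whether the option is present on a kernel command line, a minimal sketch follows. The function name sd_probe_is_sync is hypothetical. The grubby invocation in the comment is one common way to add a kernel argument on RHEL and is shown as an assumption, not as the procedure from the referenced solution.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a RHEL tool): succeed if sd_mod.probe=sync is
# present on a kernel command line.
# Argument 1: file holding the command line (defaults to /proc/cmdline).
#
# Assumption: on RHEL 9.3 or later the argument could be added with, e.g.
#   grubby --update-kernel=ALL --args="sd_mod.probe=sync"
# followed by a reboot.
sd_probe_is_sync() {
    local cmdline_file="${1:-/proc/cmdline}" arg
    local -a args=()
    # Split the single command line into whitespace-separated words.
    read -r -a args < "$cmdline_file" || true
    for arg in "${args[@]}"; do
        case "$arg" in
            sd_mod.probe=sync|'sd_mod.probe="sync"') return 0 ;;
        esac
    done
    return 1
}
```

A typical check is: sd_probe_is_sync && echo "serial LUN discovery requested". The function only inspects the command line; it does not verify that the running kernel honors the parameter.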

Additional References

Root Cause

  • Changing device names across boot is neither a bug nor a regression.

    • Changing of non-persistent names, such as sda, is neither an error nor regression but is based on how the Linux kernel discovers disks.
    • Customers often note that non-persistent names have been consistent across boots on most of their nodes but not on a few others, or were consistent up until a kernel update. Note that the number and type of SCSI hosts and LUNs on each bus can and will affect discovery timing. If names appear to have been consistent, that is basically luck or the result of a small storage configuration.
    • Also note that each major kernel version attempts to speed up boot time, which means speeding up device discovery. RHEL 8 made major changes to discover SCSI hosts in parallel, and RHEL 9 added parallel SCSI LUN discovery. Each such change can make the non-persistent device names even less consistent.

    By default, Linux detects and discovers devices in parallel and asynchronously from each other. Disk discovery is typically initiated in as-found (ordinal) order, but drive letters are assigned in first-response order. So in some cases the names will be the same as on the last boot (LUN number/adapter order), while at other times the assigned names will differ purely because of the parallel timing of discovery. This is not a flaw; it is the way Linux is designed. Access by device name is not designed to be permanently coupled to any physical disk location or ordering. Instead, disks should be accessed by a self-identifier of either the data container (the disk) or the data itself (PV, LV, filesystem, etc.). Disks self-identify by World Wide IDentifier (WWID) or similar, and partitions, PVs, LVs, and filesystems typically by some type of UUID (for example, a GUID for a GPT partition table entry).
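
To observe this behavior on a given system, one option is to record which name each self-identifier received on two different boots and compare the results. The sketch below assumes snapshot files containing one "identifier name" pair per line, for example as produced by lsblk -dno WWN,NAME. The function name renamed_devices and the snapshot file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a RHEL tool): compare two snapshots of
# "identifier name" pairs and report identifiers whose name changed.
# Argument 1: snapshot from the earlier boot
# Argument 2: snapshot from the later boot
# Lines with fewer than two fields (e.g. devices that report no WWN) are skipped.
renamed_devices() {
    local before="$1" after="$2"
    awk 'NF < 2 { next }
         FILENAME == ARGV[1] { old[$1] = $2; next }
         ($1 in old) && old[$1] != $2 { print $1 ": " old[$1] " -> " $2 }' \
        "$before" "$after"
}
```

For example, after saving lsblk -dno WWN,NAME to boot1.txt and, following a reboot, to boot2.txt, running renamed_devices boot1.txt boot2.txt prints one line per disk whose sdX name moved.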

  • The operating system issues I/O to a storage device by referencing the path that is used to reach it. For SCSI devices, the path consists of the following:

    • PCI identifier of the host bus adapter (HBA)
    • channel number on that HBA
    • the remote SCSI target address
    • the Logical Unit Number (LUN)

    This path-based address (e.g. /dev/sdX) is not persistent. It may change any time the system is reconfigured, whether by online reconfiguration, by shutting the system down, reconfiguring, and rebooting it, or when a path fails and recovers. It is even possible for the path identifiers to change when no physical reconfiguration has been done, as a result of timing variations during the discovery process when the system boots or when a bus is rescanned. Device names are assigned in as-discovered order, which can change based upon storage timing and the parallel discovery process employed within the kernel.
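
    Linux commonly displays these path components as a host:channel:target:lun tuple, where the host number stands in for the HBA; lsscsi, for instance, prints addresses such as [2:0:1:3]. The following sketch splits such a tuple into its parts. The function name parse_hctl is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a RHEL tool): split a SCSI address of the form
# host:channel:target:lun, with or without surrounding brackets.
parse_hctl() {
    local hctl="${1#\[}"      # drop a leading '[' if present
    hctl="${hctl%\]}"         # drop a trailing ']' if present
    local host channel target lun extra
    IFS=: read -r host channel target lun extra <<< "$hctl"
    # Exactly four components are required.
    [ -n "$lun" ] && [ -z "$extra" ] || return 1
    printf 'host=%s channel=%s target=%s lun=%s\n' "$host" "$channel" "$target" "$lun"
}
```

For example, parse_hctl '[2:0:1:3]' prints host=2 channel=0 target=1 lun=3. The same disk can be reached through a different tuple after a reconfiguration, which is why the tuple is not a stable identifier.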

  • The operating system provides several non-persistent names to represent these access paths to storage devices. One is the /dev/sdX name; another is the major:minor number. A third is a symlink maintained in the /dev/disk/by-path/ directory. This symlink maps from the path identifier to the current /dev/sdX name.

  • It is generally not appropriate for applications or LVM volumes to use these path-based references (viz. /dev/sdX, major:minor number, and the /dev/disk/by-path/ symlink). This is because the storage device these paths reference may change, potentially causing data to be written to the wrong device. Path-based names are also not appropriate for SAN devices, because the path-based names may be mistaken for separate storage devices, leading to uncoordinated access and unintended modifications of the data. In addition, path-based names are system-specific.

In RHEL 9, additional asynchronous LUN discovery threads were added to further speed up device discovery at boot time, at the cost of making device name assignment even less consistent.

  • For detailed information, refer to the following section of the RHEL 6 Storage Administration Guide:
   [25.3. Persistent Naming](https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/Storage_Administration_Guide/index.html#persistent_naming)
