Red Hat Enterprise Linux reports lunXXX has a LUN larger than allowed by the host adapter, where XXX is a very large number

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 9
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 5

Issue

  • Cannot see the LUNs presented from the SAN
  • Disk paths with LUN numbers > 255 are not found.
  • Unexpected entries with LUN numbers between 16384-32767 are found
  • Unexpected entries with LUN numbers greater than 65536, commonly 4194304 or larger are reported by the kernel.
  • Getting error message of either:
    • "scsi: host 0 channel 0 id 2 lun16643 has a LUN larger than allowed by the host adapter", or
    • "scsi: host 2 channel 0 id 0 lun4194304 has a LUN larger than allowed by the host adapter"
  • Tried setting lpfc_max_luns to 4200000 and get:
    • "scsi_report_lun_scan: Allocation failure during SCSI scanning, some SCSI devices might not be configured"
  • Why does the kernel remap storage LUN 256 to 16640 when SCSI LUN address method 01b is used by storage?
    • Will a patch/fix be developed to interpret a large LUN number (>255) in SCSI 01b LUN address format, that is, to display the LUN as 256 and not as 16640?
  • Why does the LUN (logical unit number) as reported in storage differ from the LUN address value as reported by the linux kernel?
  • Why does linux require a workaround of increasing the driver maximum LUN to a large value to support LUNs greater than 255?
  • Is there a method to display the LUN id separate from the LUN addressing method employed by storage?
  • My storage vendor has tried changing lun access method, but the problem persists.

Resolution

NOTICE   The root cause of the large LUN numbers is the LUN address format used by storage, which the SCSI specification refers to as the "LUN Access Method". The linux kernel does not decode or encode the LUN address value returned from storage; it simply displays the LUN address value as provided by storage. Only certain LUN access methods are supported by the linux kernel. The vendor will know which these are based upon their testing and qualification under linux.
  • Contact storage vendor for assistance.
    • The LUN information returned by storage is not in a format that is compatible with, or supported by, the linux kernel and its drivers.
    • If the vendor cannot configure the returned LUN values in a format that is supported by the linux kernel, then select a different vendor.
    • Linux supports the following LUN address formats:
      • SCSI 00b format (with bytes 2-7 returned as zeros from storage): Peripheral device addressing
      • SCSI 01b format (with bytes 2-7 returned as zeros from storage): Flat space addressing
      • Vendor specific linear 16 bit format (with bytes 2-7 returned as zeros from storage)
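
The "bytes 2-7 returned as zeros" condition shared by the formats listed above can be checked mechanically. The following Python sketch is illustrative only: the helper name `tail_is_zero` is hypothetical, and it assumes the LUN address is supplied as the 16 hex digit string that sg_luns prints. It tests only the zero-byte condition; it does not establish that a particular HBA driver accepts the value.

```python
def tail_is_zero(lun_hex: str) -> bool:
    """Return True when bytes 2-7 of an 8-byte LUN address are all zero."""
    lun = bytes.fromhex(lun_hex)
    if len(lun) != 8:
        raise ValueError("expected 8 bytes (16 hex digits)")
    # Bytes 0-1 carry the address; the remaining six bytes must be zero.
    return lun[2:] == bytes(6)


if __name__ == "__main__":
    # Sample values in the form printed by sg_luns.
    for sample in ("0002000000000000", "4004000000000000", "0000004000000000"):
        print(sample, tail_is_zero(sample))
```

The first two samples satisfy the condition, while the third has a non-zero byte 3 and does not.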

Root Cause

Storage returns a 64 bit LUN address for each logical unit. These 64 bits can be in different LUN access formats as defined by the SCSI standard. The full 64 bits must be specified back to storage to properly identify the target LUN of the SCSI command.

The linux kernel only saves the upper 32 bits of the full 64 bit LUN address, so bytes 4-7 must be zero. Typically these bytes are only used for multi-level addressing within storage, which is not supported by linux since the bytes are not saved.

Although the linux kernel saves the upper 32 bits, most storage host bus adapters only support use of the upper 16 bits of the LUN address supplied by storage -- the storage controller hardware fills in bytes 2-7 with zeros when sending the LUN address back to storage.

While the linux kernel neither interprets nor decodes the upper 32 bits of the LUN address, the 00b and 01b LUN Access Methods defined by the SCSI specification carry all the important information in bytes 0 and 1. The kernel therefore swaps the upper and lower 16 bits of the LUN address when saving them to its internal 32 bit LUN value. This allows the linux kernel to report the LUN address in a more meaningful way:

  • With a 00b format address of 0x00020000, the kernel reports a value of 2 after the word swap instead of 131072. The linux kernel swaps the words back when providing the LUN address to the HBA driver.
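
The word swap described above can be sketched in a few lines of Python. This is an illustration modeled on the description in this section, not kernel code: the function names `lun_to_kernel_int` and `kernel_int_to_lun` are hypothetical, and the input is assumed to be the 16 hex digit LUN address as printed by sg_luns.

```python
def lun_to_kernel_int(lun_hex: str) -> int:
    """Map an 8-byte LUN address to the 32 bit value the kernel displays."""
    lun = bytes.fromhex(lun_hex)
    upper = int.from_bytes(lun[0:2], "big")  # bytes 0-1 of the address
    lower = int.from_bytes(lun[2:4], "big")  # bytes 2-3 of the address
    # Bytes 4-7 are not saved. The two 16 bit words are swapped so that
    # bytes 0-1 end up in the low half of the displayed value.
    return (lower << 16) | upper


def kernel_int_to_lun(value: int) -> str:
    """Swap the words back to rebuild the address handed to the HBA driver."""
    upper = value & 0xFFFF
    lower = (value >> 16) & 0xFFFF
    return (upper.to_bytes(2, "big") + lower.to_bytes(2, "big") + bytes(4)).hex()


if __name__ == "__main__":
    print(lun_to_kernel_int("0002000000000000"))  # 2
    print(lun_to_kernel_int("4004000000000000"))  # 16388
    print(lun_to_kernel_int("0000004000000000"))  # 4194304
    print(kernel_int_to_lun(4194304))             # 0000004000000000
```

The third value shows how a non-zero byte 3 ends up in the high half of the displayed number, which is why such addresses appear as values of 4194304 or larger.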

Diagnostic Steps

  • Use the sg_luns command from the sg3_utils package to view the LUN address values returned by storage:

      $ sg_luns -d -vv /dev/sda
      open /dev/sda with flags=0x802
          report luns cdb: a0 00 00 00 00 00 00 01 00 00 00 00 
      Lun list length = 48 which imples 6 lun entries
                
      Output response in hex
       00     00 00 00 30 00 00 00 00  00 00 00 40 00 00 00 00
       10     00 01 00 40 00 00 00 00  00 02 00 00 00 00 00 00  
       20     41 03 00 00 00 00 00 00  40 04 00 00 00 00 00 00
       30     d2 00 01 05 00 00 00 00
      Report luns [select_report=0]:
          0000004000000000
            Peripheral device addressing: lun=0                       ; [3] kernel reports 4194304
          0001004000000000
            Peripheral device addressing: lun=1                       ; [3] kernel reports 4194305
          0002000000000000                       
            Peripheral device addressing: lun=2                       ; [1] kernel reports 2
          4103000000000000 
            Flat space addressing: lun=259                            ; [2] kernel reports 16643
          4004000000000000                        
            Flat space addressing: lun=4                              ; [2] kernel reports 16388
          d200010500000000                         
            Extended flat space logical unit addressing: value=0x105  ; [4] kernel reports 17158656
    
    • [1] SCSI 00b format, Peripheral device addressing, supported by linux
    • [2] SCSI 01b format, Flat space addressing, supported by linux
    • [3] SCSI 00b format, but some of bytes 2-7 are not zeros, unsupported by linux
    • [4] SCSI 11b format, and some of bytes 2-7 are not zeros, unsupported by linux
  • Use the sg_luns --test option to decode a specific returned value. In this example the kernel displays the value as 4194304, which appears to be a vendor specific linear 32 bit LUN address that is not supported by the linux kernel:

      $ sg_luns --test 0x0000004000000000
      Decoded LUN:
        Peripheral device addressing: lun=0
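
To relate a number printed by the kernel back to the LUN configured on storage, the addressing method and the LUN number can be separated by hand. The Python sketch below is an illustration under stated assumptions, not a kernel or sg3_utils interface: the name `describe_kernel_lun` is hypothetical, the method is taken from the top two bits of byte 0, flat space addressing is treated as a 14 bit LUN, and for peripheral device addressing the bus identifier is ignored and only byte 1 is returned.

```python
METHODS = {
    0: "peripheral device",      # 00b
    1: "flat space",             # 01b
    2: "logical unit",           # 10b
    3: "extended logical unit",  # 11b
}


def describe_kernel_lun(value: int):
    """Split a kernel-displayed LUN value into (addressing method, LUN number)."""
    word = value & 0xFFFF  # bytes 0-1 of the address sit in the low 16 bits
    method = word >> 14    # top two bits of byte 0
    if method == 1:
        number = word & 0x3FFF  # flat space: 14 bit LUN
    elif method == 0:
        number = word & 0xFF    # peripheral: byte 1, bus identifier ignored
    else:
        number = None           # not decoded in this sketch
    return METHODS[method], number


if __name__ == "__main__":
    print(describe_kernel_lun(16640))  # ('flat space', 256)
    print(describe_kernel_lun(16388))  # ('flat space', 4)
    print(describe_kernel_lun(2))      # ('peripheral device', 2)
```

The first call shows why storage LUN 256 presented with the 01b method is displayed as 16640: the method bits contribute 0x4000 (16384) on top of the LUN number.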
    