'partprobe' command results in 'blk_cloned_rq_check_limits' errors and dm-multipath path failures

Environment

  • Red Hat Enterprise Linux 7 with kernel version older than:
    o 3.10.0-514.51.1.el7
    o 3.10.0-693.35.1.el7

  • device-mapper-multipath

Issue

  • To meet application and filesystem requirements, we tuned the max_sectors_kb option using the steps in article 3014361. As soon as a user ran the partprobe command, the following errors were logged and paths to the multipath devices were failed:

      kernel: blk_cloned_rq_check_limits: over max size limit.
      kernel: device-mapper: multipath: Failing path 8:128.
      kernel: blk_cloned_rq_check_limits: over max size limit.
      multipathd: sdi: mark as failed
      kernel: device-mapper: multipath: Failing path 8:144.
      kernel: blk_cloned_rq_check_limits: over max size limit.
      kernel: device-mapper: multipath: Failing path 8:64.
      multipathd: mpatha: remaining active paths: 3
      kernel: blk_cloned_rq_check_limits: over max size limit.
      kernel: device-mapper: multipath: Failing path 65:32.
    
  • On checking the max_sectors_kb value for the path devices, we found that it had changed after partprobe was run:

      $ cat /sys/block/sde/queue/max_sectors_kb
      1024
    

    Before the partprobe command was run, this value was set to 4096.
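
The before/after comparison described above can be sketched as a small helper that records max_sectors_kb for every sd* device. This is a minimal sketch, not a supported tool: the function name snapshot_max_sectors and the /tmp file names are made up for illustration, and the optional argument only exists so the function can be pointed at a different sysfs root.

```shell
# Print "<device> <max_sectors_kb>" for every sd* device.
# The optional first argument overrides the sysfs root (default: /sys).
snapshot_max_sectors() {
    sysfs=${1:-/sys}
    for f in "$sysfs"/block/sd*/queue/max_sectors_kb; do
        [ -r "$f" ] || continue          # skip an unexpanded glob or unreadable file
        dev=${f#"$sysfs"/block/}         # strip the leading path
        dev=${dev%%/*}                   # keep only the device name
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}

# Example run (hypothetical file names):
#   snapshot_max_sectors > /tmp/limits.before
#   partprobe
#   snapshot_max_sectors > /tmp/limits.after
#   diff /tmp/limits.before /tmp/limits.after
```

Any device listed by diff had its limit changed while partprobe ran.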

Resolution

  • Update the kernel to one of the versions below, or later, which contain the fix for the bug described in the Root Cause section:

    RHEL 7.5   kernel-3.10.0-862.el7       Errata: RHSA-2018:1062
    RHEL 7.4.z kernel-3.10.0-693.35.1.el7  Errata: RHBA-2018:2158
    RHEL 7.3.z kernel-3.10.0-514.51.1.el7  Errata: RHSA-2018:1737
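
To see whether a system already runs one of the kernels listed above, the release string can be compared against those builds. The function below is a rough sketch under stated assumptions: the name kernel_has_fix is hypothetical, it only understands RHEL 7 style releases (3.10.0-<build>[.<z-stream>...].el7), and it compares only the first z-stream number, which is enough to separate the builds named in this article.

```shell
# Return 0 if the kernel release is at or above a fixed build listed in this
# article, 1 if it is not, 2 if the release string is not a RHEL 7 kernel.
# With no argument, the running kernel (uname -r) is checked.
kernel_has_fix() {
    rel=${1:-$(uname -r)}
    case $rel in 3.10.0-*) ;; *) return 2 ;; esac
    suffix=${rel#*-}        # e.g. "693.35.1.el7.x86_64"
    build=${suffix%%.*}     # e.g. "693"
    rest=${suffix#*.}       # e.g. "35.1.el7.x86_64"
    zminor=${rest%%.*}      # e.g. "35", or "el7" for a GA kernel
    case $build in ''|*[!0-9]*) return 2 ;; esac
    case $zminor in ''|*[!0-9]*) zminor=0 ;; esac   # GA kernel: no z-stream number
    if [ "$build" -ge 862 ]; then return 0; fi                           # RHEL 7.5 and later
    if [ "$build" -eq 693 ] && [ "$zminor" -ge 35 ]; then return 0; fi   # RHEL 7.4.z
    if [ "$build" -eq 514 ] && [ "$zminor" -ge 51 ]; then return 0; fi   # RHEL 7.3.z
    return 1
}
```

For example, `kernel_has_fix && echo "fix present" || echo "update needed"` checks the running kernel.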

Root Cause

  • The enhancements introduced in the RHEL 7.4 kernel, listed below, add the ability to check the optimal transfer length advertised by the device. The device returns this information in response to a SCSI INQUIRY command.

      938e472 [scsi] sd: Optimal I/O size is in bytes, not sectors
      20c0ea8 [scsi] sd: Reject optimal transfer length smaller than page size
      ced406f [scsi/block] sd: Fix device-imposed transfer length limits
      d98ea39 [scsi] scsi_sysfs: Fix queue_ramp_up_period return code
      5d36ce1 [scsi] scsi: Export SCSI Inquiry data to sysfs    <----------
      c865292 [scsi] sd: Fix maximum I/O size for BLOCK_PC requests
      0e1794a [scsi] scsi_scan: fix queue depth initialisation problem
      5e6288d [scsi] add 1024 max sectors black list flag
      29685d6 [scsi] sd: Fix max transfer length for 4k disks
      5efe668 [scsi] sd: Limit transfer length
    
  • The value in a device's /sys/block/sdXX/queue/max_sectors_kb file is also populated from the device's advertised optimal transfer length. When the partprobe command is executed, it fetches the above information from the device.

    This sets the max_sectors_kb value for a device based on its advertised optimal transfer length, so any custom change to the device's max_sectors_kb limit is overwritten. This unintended change to max_sectors_kb resulted in request_queue limit violations for in-flight I/O, and the I/O requests failed:

      kernel: blk_cloned_rq_check_limits: over max size limit.
      kernel: device-mapper: multipath: Failing path 8:128.
      kernel: blk_cloned_rq_check_limits: over max size limit.
      multipathd: sdi: mark as failed
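
The mismatch behind these messages can be looked for by comparing each path's limit with the limit of the multipath device stacked on top of it. The sketch below assumes the usual sysfs layout, in which /sys/block/dm-N/slaves/ lists the path devices of a dm device. The function name find_undersized_paths is hypothetical, and the optional second argument only overrides the sysfs root.

```shell
# For dm device $1 (for example dm-0), print every path device whose
# max_sectors_kb is smaller than the dm device's own limit, in the form
#   <path device> <path limit> <dm limit>
find_undersized_paths() {
    sysfs=${2:-/sys}
    dm_limit=$(cat "$sysfs/block/$1/queue/max_sectors_kb") || return 2
    for slave in "$sysfs/block/$1/slaves"/*; do
        [ -e "$slave" ] || continue              # no paths found
        name=${slave##*/}                        # e.g. "sde"
        path_limit=$(cat "$sysfs/block/$name/queue/max_sectors_kb") || continue
        if [ "$path_limit" -lt "$dm_limit" ]; then
            echo "$name $path_limit $dm_limit"
        fi
    done
}
```

A path reported here can receive requests larger than its own queue accepts, which is the condition the blk_cloned_rq_check_limits message complains about.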
    
  • This issue was tracked in private BZ#1507941. As part of the fix for this BZ, a new patch was added to the kernel to prevent the custom max_sectors_kb limit set for a device from being overwritten.
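
To estimate the value that an affected kernel would fall back to, the advertised optimal size can be read from sysfs and converted to KB. This is only a sketch: it assumes that queue/optimal_io_size (reported in bytes) reflects the device's optimal transfer length and that no other limit caps the result. The function name optimal_io_kb is hypothetical, and the optional second argument only overrides the sysfs root.

```shell
# Print the advertised optimal I/O size of device $1 in KB.
# Returns 1 if the device reports no optimal size, 2 if the file is unreadable.
optimal_io_kb() {
    sysfs=${2:-/sys}
    bytes=$(cat "$sysfs/block/$1/queue/optimal_io_size") || return 2
    [ "$bytes" -gt 0 ] || return 1       # 0 means no optimal size is reported
    echo $((bytes / 1024))
}
```

Under these assumptions, an optimal_io_size of 1048576 bytes would correspond to the value of 1024 shown in the Issue section.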

