What are "eh_deadline" and "eh_timeout" used for, and how are they related to the SCSI device timeout?

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 6.5+
  • kernel 2.6.32-431 or later

Issue

  • What are eh_deadline and eh_timeout used for, and how are they related to the SCSI device timeout?

  • What does it mean when I get an invalid argument error while attempting to write to eh_deadline?

      echo 90 > /sys/devices/pci0000:00/0000:0e:00.0/host1/scsi_host/host1/eh_deadline
      echo: write error: Invalid argument
    

Resolution

(eh_deadline) Configuration of Maximum Time for Error Recovery:
A new sysfs parameter eh_deadline has been added to the SCSI host object, which enables configuring the maximum amount of time that the SCSI error handling will attempt to perform error recovery, before giving up and resetting the entire host bus adapter (HBA). The value of this parameter is specified in seconds, and the default is "off", which disables the time limit and allows all of the error recovery to take place. In addition to using sysfs, a default value can be set for all SCSI HBAs using the eh_deadline kernel parameter.
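
As a minimal sketch of the sysfs interface described above, the helpers below list and set eh_deadline for each SCSI host. They assume the host objects are reachable under /sys/class/scsi_host. The function names and the SYSFS_ROOT variable are illustrative only and are not part of any Red Hat tooling.

```shell
# SYSFS_ROOT is a hypothetical override so the helpers can be tried against
# a scratch directory; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print "hostN: value" for every SCSI host that exposes eh_deadline.
show_eh_deadline() {
    for f in "$SYSFS_ROOT"/class/scsi_host/host*/eh_deadline; do
        [ -e "$f" ] || continue          # glob matched nothing
        host=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$host" "$(cat "$f")"
    done
}

# Set eh_deadline (a number of seconds, or "off") on one host.
# Example: set_eh_deadline host1 90
set_eh_deadline() {
    echo "$2" > "$SYSFS_ROOT/class/scsi_host/$1/eh_deadline"
}
```

Running show_eh_deadline before and after set_eh_deadline host1 90 should show the value changing from "off" to "90" on a host whose driver supports the feature. Writing to the real sysfs file requires root privileges.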

(eh_timeout) Configurable Timeout for Unresponsive Devices:
In certain storage configurations (for example, configurations with many LUNs), the SCSI error handling code can spend a large amount of time issuing commands such as TEST UNIT READY to unresponsive storage devices. A new sysfs parameter, eh_timeout, has been added to the SCSI device object, which allows configuration of the timeout value for TEST UNIT READY and REQUEST SENSE commands used by the SCSI error handling code. This decreases the amount of time spent checking these unresponsive devices. The default value of eh_timeout is 10 seconds, which was the timeout value used prior to adding this functionality.
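
The per-device setting can be inspected in a similar way. This is a rough sketch, assuming the SCSI disks appear as /sys/block/sd*/device. The helper names and the SYSFS_ROOT variable are hypothetical.

```shell
# SYSFS_ROOT is a hypothetical override for trying the helpers against a
# scratch directory; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print "sdX: seconds" for every SCSI disk that exposes eh_timeout.
show_eh_timeout() {
    for f in "$SYSFS_ROOT"/block/sd*/device/eh_timeout; do
        [ -e "$f" ] || continue          # glob matched nothing
        dev=$(basename "$(dirname "$(dirname "$f")")")
        printf '%s: %s\n' "$dev" "$(cat "$f")"
    done
}

# Set eh_timeout (in seconds) on one device.
# Example: set_eh_timeout sda 5
set_eh_timeout() {
    echo "$2" > "$SYSFS_ROOT/block/$1/device/eh_timeout"
}
```

On a system with the default setting, show_eh_timeout would be expected to report 10 for each device.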

For example:

  • If the SCSI device timeout is set to 30 seconds and eh_deadline is set to 90 seconds, the worst case is (SCSI timeout = 30) + (eh_deadline = 90) = 120 seconds, plus the time to reissue the I/O.
  • There is no need to adjust eh_timeout in this case, because eh_deadline=90 leaves plenty of time for the error handling commands to complete.
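
The arithmetic in the example can be written as a small helper. This is only an illustration of the sum above; the function name is made up, and the result does not include the time needed to reissue the I/O.

```shell
# Add the SCSI device timeout ($1) and eh_deadline ($2), both in seconds.
# With eh_deadline set to "off" there is no time limit on error recovery,
# so no sum can be given.
worst_case_seconds() {
    if [ "$2" = "off" ]; then
        echo "no limit"
    else
        echo $(( $1 + $2 ))
    fi
}

worst_case_seconds 30 90   # prints 120
```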

Invalid Argument Error

Not all drivers support the eh_deadline feature. If an attempt to set eh_deadline on a scsi_host fails with "Invalid argument", the driver for that host does not support this feature.

For example, the lpfc driver had this feature added as of the RHEL 7.4 release. See the release notes:

"Also, lpfc now allows you to set the eh_deadline parameter, which represents an upper limit of the SCSI error handling time."

As a result, attempting to set eh_deadline for SCSI hosts serviced by the lpfc driver in RHEL 7.3 and earlier fails with the invalid argument error. Updating to RHEL 7.4 or later allows eh_deadline to be set for hosts using the lpfc driver.
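
To see which driver is behind a host that rejects the write, a check along the following lines may help. It is a sketch that assumes the host exposes the driver name in the proc_name attribute under /sys/class/scsi_host. The function name and the SYSFS_ROOT variable are hypothetical.

```shell
# SYSFS_ROOT is a hypothetical override for trying the helper against a
# scratch directory; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Try to set eh_deadline on a host. If the write fails for any reason,
# print the driver name read from proc_name and return 1.
# Example: try_eh_deadline host1 90
try_eh_deadline() {
    dir="$SYSFS_ROOT/class/scsi_host/$1"
    if { echo "$2" > "$dir/eh_deadline"; } 2>/dev/null; then
        echo "$1: eh_deadline set to $2"
    else
        driver=$(cat "$dir/proc_name" 2>/dev/null || echo unknown)
        echo "$1: could not set eh_deadline=$2 (driver: $driver)"
        return 1
    fi
}
```

A failure reported here does not by itself prove that the driver lacks support; it only shows that the write failed and names the driver to check against the release notes.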

