Why am I seeing it take 300s or more to fail a device-mapper-multipath path with a storage array using ALUA in RHEL 5?
Environment
- Red Hat Enterprise Linux (RHEL) 5
- device-mapper-multipath
- Storage array that utilizes ALUA access mode
Issue
- Servers connected to a storage array are seeing aborted commands and 300s SCSI command timeouts in /var/log/messages.
- Storage path failures take upwards of 300 seconds before device-mapper-multipath starts using the next path.
- Errors similar to:

Oct 24 22:53:57 example kernel: qla2xxx 0000:0f:00.0: scsi(4:1:32): Abort command issued -- 1 916a951 2002.
Oct 24 22:53:57 example kernel: sd 4:0:1:32: timing out command, waited 300s
Oct 24 22:53:57 example multipathd: /sbin/mpath_prio_alua exitted with 5
Oct 24 22:53:57 example multipathd: error calling out /sbin/mpath_prio_alua /dev/sdfw
Resolution
It is recommended to update to device-mapper-multipath-0.4.7-48.el5 or later, which includes a change that reduces the I/O timeout of the mpath_prio_alua callout from 300 seconds to 60 seconds.
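To confirm whether the installed package already carries the fix, the installed version can be compared against the minimum release. The sketch below is illustrative: the `version_ge` helper approximates RPM version ordering with `sort -V` rather than full rpmvercmp semantics, which is sufficient for comparing releases within the 0.4.7 series.

```shell
# Hedged sketch: check whether the installed device-mapper-multipath
# package is at least 0.4.7-48.el5 (the release carrying the fix).
# version_ge is an approximation using sort -V, not full rpmvercmp.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n1)" = "$1" ]
}

min="0.4.7-48.el5"
if command -v rpm >/dev/null 2>&1; then
    cur=$(rpm -q --qf '%{VERSION}-%{RELEASE}' device-mapper-multipath 2>/dev/null)
    if [ -n "$cur" ] && version_ge "$cur" "$min"; then
        echo "device-mapper-multipath $cur: fix present"
    else
        echo "device-mapper-multipath ${cur:-not installed}: update to $min or later"
    fi
fi
```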
The underlying cause for this issue is that a storage target is unresponsive, and the 300 second timeout is a side-effect. To properly correct this issue and avoid storage disruptions, you should check your storage array, switches, and hosts on the fabric for any potential issues. Some problems that have been known to contribute to these unresponsive targets are:
- CRC errors on storage ports
- ISL-Buffer handling
- Bottlenecks and slow draining devices on fabric
- Different bandwidths on trunks
- Out-of-date firmware on switches
It is recommended that you contact your storage vendor for assistance with diagnosing these unresponsive targets.
Root Cause
The issue here is that I/O commands are timing out, which causes the error handler to kick in (this is where the command aborts come from), and eventually the device exceeds its timeout and fails the path. The time it takes to completely time out a device varies based on several factors, one of which is the longest timeout set on any command at the time the first command times out. See this article for more details on device timeouts.
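The per-command SCSI timeout that feeds into this calculation is visible in sysfs at /sys/block/<device>/device/timeout. The following sketch lists that value for each sd device; the sysfs root is parameterized only so the function can be exercised against a test tree — on a live system it is called with `/`.

```shell
# Hedged sketch: list the per-command SCSI timeout (in seconds) for each
# sd* device from sysfs. The root is parameterized so the function can
# be tested against a fake tree; on a real host, pass "/".
scan_timeouts() {
    root=$1
    for f in "$root"/sys/block/sd*/device/timeout; do
        [ -e "$f" ] || continue            # glob matched nothing
        dev=${f#"$root"/sys/block/}        # strip prefix ...
        dev=${dev%%/*}                     # ... leaving e.g. "sda"
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}

scan_timeouts /
```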
The problem is that device-mapper-multipath's mpath_prio_alua priority callout for ALUA-based devices sets a 300-second I/O timeout. If a device becomes unresponsive while multipathd has a priority callout outstanding, path failover can take far longer than applications expect, which is especially harmful for cluster products that depend on timely access to a quorum / voting disk. This issue is being tracked in Red Hat Bugzilla #737072.
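For reference, on RHEL 5 the callout is configured per device type in /etc/multipath.conf. The stanza below is illustrative only — the vendor and product strings are placeholders, and your array vendor's documentation should supply the exact values:

```
devices {
        device {
                vendor                "EXAMPLE"
                product               "ARRAY"
                prio_callout          "/sbin/mpath_prio_alua /dev/%n"
                path_grouping_policy  group_by_prio
                failback              immediate
        }
}
```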
Even with the mpath_prio_alua I/O timeout lowered, lengthy delays may still occur while waiting for the unresponsive device to time out. As such, it is recommended that you investigate the cause of the unresponsive devices on the fabric.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.