How to set dev_loss_tmo and fast_io_fail_tmo persistently, using a udev rule

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux (RHEL) 6, 7
  • Fibre Channel SAN storage

Issue

  • Need to set fast_io_fail_tmo and dev_loss_tmo
  • Setting must persist across reboot

Resolution

Both fast_io_fail_tmo and dev_loss_tmo are transport-layer timeouts: they are attributes of the Fibre Channel remote port (rport) structure, which is exposed through the fc_remote_ports class. Because they follow the state of the remote port, a udev rule that sets them must match an rport device and use the attributes udev records for it.

You can target all rports or only specific ones and change the two parameters using udev rules.

  1. All rports. This rule targets every host and every viable rport behind it, and sets dev_loss_tmo to 10 and fast_io_fail_tmo to 5 on each. Viable rports are matched by the role FCP Target. Create /etc/udev/rules.d/99-tmo.rules with the following contents.

    ACTION!="add|change", GOTO="tmo_end"
    KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports",  ATTR{roles}=="FCP Target", ATTR{dev_loss_tmo}="10", ATTR{fast_io_fail_tmo}="5"
    LABEL="tmo_end"
    
  2. Target specific rport(s).

    • Select devices of interest
    • Convert device scsi address to rport address
    • Lookup udev attributes of rport to obtain WWNN (node name) and WWPN (port name) of remote target port
    • Create /etc/udev/rules.d/99-tmo.rules and include the udev rule of the following form

    ACTION!="add|change", GOTO="tmo_end"
    KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports", ATTR{node_name}=="node-wwn", ATTR{port_name}=="port-wwn",  \
        ATTR{dev_loss_tmo}="timeout-seconds", ATTR{fast_io_fail_tmo}="timeout-seconds"
    # Repeat the udev rule for each remote port...
    LABEL="tmo_end"

After creating the new udev rules:

  • To apply, reload the rules and database:

        #RHEL6
        [root@host ~]# udevadm control --reload-rules
        #RHEL7
        [root@host ~]# udevadm control --reload
    
    
  • Then trigger against the appropriate subsystem:

    [root@host ~]# udevadm trigger --type=devices --action=change
    [root@host ~]# udevadm trigger --subsystem-match=fc_remote_ports
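
Once the trigger has run, the values can be read back from sysfs to confirm the rule took effect. The following is a minimal sketch, not part of the original procedure: `show_rport_tmo` is a hypothetical helper name, and the optional directory argument exists only so the function can be pointed at a test tree instead of /sys/class/fc_remote_ports.

```shell
# show_rport_tmo prints the role and both timeouts for every remote port
# found under the given directory (default: /sys/class/fc_remote_ports).
# The helper name and the optional argument are illustrative assumptions.
show_rport_tmo() (
    base=${1:-/sys/class/fc_remote_ports}
    for rport in "$base"/rport-*; do
        [ -d "$rport" ] || continue    # no FC remote ports present
        printf '%s roles="%s" dev_loss_tmo=%s fast_io_fail_tmo=%s\n' \
            "${rport##*/}" \
            "$(cat "$rport/roles")" \
            "$(cat "$rport/dev_loss_tmo")" \
            "$(cat "$rport/fast_io_fail_tmo")"
    done
)
```

Calling `show_rport_tmo` with no argument prints one line per rport, so ports with the role FCP Target that still show the old timeouts stand out.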
    

 

Note: To set `eh_deadline` and `eh_timeout`, see [How to set eh_deadline and eh_timeout persistently, using a udev rule](https://access.redhat.com/solutions/3209481). To set `dev_loss_tmo` on a Cisco UCS system using the `fnic` driver, see [Does the fnic driver have a "dev_loss_tmo" setting?](https://access.redhat.com/solutions/3164771).

Example - Target Select Remote Ports

In this example we'll target individual rports by using the node_name and the port_name of the rport.

**Select devices of interest**

The device of interest here is a dm-multipath device named /dev/mapper/test_lun. It has 8 paths presented through hosts 2, 3, 4, and 5. Each host reaches the storage through target ports 0 and 1, and every path ends at LUN 0. In this example we choose sde (2:0:0:0) and sdh (2:0:1:0) and want the changes applied to the remote storage ports associated with these two devices.

test_lun (wwid_omitted) dm-4 NETAPP,LUN
size=30G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=50 status=active
| |- 2:0:0:0  sde  8:64   active ready running                <<<<<<<<<<<<<<<
| |- 3:0:1:0  sdn  8:208  active ready running
| |- 4:0:0:0  sdq  65:0   active ready running
| `- 5:0:0:0  sdw  65:96  active ready running
`-+- policy='queue-length 0' prio=10 status=enabled
  |- 2:0:1:0  sdh  8:112  active ready running                <<<<<<<<<<<<<<<
  |- 3:0:0:0  sdm  8:192  active ready running
  |- 4:0:1:0  sdt  65:48  active ready running
  `- 5:0:1:0  sdad 65:208 active ready running
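
When a map has many paths, the H:B:T:L addresses can be pulled out of the listing mechanically. The sketch below assumes path lines shaped like the ones shown above (an H:B:T:L field followed by the device name); `list_paths` is a hypothetical helper name, and other multipath output layouts may need a different pattern.

```shell
# list_paths reads a multipath listing on stdin and prints "H:B:T:L devname"
# for every path line. It assumes the device name is the field immediately
# after the H:B:T:L address, as in the listing above.
list_paths() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                print $i, $(i + 1)
                break
            }
    }'
}
```

For instance, piping the listing for test_lun through `list_paths` and then through `grep '^2:'` would leave only the two paths on host 2.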

 

**Convert device scsi address to rport address**

In many, but not all, cases the SCSI target id 'T' in the SCSI H:B:T:L device address is also the assigned remote port index. Using the selected SCSI host number (H) in the 'rport-2*' string of the following command, look up the SCSI target id assigned to each remote port on host 2. Notice that in this case the assigned SCSI target id is also the remote port index (the '1' in 'rport-2:0-1' is the remote port index).

$ grep -Hv "zz" /sys/class/fc_remote_ports/rport-2*/scsi_target_id
/sys/class/fc_remote_ports/rport-2:0-0/scsi_target_id:0
/sys/class/fc_remote_ports/rport-2:0-1/scsi_target_id:1       << scsi 'T' target id is 1,
                                     ^
                                     +------------------------<< the remote port index is also 1
/sys/class/fc_remote_ports/rport-2:0-2/scsi_target_id:-1
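
Because the target id and the remote port index do not always coincide, the lookup can be done by value rather than by eye. This is a minimal sketch under the assumption that rports are named rport-H:B-index as in the output above; `rport_for_scsi_addr` and its optional directory argument are hypothetical, added for illustration and testing.

```shell
# rport_for_scsi_addr prints the rport whose scsi_target_id equals the T
# field of a SCSI H:B:T:L address, searching only rports on host H, bus B.
# Returns non-zero when no rport matches.
rport_for_scsi_addr() (
    addr=$1
    base=${2:-/sys/class/fc_remote_ports}
    host=${addr%%:*}; rest=${addr#*:}      # H, then B:T:L
    bus=${rest%%:*};  rest=${rest#*:}      # B, then T:L
    target=${rest%%:*}                     # T
    for rport in "$base/rport-$host:$bus-"*; do
        [ -r "$rport/scsi_target_id" ] || continue
        if [ "$(cat "$rport/scsi_target_id")" = "$target" ]; then
            printf '%s\n' "${rport##*/}"
            return 0
        fi
    done
    return 1
)
```

With the values shown above, `rport_for_scsi_addr 2:0:1:0` would be expected to print rport-2:0-1.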

 

**Lookup udev attributes of rport to obtain WWNN (node name) and WWPN (port name) of remote target port**

Using the identified rports above, pull the udev information using the udevadm info command.

[root@host ~]# udevadm info --attribute-walk --path=/sys/class/fc_remote_ports/rport-2\:0-0/

  looking at device '/devices/pci0000:00/0000:00:09.0/0000:04:00.0/host2/rport-2:0-0/fc_remote_ports/rport-2:0-0':
    KERNEL=="rport-2:0-0"
    SUBSYSTEM=="fc_remote_ports"
    DRIVER==""
    ATTR{supported_classes}=="Class 3"
    ATTR{dev_loss_tmo}=="30"
    ATTR{node_name}=="0x500a09808607eec3"      << WWNN, world wide node name
    ATTR{port_name}=="0x500a09819607eec3"      << WWPN, world wide port name
    ATTR{port_id}=="0x610400"
    ATTR{roles}=="FCP Target"
    ATTR{port_state}=="Online"
    ATTR{scsi_target_id}=="0"
    ATTR{fast_io_fail_tmo}=="5"

  looking at parent device '/devices/pci0000:00/0000:00:09.0/0000:04:00.0/host2/rport-2:0-0':
    KERNELS=="rport-2:0-0"
    SUBSYSTEMS==""
    DRIVERS==""
[ ... snip ... ]

[root@host ~]# udevadm info --attribute-walk --path=/sys/class/fc_remote_ports/rport-2\:0-1/

  looking at device '/devices/pci0000:00/0000:00:09.0/0000:04:00.0/host2/rport-2:0-1/fc_remote_ports/rport-2:0-1':
    KERNEL=="rport-2:0-1"
    SUBSYSTEM=="fc_remote_ports"
    DRIVER==""
    ATTR{supported_classes}=="Class 3"
    ATTR{dev_loss_tmo}=="2147483647"
    ATTR{node_name}=="0x500a09808607eec3"      << WWNN, world wide node name
    ATTR{port_name}=="0x500a09828607eec3"      << WWPN, world wide port name
    ATTR{port_id}=="0x610500"
    ATTR{roles}=="FCP Target"
    ATTR{port_state}=="Online"
    ATTR{scsi_target_id}=="1"
    ATTR{fast_io_fail_tmo}=="5"

  looking at parent device '/devices/pci0000:00/0000:00:09.0/0000:04:00.0/host2/rport-2:0-1':
    KERNELS=="rport-2:0-1"
    SUBSYSTEMS==""
    DRIVERS==""
[ ... snip ... ]
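
The node_name and port_name values reported by udevadm info are the same sysfs attributes that sit under each rport directory, so they can also be read directly. A minimal sketch, assuming a hypothetical helper named `rport_wwns` whose optional second argument overrides the sysfs directory:

```shell
# rport_wwns prints "node_name port_name" (WWNN then WWPN) for one rport by
# reading the sysfs attribute files of that name.
rport_wwns() (
    rport=$1
    base=${2:-/sys/class/fc_remote_ports}
    printf '%s %s\n' \
        "$(cat "$base/$rport/node_name")" \
        "$(cat "$base/$rport/port_name")"
)
```

For the first port above, `rport_wwns rport-2:0-0` would be expected to print the WWNN and WWPN shown in the udevadm output.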

 

**Create `/etc/udev/rules.d/99-tmo.rules` and include the udev rule of the following form**

From this information we can build a udev rule that sets both dev_loss_tmo and fast_io_fail_tmo for the identified remote ports.

The rules below target the two rports individually, matching on the node_name and port_name of each.

ACTION!="add|change", GOTO="tmo_end"
KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports", ATTR{node_name}=="0x500a09808607eec3", ATTR{port_name}=="0x500a09819607eec3", ATTR{dev_loss_tmo}="10", ATTR{fast_io_fail_tmo}="5"
KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports", ATTR{node_name}=="0x500a09808607eec3", ATTR{port_name}=="0x500a09828607eec3", ATTR{dev_loss_tmo}="10", ATTR{fast_io_fail_tmo}="5"
LABEL="tmo_end"
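
With more than a few ports, typing one rule per WWNN/WWPN pair invites copy-paste mistakes. The sketch below generates the same rule text from a list of pairs. `tmo_rule` and `tmo_rules_file` are hypothetical helpers that only print text; they are not udev tools, and the output should be reviewed before it is saved as /etc/udev/rules.d/99-tmo.rules.

```shell
# tmo_rule prints one udev rule line in the form used above.
# Usage: tmo_rule NODE_WWN PORT_WWN DEV_LOSS_SECONDS FAST_IO_FAIL_SECONDS
tmo_rule() (
    node=$1 port=$2 dev_loss=$3 fast_fail=$4
    printf 'KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports", '
    printf 'ATTR{node_name}=="%s", ATTR{port_name}=="%s", ' "$node" "$port"
    printf 'ATTR{dev_loss_tmo}="%s", ATTR{fast_io_fail_tmo}="%s"\n' \
        "$dev_loss" "$fast_fail"
)

# tmo_rules_file reads "NODE_WWN PORT_WWN" pairs from stdin and prints a
# complete rules file body, applying the same two timeouts to every pair.
# Usage: tmo_rules_file DEV_LOSS_SECONDS FAST_IO_FAIL_SECONDS < pairs
tmo_rules_file() (
    dev_loss=$1 fast_fail=$2
    printf 'ACTION!="add|change", GOTO="tmo_end"\n'
    while read -r node port; do
        [ -n "$node" ] && tmo_rule "$node" "$port" "$dev_loss" "$fast_fail"
    done
    printf 'LABEL="tmo_end"\n'
)
```

Feeding the two WWNN/WWPN pairs from this example to `tmo_rules_file 10 5` should reproduce the four lines shown above.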

Also, see the following on additional details for shortening timeout failover to surviving paths in a fibre channel environment:

To lengthen timeout failure to help prevent filesystems entering read-only mode:


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.