Configuring device-mapper-multipath for an EMC DGC storage configured in ALUA mode with RHEL 5
Environment
- Red Hat Enterprise Linux (RHEL) 5 Update 4 or later
- EMC CLARiiON storage configured in active-active ALUA mode
- device-mapper-multipath
Issue
- How to configure multipath for EMC CLARiiON and VNX for ALUA
- Multipath issues on EMC CLARiiON or VNX storage
- An EMC DGC storage device has been configured to use ALUA mode, and device-mapper-multipath on the host must be configured to use explicit ALUA mode
- My EMC DGC storage SAN paths are configured in active-active ALUA mode, what configuration do I need to apply to device-mapper-multipath?
- My EMC Clariion SAN paths are configured in active-active ALUA mode, what configuration do I need to apply to device-mapper-multipath?
- If I try to access paths (/dev/sdN) directly, the disks in the lower priority group will fail. Multipath device access still works OK when disabling the paths in the high priority group.
- Multipath displaying "kernel: Buffer I/O error on device sdX, logical block 0" with EMC CLARiiON SAN
- My system mounts LUNs on an EMC Clariion SAN. I see entries in /var/log/messages stating
kernel: Buffer I/O error on device sdX, logical block 0
- Storage LUNs automatically disappear when loaded.
Resolution
Add or update the devices clause for DGC (CLARiiON) arrays within multipath.conf[1]. Please see the latest available "EMC Host Connectivity Guide for Linux" from EMC. Failing to follow the EMC guidelines can result in the storage configuration being unsupported by EMC.
As an example, the following is the defined clause from "EMC Host Connectivity Guide for Linux", Rev A36 dated Jan 2014 pgs 250/251 under "ALUA RHEL 5/ RHEL6" section:
devices {
:
.
    # Device attributed for EMC CLARiiON and VNX series ALUA
    device {
        vendor "DGC"
        product ".*"
        prio_callout "/sbin/mpath_prio_alua /dev/%n"
        path_grouping_policy group_by_prio
        features "1 queue_if_no_path"
        failback immediate
        hardware_handler "1 alua"
        path_checker readsector0
    }
:
.
}
NOTE: The prio_callout, hardware_handler, and path_checker settings above are the ones that change between active/standby (PNR) and active/active (ALUA) CLARiiON configurations. The path_checker changes from emc_clariion to the default, so an explicit path checker type does not have to be specified. The checkers currently expected by EMC are either readsector0 or tur. An explicit path_checker selection is shown above to eliminate ambiguity.
Update your initrd image to ensure that the changes are picked up at boot. See "Why are changes I made to /etc/multipath.conf not taking effect on boot on my boot-from-multipath RHEL system?" for specific steps.
After changing multipath.conf, reboot or restart the multipath services for the changes to take effect.
# service multipathd reload
Footnotes:
[1] To clarify the instructions, there are two starting states for DGC/CLARiiON storage and multipath:
- a customized clause is present within /etc/multipath.conf, or
- no customized clause is present within /etc/multipath.conf and the built-in default clause is being used.
In case #1, the three fields prio_callout, hardware_handler, and path_checker in /etc/multipath.conf need to be updated to their new values.
In case #2, get a copy of the built-in default clause by using the multipathd -k"show config" command, for example:
# multipathd -k"show config"
:
    device {
        vendor "DGC"
        product ".*"
        product_blacklist LUNZ
        path_grouping_policy group_by_prio
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        path_selector "round-robin 0"
        path_checker emc_clariion
        features "1 queue_if_no_path"
        hardware_handler "1 emc"
        prio_callout "/sbin/mpath_prio_emc /dev/%n"
        failback immediate
        rr_weight uniform
        no_path_retry 60
        rr_min_io 1000
    }
Copy that clause into /etc/multipath.conf and change the three fields that need to be changed (path_checker, hardware_handler, and prio_callout).
devices {
:
    device {
        vendor "DGC"
        product ".*"
        product_blacklist LUNZ
        path_grouping_policy group_by_prio
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        path_selector "round-robin 0"
        path_checker readsector0
        features "1 queue_if_no_path"
        hardware_handler "1 alua"
        prio_callout "/sbin/mpath_prio_alua /dev/%n"
        failback immediate
        rr_weight uniform
        no_path_retry 60
        rr_min_io 1000
    }
}
Some of the fields above, such as getuid_callout, path_selector, rr_weight, and rr_min_io, are somewhat redundant because they have the same value as the multipath defaults, which are used by any device clause that does not specify its own value. For example:
# multipathd -k"show config"
defaults {
    verbosity 2
    polling_interval 5
    udev_dir "/dev"
    path_selector "round-robin 0"
    path_grouping_policy failover
    getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
    prio_callout none
    features "0"
    path_checker readsector0
    failback manual
    rr_min_io 1000
    max_fds max
    rr_weight uniform
    queue_without_daemon no
    flush_on_last_del no
    user_friendly_names yes
    pg_prio_calc avg
    log_checker_err always
    bindings_file "/var/lib/multipath/bindings"
    file_timeout 90
}
NOTE: Always use multipathd -k"show config" to obtain the values in effect on the system, because the defaults and the compiled-in DGC storage clause can vary between multipath versions.
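As a rough aid for checking either starting state, the following shell sketch pulls the three ALUA-relevant settings out of any DGC device clause. The function name dgc_alua_fields is hypothetical and not part of device-mapper-multipath. The sketch assumes that the vendor line appears before the other settings within the clause, as it does in the examples above, and it does not handle commented-out lines. It reads configuration text on standard input, so it can be fed /etc/multipath.conf or saved "show config" output.

```shell
# Hypothetical helper: print path_checker, hardware_handler and prio_callout
# from every device clause whose vendor is "DGC".
# Assumption: the vendor line comes first inside the clause.
dgc_alua_fields() {
    awk '
        # A new device clause starts: reset the per-clause state.
        $1 == "device" && $2 == "{" { in_dev = 1; dgc = 0; next }
        # Remember that this clause belongs to a DGC array.
        in_dev && $1 == "vendor" && $2 == "\"DGC\"" { dgc = 1 }
        # End of the current device clause.
        in_dev && $1 == "}" { in_dev = 0; dgc = 0 }
        # Print the three settings that differ between PNR and ALUA mode.
        in_dev && dgc && ($1 == "prio_callout" || $1 == "hardware_handler" || $1 == "path_checker") {
            key = $1
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # strip the keyword, keep the value
            print key "=" $0
        }
    '
}
```

For example, running dgc_alua_fields < /etc/multipath.conf on a host prepared for ALUA would be expected to report hardware_handler="1 alua", while a report of "1 emc" suggests the active-passive settings are still in place.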
Root Cause
The built-in device-mapper-multipath defaults for EMC CLARiiON storage are set up for active-passive configurations and use the "1 emc" hardware handler, the emc_clariion path checker, and the mpath_prio_emc priority callout.
When CLARiiON storage is configured in active-active (ALUA) mode, multipath.conf must be modified to use the matching hardware handler, path checker, and priority callout.
Asymmetric Logical Unit Access (ALUA) support in device-mapper-multipath was updated in Red Hat Enterprise Linux 5.4, adding explicit ALUA support for Clariion storage. Earlier versions of Red Hat Enterprise Linux 5 added support for implicit ALUA (i.e. the operating system is not aware of which storage device paths have optimized performance and which have non-optimized performance). If the operating system consistently sends I/O on a non-optimized path, then the storage device may transparently make that path optimized, improving performance and causing idle paths to become non-optimized.
Red Hat Enterprise Linux 5.4 introduces explicit ALUA support for Clariion storage (i.e. the operating system exchanges information with the storage device and is able to select the paths that have optimized performance).
Diagnostic Steps
Reviewing the multipath -ll output shows hwhandler=1 emc for the CLARiiON storage, which is the handler for Clariion active-passive (passive not ready, PNR) mode. Alternatively, multipathd -k"show config" can be used to show which values are currently in use. If the "1 emc" handler is still present in the "show config" output after multipath.conf has been changed to "1 alua", then multipath has not been restarted correctly.
Having paths show up as something like the following:
mpath1 (360000000000000000000000000000001) dm-2 DGC,VRAID
[size=20G][features=1 queue_if_no_path][hwhandler=1 emc][rw]
\_ round-robin 0 [prio=2][active]
 \_ 0:0:0:16 sdc 8:32  [active][ready]
 \_ 1:0:1:16 sdo 8:224 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 0:0:1:16 sdg 8:96  [active][ready]
 \_ 1:0:0:16 sdk 8:160 [active][ready]
The priorities shown above (prio=2 and prio=0) are incorrect for an active-active (ALUA) configuration. A priority of 0 is the nominal priority for a standby path, so priorities like these indicate that multipath is configured in active-passive mode. Typical priority pairs for active-active (ALUA) would be something like 50 and 10.
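The handler and path group priorities can also be pulled out of saved multipath -ll output with a small filter. This is a minimal sketch only: the function name dgc_summary is hypothetical, and it assumes the multi-line RHEL 5 layout in which each map starts with a header line containing the dm-N name followed by the vendor,product pair.

```shell
# Hypothetical helper: summarize hwhandler and path group priorities
# for each DGC map found in "multipath -ll" output read from stdin.
dgc_summary() {
    awk '
        # Print the summary collected for the current DGC map, if any.
        function emit() {
            if (name != "") print name ": hwhandler=" hw " prio=" prios
            name = ""
        }
        # Map header line, e.g. "mpath1 (<wwid>) dm-2 DGC,VRAID".
        / dm-[0-9]+ / {
            emit()
            if ($0 ~ /DGC,/) { name = $1; hw = "unknown"; prios = "" }
            next
        }
        # Skip lines that belong to non-DGC maps.
        name == "" { next }
        # "hwhandler=" is 10 characters long; keep only the value.
        match($0, /hwhandler=[^]]*/) { hw = substr($0, RSTART + 10, RLENGTH - 10) }
        # Collect the priority of each path group line.
        match($0, /\[prio=[0-9]+\]/) {
            p = substr($0, RSTART + 6, RLENGTH - 7)
            prios = (prios == "" ? p : prios "," p)
        }
        END { emit() }
    '
}
```

Piping multipath -ll into dgc_summary prints one line per DGC map. A handler of "1 emc" together with a priority of 0 points to the active-passive settings described above.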
To verify that the CLARiiON storage is configured in active/active (ALUA) mode, run the sg_rtpg command, which is available from the optional sg3_utils package. It sends a SCSI Report Target Port Groups (RTPG) command to the specified device and decodes the returned data. Multipath uses the same command and data to determine port status in ALUA configurations.
# sg_rtpg -d /dev/sdN
:
target port group asymmetric access state : 0x01 (active/non optimized)
:
target port group asymmetric access state : 0x00 (active/optimized)
:
The full set of asymmetric access state values defined by the SCSI specification is:
- 0h Active/Optimized
- 1h Active/Non-optimized
- 2h Standby
- 3h Unavailable
- 4h-Eh Reserved
- Fh Transitioning between states
These values will be shown within the RTPG command output. See Engineering Notes - scsi INQUIRY and REPORT TARGET PORT GROUPS commands for more information on RTPG.
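For quick reference while reading RTPG output, the list above can be expressed as a small shell lookup. The function name alua_state_name is hypothetical. It accepts a value in the 0xNN form shown in the sg_rtpg output above and follows only the state list given in this article.

```shell
# Hypothetical helper: map an asymmetric access state value (for example 0x01)
# to the name used in the list above.
alua_state_name() {
    # Shell arithmetic converts the 0x-prefixed hex value to decimal.
    case $(( $1 )) in
        0)  echo "Active/Optimized" ;;
        1)  echo "Active/Non-optimized" ;;
        2)  echo "Standby" ;;
        3)  echo "Unavailable" ;;
        15) echo "Transitioning between states" ;;
        [4-9]|1[0-4]) echo "Reserved" ;;
        *)  echo "Out of range"; return 1 ;;
    esac
}
```

For example, alua_state_name 0x01 prints Active/Non-optimized, matching the first state shown in the sg_rtpg output above.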
Once multipath.conf is set up correctly, the paths will show up something like the following:
mpath1 (360000000000000000000000000000001) dm-2 DGC,VRAID
[size=20G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
\_ round-robin 0 [prio=100][active]
 \_ 0:0:0:16 sdc 8:32  [active][ready]
 \_ 1:0:1:16 sdo 8:224 [active][ready]
\_ round-robin 0 [prio=50][enabled]
 \_ 0:0:1:16 sdg 8:96  [active][ready]
 \_ 1:0:0:16 sdk 8:160 [active][ready]
The handler is alua and the path priorities show active access via non-zero priority values. The actual priority values may be different - 50,20 or 50,10 for example, or other similar pairs of values.