How to make sure Oracle ASM devices point to multipath devices and not to SCSI paths (sd devices) when using ASMLib to manage ASM disks?
Environment
- Red Hat Enterprise Linux (RHEL) 5, 6, 7, 8
- device-mapper-multipath
- Oracle ASM using ASMLib
Issue
- Oracle application crashes when a single path in multipath fails. The application should be unaware of underlying path failures.
- ASM crashed, and the Oracle database crashed with it.
- When using Device Mapper Multipathing for an Oracle database, how can Oracle LUNs be made to use multipath devices rather than sd devices?
- How to make sure Oracle ASM devices point to multipath devices and not to SCSI paths (sd devices) when using ASMLib to manage ASM disks?
- I/O to the SAN is not shared across all the multipath paths on an Oracle application server configured with oracleasm.
Resolution
ORACLEASM_SCANORDER should be configured to force the use of the multipath pseudo-device. Since ASM uses entries from /proc/partitions, a filter must also be set to exclude the underlying paths.
- Edit /etc/sysconfig/oracleasm and add dm to SCANORDER, and sd to SCANEXCLUDE, as follows:
# ORACLEASM_SCANORDER: Matching patterns to order disk scanning
ORACLEASM_SCANORDER="dm"
# ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
ORACLEASM_SCANEXCLUDE="sd"
If you are using a 3rd party MPIO package, ORACLEASM_SCANORDER should be set to the corresponding device name used.
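As a quick sanity check of the settings above, a small shell helper can read the two variables back from the file. This is only a sketch: the function name check_scan_settings is hypothetical (it is not part of ASMLib), and it assumes the file holds plain shell-style NAME="value" assignments as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ASMLib): report whether an oracleasm
# sysconfig file lists "dm" in SCANORDER and "sd" in SCANEXCLUDE.
check_scan_settings() {
    conf_file="${1:-/etc/sysconfig/oracleasm}"

    # Take the last assignment of each variable and strip the quotes.
    order=$(sed -n 's/^ORACLEASM_SCANORDER=//p' "$conf_file" | tail -n 1 | tr -d '"')
    exclude=$(sed -n 's/^ORACLEASM_SCANEXCLUDE=//p' "$conf_file" | tail -n 1 | tr -d '"')

    status=0
    # Pad with spaces so "dm" and "sd" are matched as whole words.
    case " $order " in
        *" dm "*) echo "SCANORDER ok: $order" ;;
        *) echo "SCANORDER does not contain dm: '$order'"; status=1 ;;
    esac
    case " $exclude " in
        *" sd "*) echo "SCANEXCLUDE ok: $exclude" ;;
        *) echo "SCANEXCLUDE does not contain sd: '$exclude'"; status=1 ;;
    esac
    return $status
}
```

Calling check_scan_settings with no argument reads /etc/sysconfig/oracleasm; with a 3rd party MPIO package the expected pattern would differ, so the "dm" match would need adjusting.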
- This requires the oracleasm configuration to be updated:
# oracleasm configure
# oracleasm scandisks
- File /etc/sysconfig/oracleasm is soft-linked to /etc/sysconfig/oracleasm-_dev_oracleasm, which is the file used by OracleASM. Verify the soft-link exists.
# ls -al /etc/sysconfig/oracleasm
lrwxrwxrwx 1 root root 39 Feb 22 15:54 /etc/sysconfig/oracleasm -> /etc/sysconfig/oracleasm-_dev_oracleasm
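The same soft-link check can be scripted. The sketch below is illustrative only: check_softlink is a hypothetical function name, and it assumes a readlink that supports -f, as GNU coreutils on RHEL does.

```shell
#!/bin/sh
# Hypothetical helper: confirm that a path is a symbolic link and that
# it resolves to the expected target file.
check_softlink() {
    link_path="$1"
    expected_target="$2"

    if [ ! -L "$link_path" ]; then
        echo "$link_path is not a symbolic link"
        return 1
    fi
    # readlink -f resolves both sides to absolute, canonical paths.
    if [ "$(readlink -f "$link_path")" = "$(readlink -f "$expected_target")" ]; then
        echo "$link_path -> $expected_target"
    else
        echo "$link_path points elsewhere: $(readlink "$link_path")"
        return 1
    fi
}
```

For example: check_softlink /etc/sysconfig/oracleasm /etc/sysconfig/oracleasm-_dev_oracleasm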
- The oracleasm configuration file changes require a restart of the OracleASM service to take effect. This can be disruptive in a production environment.
Note: It is recommended to schedule a reboot after setting SCANORDER and SCANEXCLUDE in /etc/sysconfig/oracleasm, rather than only restarting the service. Normally a system reboot is not required for oracleasm to start using the multipath devices. However, (private) RHBZ#1683606 showed that while oracleasm was still allowed to detect single paths (before the configuration change and the restart of oracleasm), it could change the counters used in device structures within the kernel (block_device.bd_holders) to invalid (negative) values and make the paths appear to be in use. If this happens, restarting only oracleasm does not clear the counters, the devices continue to appear to be in use, and multipath remains unable to add the paths to the corresponding maps until the system is rebooted. Messages similar to the following then appear in the system logs whenever multipath tries to add one of those paths to the corresponding map:
device-mapper: table: 253:<dm_num>: multipath: error getting device
device-mapper: ioctl: error adding target to table
The problem can appear even while multipath is using the paths (i.e. the counters can be "silently" changed while the paths are in use by multipath). In such a scenario, the problem surfaces after an outage that causes the paths to be removed from the maps. When the paths return, multipath fails to add them to the corresponding maps.
For this reason, it is recommended to schedule a reboot after setting SCANORDER and SCANEXCLUDE in /etc/sysconfig/oracleasm.
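To see whether a host already shows this symptom, the system log can be searched for the two device-mapper messages quoted above. A minimal sketch follows; the function name count_dm_add_errors is hypothetical, and the default path /var/log/messages is an assumption about where kernel messages are logged on the host.

```shell
#!/bin/sh
# Hypothetical helper: count log lines carrying either device-mapper
# message associated with paths that multipath cannot add to a map.
count_dm_add_errors() {
    # Assumed default log location; pass another file as the first argument.
    log_file="${1:-/var/log/messages}"
    grep -c -E 'multipath: error getting device|ioctl: error adding target to table' "$log_file"
}
```

A non-zero count only indicates that the messages were logged at some point. It does not by itself prove that the counters described above are the cause.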
- Once restarted, verify that the multipath device is being used. A major number of 253 should be returned:
# oracleasm querydisk -d <ASM_DISK_NAME>
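The major number can also be extracted from the querydisk output automatically. The sketch below assumes the output ends with a bracketed major,minor pair, as in the example under Diagnostic Steps; the function names querydisk_major and uses_multipath_major are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "oracleasm querydisk -d" output on stdin and
# print the major number found inside the bracketed major,minor pair.
querydisk_major() {
    sed -n 's/.*\[ *\([0-9][0-9]*\) *, *[0-9][0-9]* *\].*/\1/p' | head -n 1
}

# Succeeds only when the major number equals the expected value
# (253 by default, the major number named in this article).
uses_multipath_major() {
    expected="${1:-253}"
    major=$(querydisk_major)
    [ -n "$major" ] && [ "$major" -eq "$expected" ]
}
```

For example: oracleasm querydisk -d <ASM_DISK_NAME> | uses_multipath_major && echo "multipath device in use"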
- Refer to Oracle documentation (Doc ID 868352.1: ASMLib Configuration File "/etc/sysconfig/oracleasm" Not Effective) for further information.
Root Cause
When devices were added to the DISKGROUP, the underlying sd* device was used instead of the multipath pseudo device.
The dm-* devices are intended for internal use and their names are not persistent. However, creating the DISKGROUP writes metadata to the device, so ASM can identify the disk from its header regardless of which dm-* name is assigned. The intention here is to force ASM to read from multipath devices.
Diagnostic Steps
- Query the disk to obtain the major:minor number of the disk being used by the disk group:
# /etc/init.d/oracleasm querydisk -d ASM_DATA1
Disk "ASM_DATA1" is a valid ASM disk on device [8,16]
- We can see that 8:16 is the underlying sdb path, not the ASM_DATA1 multipath pseudo device. Failover would not occur with this configuration.
ASM_DATA1 (3600500000000000001) dm-24 IBM,2107900
[size=100G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 3:0:1:1 sdb 8:16 [failed][faulty]
 \_ 5:0:0:1 sdc 8:32 [active][ready]
 \_ 5:0:1:1 sdd 8:48 [active][ready]
 \_ 3:0:0:1 sde 8:64 [failed][faulty]
- This can also be seen in /proc/partitions:
   8     0  142577664 sda
   8     1     514048 sda1
   8     2   24579450 sda2
   8     3   12289725 sda3
   8    16   52428800 sdb
   8    32   52428800 sdc
   8    48   52428800 sdd
   8    64   52428800 sde
- The major:minor of the multipath ASM_DATA1 pseudo device would be 253:24, or dm-24. This is the device that should be used:
 253    24   52428800 dm-24
- Note: To check if an oracleasm device is mapped correctly in a vmcore, please see How to map an oracleasm path in a vmcore.
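The lookup from a major:minor pair to a device name can be scripted against /proc/partitions. This is a sketch under the assumption that the file keeps its usual four columns (major, minor, #blocks, name); the function name device_for is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the device name that a /proc/partitions-style
# file lists for a given major and minor number.
device_for() {
    major="$1"
    minor="$2"
    table="${3:-/proc/partitions}"
    # Columns are: major, minor, #blocks, name.
    awk -v maj="$major" -v min="$minor" '$1 == maj && $2 == min { print $4 }' "$table"
}
```

With the values from the example above, device_for 8 16 would print sdb (a single path), while device_for 253 24 would print dm-24 (the multipath pseudo device).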