How to set a permanent I/O scheduler on one or more specific devices using udev

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 9
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 7

Issue

  • How to persistently set the I/O scheduler on one or more specific disks

Resolution

As with most things in Linux, there are several ways to assign a permanent I/O scheduler to one or more particular devices. Tuned profiles and udev rules are two options that accomplish the same task.

Using customized UDEV rules

Step 1: Find the device's identifier(s)

In order to target a device, a persistent disk identifier needs to be located. Device names such as sdN are not guaranteed to be persistent across reboots. When using multipath devices, starting with RHEL 6, only the scheduler of the DM device (i.e. /sys/block/dm-X/queue/scheduler) is applicable and takes effect.

The kernel references a disk's persistent identifier as its scsi_id, but the main source of the scsi_id value is the WWID (World Wide IDentifier) returned by the disk. Such identifiers are self-identifying: the device itself provides this information, so regardless of whether it was assigned the name sda or sdzzz, the same device will return the same WWID value.

There are two methods of finding a suitable persistent disk identifier:

  • use udevadm, and
  • use scsi_id command.

The preferred method is to use the udevadm command, as this avoids having to spawn an external command for each udev rule just to find the disk identifier. Instead, the udev rule can reference an entry that udev can access directly in its internal database, which avoids unnecessary overhead.

udevadm

Use the udevadm info command to retrieve a specific disk's udev information. Alternatively, use the udevadm info --export-db command to retrieve all database entries from udev.

In the following example, the device's WWID value is 600508b1001ca98d5d765bea5a0dd3fa. It appears in a number of fields, including ID_SERIAL. Note that a 3 character is prepended to the WWID in ID_SERIAL (and in other places, such as multipath). This additional character defines the type and source of the rest of the device's scsi_id string. See "How are SCSI ID generated?" for additional information on this topic.

Several udev fields in the output below contain the device's identifier and could be used within the udev rules that follow. The recommended field is ID_SERIAL, which holds the same value as returned by the scsi_id command. Using the udev field lookup syntax in udev rules avoids forking a separate program just to determine a WWID value that is already directly accessible within udev.

  • For SCSI devices, recommended practice is to use ID_SERIAL,
  • For NVMe devices, recommended practice is to use ID_WWN,
  • For DASD devices, recommended practice is to use ID_UID, and
  • For MPATH devices, recommended practice is to use DM_UUID.

The following assumes we're dealing with a sdN (SCSI disk) device.

# udevadm info --query=all --name=sdf | egrep "P: |N: |E: ID_|E: SCSI_IDENT" | egrep "P: |N: |SCSI|SERIAL|WWN|NAA"
P: /devices/pci0000:00/0000:00:01.0/0000:04:00.0/host0/target0:1:0/0:1:0:0/block/sdf
N: sdf
E: ID_SCSI=1
E: ID_SCSI_INQUIRY=1
E: ID_SCSI_SERIAL=50014380212E90E0
E: ID_SERIAL=3600508b1001ca98d5d765bea5a0dd3fa      <<== what scsi_id command returns
E: ID_SERIAL_SHORT=600508b1001ca98d5d765bea5a0dd3fa
E: ID_WWN=0x600508b1001ca98d
E: ID_WWN_VENDOR_EXTENSION=0x5d765bea5a0dd3fa
E: ID_WWN_WITH_EXTENSION=0x600508b1001ca98d5d765bea5a0dd3fa
E: SCSI_IDENT_LUN_NAA_REGEXT=600508b1001ca98d5d765bea5a0dd3fa
E: SCSI_IDENT_LUN_VENDOR=00000000
E: SCSI_IDENT_SERIAL=50014380212E90E0
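
When only one field is needed from output like the above, a small filter can pull it out. The following is a minimal sketch: udev_prop is a hypothetical helper name (not part of udev), and the sample text is a shortened copy of the output shown above.

```shell
#!/bin/sh
# udev_prop is a hypothetical helper, not part of udev: it reads
# `udevadm info` output on stdin and prints the value of one "E:" property.
udev_prop() {
    sed -n "s/^E: $1=//p"
}

# Sample lines copied from the output above; on a live system, pipe in
# `udevadm info --query=all --name=sdf` instead.
sample='N: sdf
E: ID_SERIAL=3600508b1001ca98d5d765bea5a0dd3fa
E: ID_SERIAL_SHORT=600508b1001ca98d5d765bea5a0dd3fa
E: ID_WWN=0x600508b1001ca98d'

printf '%s\n' "$sample" | udev_prop ID_SERIAL
```

Because the pattern is anchored on the full key followed by "=", asking for ID_SERIAL does not also return ID_SERIAL_SHORT.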

scsi_id

The scsi_id command retrieves the same value that is stored in the udev ID_SERIAL field.

# /lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sdf
3600508b1001ca98d5d765bea5a0dd3fa
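
To illustrate how the returned string relates to the WWID, the sketch below splits the sample value into its leading type character and the remainder. It only demonstrates the layout of this one sample value; the variable names are illustrative.

```shell
#!/bin/sh
# Value returned by scsi_id (and stored in ID_SERIAL) in the example above.
id_serial="3600508b1001ca98d5d765bea5a0dd3fa"

wwid="${id_serial#?}"            # drop the first character
prefix="${id_serial%"$wwid"}"    # keep only the character that was dropped

echo "prefix=$prefix"
echo "wwid=$wwid"
```

For this sample, the remainder is the same string that udev reports in ID_SERIAL_SHORT.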

Step 2: Create the udev rules

A basic udev rule defines when it should be triggered, which devices it applies to, and finally what action to apply. Since changing a scheduler is a low-priority change, create the rule in a file such as /etc/udev/rules.d/99-ioscheduler.rules.

  • ENV{ ... } refers to a udev "E: " parameter, so ENV{ID_SERIAL} refers to the "E: ID_SERIAL=3600508b1001ca98d5d765bea5a0dd3fa" field above. The "==" is a comparison operation.
  • ATTR{ ... } refers to a sysfs attribute rooted at the device, so ATTR{queue/scheduler} refers to /sys/block/sdf/queue/scheduler. Because it is used with "=", the rule assigns a value to the attribute.

When udev processes an event, it matches the event against all applicable rules, reading the rules files in lexical order of file name (00-*.rules through 99-*.rules).
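
The ordering can be previewed by sorting file names lexically, as in this small sketch. Only 99-ioscheduler.rules comes from this article; the other two names are made up for illustration.

```shell
#!/bin/sh
# Illustrative file names; only 99-ioscheduler.rules is taken from this article.
ordered=$(printf '%s\n' \
    99-ioscheduler.rules \
    60-example-storage.rules \
    10-example-dm.rules | LC_ALL=C sort)

# The 99-* file sorts last, so its rules are read after the others.
printf '%s\n' "$ordered"
```

Reading the scheduler rule last means it runs after earlier rules files have had the chance to set properties such as ID_SERIAL.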

udevadm


# Only handle add and change udev events
ACTION!="add|change", GOTO="iosched_rule_end"
# Skip anything that is not a whole disk, so the filters below are not evaluated for it
ENV{DEVTYPE}!="disk", GOTO="iosched_rule_end"
#
# Add device filters :: action here for device(s)
ENV{ID_SERIAL}=="3600508b1001ca98d5d765bea5a0dd3fa", ATTR{queue/scheduler}="kyber"
#
LABEL="iosched_rule_end"

  • If the current udev event is not an add or change event, the rules are skipped by jumping directly to the iosched_rule_end label.
  • If the current udev event is not for a disk, jump directly to the iosched_rule_end label.
  • If the disk associated with the current udev event has an "E: ID_SERIAL" value that matches the one in the rule, then
    • Apply the action: change the device's "queue/scheduler" to the newly supplied value.

You can add additional devices by replicating the filter:action line and changing the ID_SERIAL value to match each additional device.
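
When several devices are involved, the replicated lines can be generated instead of typed by hand. The following is a rough sketch only: gen_iosched_rules is a hypothetical helper name, and the second serial number is a made-up placeholder.

```shell
#!/bin/sh
# gen_iosched_rules is a hypothetical helper: the first argument is the
# scheduler name, the remaining arguments are ID_SERIAL values.
gen_iosched_rules() {
    sched="$1"
    shift
    echo 'ACTION!="add|change", GOTO="iosched_rule_end"'
    echo 'ENV{DEVTYPE}!="disk", GOTO="iosched_rule_end"'
    # One filter:action line per device
    for serial in "$@"; do
        printf 'ENV{ID_SERIAL}=="%s", ATTR{queue/scheduler}="%s"\n' "$serial" "$sched"
    done
    echo 'LABEL="iosched_rule_end"'
}

# The second serial below is a made-up placeholder, not a real device.
gen_iosched_rules kyber \
    3600508b1001ca98d5d765bea5a0dd3fa \
    3600508b1001c00000000000000000001
```

The output is written to stdout so it can be reviewed before being saved to a rules file such as /etc/udev/rules.d/99-ioscheduler.rules.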

scsi_id


The same result can be achieved, less efficiently, by calling the scsi_id command from within the rule. It has the following syntax:

ACTION=="add|change", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id --whitelisted --replace-whitespace --device=%N", RESULT=="3600508b1001ca98d5d765bea5a0dd3fa", ATTR{queue/scheduler}="kyber"

In this case the following filters are applied:

  • If the current udev event action is not add or change, skip.
  • If the current udev event's device is not part of the block subsystem, skip.
  • Run the scsi_id program (the %N is the device name), and if the returned string does not match the value in RESULT, skip.
  • Otherwise, assign kyber to the device's sysfs attribute /sys/.../queue/scheduler.

This is a slightly different, all-in-one form of a udev rule, whereas the previous example used filter:goto lines to skip over rule lines.

Step 3: Invoke the updated udev rules

Reload the udev rules and trigger change events so that the newly added rules are picked up and applied.

# /sbin/udevadm control --reload-rules
# /sbin/udevadm trigger --type=devices --action=change
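
The trigger above replays change events for all devices. As a hedged alternative, the sketch below limits the trigger to a single block device by using the --subsystem-match and --sysname-match options of udevadm trigger. reapply_rules is a hypothetical wrapper name; passing echo as the second argument only prints the commands.

```shell
#!/bin/sh
# reapply_rules is a hypothetical wrapper. Pass "echo" as the second
# argument to print the commands instead of running them.
reapply_rules() {
    dev="$1"
    run="${2:-}"
    # $run is intentionally unquoted: when empty it disappears and the
    # udevadm commands run directly (root privileges required).
    $run udevadm control --reload-rules
    $run udevadm trigger --action=change --subsystem-match=block --sysname-match="$dev"
}

# Preview only; drop the trailing "echo" to actually apply.
reapply_rules sdf echo
```

Limiting the trigger to one device avoids replaying change events for every device on a busy system while a rule is still being tested.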

Step 4: Verify the scheduler changes

Verify that the new scheduler was applied.

# cat /sys/block/sdf/queue/scheduler 
mq-deadline [kyber] bfq none

To retest, manually set the scheduler to a different value and repeat steps 3 and 4.
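
The active scheduler is the entry shown in square brackets. For scripted checks, a filter along these lines can extract it; active_sched is a hypothetical helper name, and the sample line is copied from the output above.

```shell
#!/bin/sh
# active_sched is a hypothetical helper: it prints the bracketed entry
# from a queue/scheduler line read on stdin.
active_sched() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Sample line from the output above; on a live system use:
#   active_sched < /sys/block/sdf/queue/scheduler
echo "mq-deadline [kyber] bfq none" | active_sched
```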


Examples

Example 1: RHEL 7

Change a multipath device from the current default I/O scheduler, deadline, to noop.


# multipath -ll
:
36001405432339bfc0c64e88af039d828 dm-16 LIO-ORG ,ram1
size=10G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:26 sdac 65:192 active ready running
  |- 2:0:0:26 sdai 66:32 active ready running
  |- 1:0:1:26 sdks 67:256 active ready running
  `- 2:0:1:26 sdki 66:352 active ready running

# cat /sys/block/dm-16/queue/scheduler
noop [deadline] cfq

# Step 1: find the udev identifier to use within the rule
# - choose one that has extended scsi_id format of (3)WWID text string
# udevadm info --export-db | grep "N: dm-16$" -B 1 -A32
P: /devices/virtual/block/dm-16
N: dm-16
L: 10
S: disk/by-id/dm-name-36001405432339bfc0c64e88af039d828
S: disk/by-id/dm-uuid-mpath-36001405432339bfc0c64e88af039d828
S: mapper/36001405432339bfc0c64e88af039d828
E: DEVLINKS=/dev/disk/by-id/dm-name-36001405432339bfc0c64e88af039d828 /dev/disk/by-id/dm-uuid-mpath-36001405432339bfc0c64e88af039d828 /dev/mapper/36001405432339bfc0c64e88af039d828
E: DEVNAME=/dev/dm-16
E: DEVPATH=/devices/virtual/block/dm-16
E: DEVTYPE=disk
E: DM_ACTIVATION=0
E: DM_NAME=36001405432339bfc0c64e88af039d828
E: DM_SUBSYSTEM_UDEV_FLAG0=1
E: DM_SUSPENDED=0
E: DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG=1
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_UDEV_RULES_VSN=2
E: DM_UUID=mpath-36001405432339bfc0c64e88af039d828
E: MAJOR=253
E: MINOR=16
E: MPATH_SBIN_PATH=/sbin
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=152465

:
.
# Step 2: add udev rule to new or existing udev *.rules file
# vi /etc/udev/rules.d/99-ioscheduler.rules
ACTION!="add|change", GOTO="iosched_rule_end"
ENV{DEVTYPE}!="disk", GOTO="iosched_rule_end"

ENV{DM_UUID}=="mpath-36001405432339bfc0c64e88af039d828", ATTR{queue/scheduler}="noop"

LABEL="iosched_rule_end"

# Step 3: reload/invoke updated udev rules
# /sbin/udevadm control --reload-rules
# /sbin/udevadm trigger --type=devices --action=change

# Step 4: verify the change was applied
# cat /sys/block/dm-16/queue/scheduler
[noop] deadline cfq

Example 2: RHEL 8

Change the current sdf device's I/O scheduler from 'mq-deadline' to 'kyber'

Step 1: Find the device's identifier:

# /lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sdf
1IET_00020002

# multipath -ll
mpathd (1IET_00020002) dm-4 IET,VIRTUAL-DISK
size=3.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 4:0:0:2 sdd 8:48 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 5:0:0:2 sdf 8:80 active ready running

Step 2: Configure udev rule as follows:

# cat /etc/udev/rules.d/99-scheduler.rules
ACTION=="add|change", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id --whitelisted --replace-whitespace --device=%N", RESULT=="1IET_00020002", ATTR{queue/scheduler}="kyber"

Step 3: Reload the udev rule:

# /sbin/udevadm control --reload-rules
# /sbin/udevadm trigger --type=devices --action=change

Step 4: Verify the scheduler:

# cat /sys/block/sdf/queue/scheduler
mq-deadline [kyber] bfq none

Example 3: RHEL 8

Change or set the I/O scheduler for all NVMe devices at once.

Step 1: Retrieve the device's udev information:

# udevadm info --query=all --name=nvme0n1 | egrep "P: |N: |E: DEV"
P: /devices/pci0000:85/0000:85:00.0/0000:86:00.0/nvme/nvme0/nvme0n1
N: nvme0n1
E: DEVLINKS=/dev/disk/by-id/nvme-eui.34465a304d4001880025384100000001 /dev/disk/by-path/pci-0000:86:00.0-nvme-1 /dev/disk/by-id/nvme-MO001600KWVNB_S4FZNA0M400188
E: DEVNAME=/dev/nvme0n1
E: DEVPATH=/devices/pci0000:85/0000:85:00.0/0000:86:00.0/nvme/nvme0/nvme0n1
E: DEVTYPE=disk

Step 2: Configure udev rule as follows:

# cat /etc/udev/rules.d/98-ioshed.rules
ACTION!="add|change", GOTO="iosched_rule_end"
ENV{DEVTYPE}!="disk", GOTO="iosched_rule_end"

ENV{DEVTYPE}=="disk", ENV{DEVNAME}=="/dev/nvme*", ATTR{queue/scheduler}="kyber"

LABEL="iosched_rule_end"

Step 3: Reload the udev rule:

# /sbin/udevadm control --reload-rules
# /sbin/udevadm trigger --type=devices --action=change

Step 4: Verify the scheduler:

$ cat /sys/block/nvme*/queue/scheduler
mq-deadline [kyber] bfq none
mq-deadline [kyber] bfq none
mq-deadline [kyber] bfq none
mq-deadline [kyber] bfq none
mq-deadline [kyber] bfq none
mq-deadline [kyber] bfq none
mq-deadline [kyber] bfq none


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.