I/O failure seen in KVM when using SCSI passthrough devices on RHEL 8
Environment
- Red Hat Enterprise Linux 8
- KVM configuration using SCSI passthrough. For example:

      <disk type='block' device='lun'>
        <driver name='qemu' type='raw'/>
        <source dev='/dev/mapper/test' index='1'>
          <reservations managed='yes'>
            <source type='unix' path='/var/lib/libvirt/qemu/domain-1-test/test0.sock' mode='client'/>
          </reservations>
          <privateData>
            <nodenames>
              <nodename type='storage' name='libvirt-1-storage'/>
              <nodename type='format' name='libvirt-1-format'/>
            </nodenames>
            <reservations mgralias='test0'/>
          </privateData>
        </source>
        <backingStore/>
        <target dev='sda' bus='scsi'/>
        <alias name='scsi0-0-0-0'/>
        <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        <privateData>
          <qom name='scsi0-0-0-0'/>
        </privateData>
      </disk>
Issue
- I/O failure seen in KVM when using SCSI passthrough devices:

      kernel: sd 0:0:0:0: [sda] tag#54 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
      kernel: sd 0:0:0:0: [sda] tag#54 Sense Key : Illegal Request [current]
      kernel: sd 0:0:0:0: [sda] tag#54 Add. Sense: Invalid field in cdb
      kernel: sd 0:0:0:0: [sda] tag#54 CDB: Write(10) 2a 00 07 9b 08 00 00 0a 00 00
      kernel: blk_update_request: critical target error, dev sda, sector 127600640 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 0
      kernel: EXT4-fs warning (device dm-3): ext4_end_bio:323: I/O error 5 writing to inode 14 (offset 16777216 size 8388608 starting block 221184)
      ....
      kernel: sd 2:0:0:12: [sdn] tag#3214 Send: scmd 0x00000000f30af6e1
      kernel: sd 2:0:0:12: [sdn] tag#3214 CDB: Test Unit Ready 00 00 00 00 00 00
      kernel: sd 2:0:0:12: [sdn] tag#3214 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
      kernel: sd 2:0:0:12: [sdn] tag#3214 CDB: Test Unit Ready 00 00 00 00 00 00
      kernel: sd 2:0:0:12: [sdn] tag#3214 scsi host busy 2 failed 0
      kernel: sd 2:0:0:12: Notifying upper driver of completion (result 0)
      kernel: sd 2:0:0:12: [sdn] tag#3214 0 sectors total, 0 bytes done.

- Devices showing a max_sectors_kb of 1280:

      [root@host]# cat /sys/block/sdc/queue/max_sectors_kb
      1280
Resolution
- Update the system to Red Hat Enterprise Linux 8.6. This can change the max_sectors_kb propagation and allow it to be set to the correct, usable value.
- If updating does not resolve the issue, it can be resolved by manually setting the max_sectors_kb value using a udev rule. For example:

      ACTION=="add|change", KERNEL=="sd*", ATTR{queue/max_sectors_kb}="1024"

- Further information on udev and max_sectors_kb is below. If assistance is needed in writing a udev rule, please reach out to Red Hat Support.
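As a sketch of how such a rule might be put in place without a reboot (the rule file name `99-max-sectors.rules` is an assumption, not from the article; any name ending in `.rules` under `/etc/udev/rules.d/` works):

```shell
# Assumed file name; install the rule capping max_sectors_kb for sd* disks.
cat > /etc/udev/rules.d/99-max-sectors.rules <<'EOF'
ACTION=="add|change", KERNEL=="sd*", ATTR{queue/max_sectors_kb}="1024"
EOF

# Reload udev rules and re-trigger change events on block devices so the
# new cap is applied to already-present disks.
udevadm control --reload-rules
udevadm trigger --subsystem-match=block --action=change

# Verify the limit on an affected device.
cat /sys/block/sda/queue/max_sectors_kb
```

Because the rule matches `ACTION=="add|change"`, it also reapplies the cap to disks that appear later (for example after a rescan or reboot).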
Root Cause
- In some cases, when using SCSI passthrough devices, the sd* devices can end up with a max_sectors_kb of 1280. This causes large I/O to fail the SG_IO limit checks and be rejected.
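A quick way to spot potentially affected disks is to scan sysfs for sd* devices whose max_sectors_kb exceeds 1024; a minimal sketch (the `over_limit` helper is hypothetical, not part of the article):

```shell
# Hypothetical helper: $1 is a max_sectors_kb value read from sysfs.
# Values above 1024 kB are the ones seen failing SG_IO limit checks.
over_limit() {
    [ "$1" -gt 1024 ]
}

for dev in /sys/block/sd*; do
    [ -e "$dev" ] || continue   # skip if no sd* disks are present
    kb=$(cat "$dev/queue/max_sectors_kb")
    if over_limit "$kb"; then
        echo "${dev##*/}: max_sectors_kb=$kb (may fail SG_IO checks)"
    fi
done
```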
Diagnostic Steps
Note that the max_sectors_kb values from the sosreports are all 1280, and that in the messages the failed CDBs request transfers >1024 kB and <=1280 kB.
./sosreport-host1/sys/devices/pci0000:00/0000:00:08.0/virtio6/host0/target0:0:0/0:0:0:0/block/sda/queue/max_sectors_kb:1280
./sosreport-host1/sys/de.../block/vdb/queue/max_sectors_kb:1280
./sosreport-host1/sys/de../block/vda/queue/max_sectors_kb:1280
./sosreport-host1/sys/de../block/vdc/queue/max_sectors_kb:1280
./sosreport-host2/sys/d../block/sdae/queue/max_sectors_kb:1280
./sosreport-host2/sys/d../block/sdan/queue/max_sectors_kb:1280
./sosreport-host2/sys/d../block/sdam/queue/max_sectors_kb:1280
...
./sosreport-host2/sys/d../block/sdn/queue/max_sectors_kb:1280
...
./sosreport-host2/sys/d../block/sdr/queue/max_sectors_kb:1280
./sosreport-host2/sys/d../block/sdz/queue/max_sectors_kb:1280
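The transfer size of the failing command can be read straight off the Write(10) CDB quoted in the Issue section: bytes 7-8 carry the transfer length in logical blocks. A sketch of the arithmetic, assuming 512-byte logical blocks (an assumption, though typical for these devices):

```shell
# Failing CDB from the messages: Write(10) 2a 00 07 9b 08 00 00 0a 00 00
# Bytes 7-8 (0a 00) are the transfer length in logical blocks.
blocks=$((0x0a00))             # 2560 blocks
kb=$((blocks * 512 / 1024))    # assuming 512-byte logical blocks
echo "transfer length: $blocks blocks = $kb kB"   # 1280 kB
```

At 1280 kB, this command is over the 1024 kB that passes the host's checks, consistent with the >1024 and <=1280 pattern noted above.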
- Now from other nodes:

      # Some node from the cluster, which was rebooted. Shows the issue.
      root@host3:/home/user > cat /sys/block/sda/queue/max_sectors_kb
      1280
      # Host running on updated KVM RHEL 8.6 - seemingly fixed by the update of the hypervisor to 8.6.
      root@host4:/home/user > cat /sys/block/sda/queue/max_sectors_kb
      1024
      # Host running KVM 8.5 but guest OS on RHEL 7.7 - no issues seen.
      root@host5:/home/user > cat /sys/block/sda/queue/max_sectors_kb
      512

- The errors themselves are likely a side effect of false segment merging in the VM guest. The host's SAN storage only allows 256 segments per command, a fairly typical limit, and the guest's limit is even lower.

      ./sys/devices/pci0000:85/0000:85:00.0/0000:86:00.1/host4/rport-4:0-5/target4:0:1/4:0:1:12/block/sdz/queue/max_segments:256
      ./sys/devices/pci0000:85/0000:85:00.0/0000:86:00.1/host4/rport-4:0-5/target4:0:1/4:0:1:11/block/sdy/queue/max_segments:256
      ./sys/devices/pci0000:85/0000:85:00.0/0000:86:00.1/host4/rport-4:0-5/target4:0:1/4:0:1:10/block/sdx/queue/max_segments:256
      ./sys/devices/pci0000:85/0000:85:00.0/0000:86:00.1/host4/rport-4:0-5/target4:0:1/4:0:1:9/block/sdw/queue/max_segments:256
      ...

- With a max_sectors_kb of 1024, a maximal I/O fits in 256 segments of 4 kB pages. With a larger max_sectors_kb value, merging would need to combine multiple pages into a segment to stay within the segment limit. But the VM guest's view of contiguous pages does not match the reality of the host's memory: pages the guest believes are contiguous and mergeable are likely not contiguous in the host. And since the host's iommu was in passthrough mode, iommu support will not merge pages into a combined segment. The attempt to map the command's data in the host can thus be rejected for having too many segments.
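The arithmetic above can be sketched directly, assuming 4 kB pages:

```shell
# max_sectors_kb=1024: a maximal I/O covers 1024/4 pages, so it fits the
# 256-segment limit even with no page merging at all.
echo $((1024 / 4))   # 256

# max_sectors_kb=1280: a maximal I/O covers 1280/4 pages, so at least 64
# page merges must succeed to stay within 256 segments -- merges the host
# may reject because guest-contiguous pages are not host-contiguous.
echo $((1280 / 4))   # 320
```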
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.