An I/O storage error occurs while writing to a GFS2 filesystem journal and a withdraw is not triggered on RHEL 6, 7
Environment
- Red Hat Enterprise Linux Server 6, 7 (with the High Availability Add On and Resilient Storage Add Ons)
- A Global Filesystem 2 (gfs2) filesystem
Issue
- An I/O storage error occurs while writing to a GFS2 filesystem journal and a withdraw is not triggered on RHEL 6, 7
Resolution
Red Hat Enterprise Linux 6
- The issue (bz1505956) has been resolved with errata RHSA-2018:1854 with the following package(s): kernel-2.6.32-754.el6 or later.
Red Hat Enterprise Linux 7
- The issue (bz1429547) has been resolved with errata RHSA-2018:1062 with the following package(s): kernel-3.10.0-862.el7 or later.
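To check whether a node is already running a kernel at or above the fixed builds listed above, the comparison can be sketched in Python. This is a minimal sketch, not a Red Hat tool: the function names `parse_release` and `has_fix` are hypothetical, and it compares only the base version and the leading build number (for example 862), which is a simplification of full RPM version comparison.

```python
import re

# Fixed kernel builds, taken from the errata listed in this article.
FIXED = {
    "el6": ((2, 6, 32), 754),
    "el7": ((3, 10, 0), 862),
}


def parse_release(release):
    """Split a release string such as '3.10.0-862.el7.x86_64' into
    (version tuple, leading build number, dist tag)."""
    m = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)((?:\.\d+)*)\.(el\d+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    version = tuple(int(x) for x in m.group(1, 2, 3))
    build = int(m.group(4))
    return version, build, m.group(6)


def has_fix(release):
    """Return True if the release is at or above the fixed build for its
    RHEL major version. Assumption: only the leading build number is
    compared, so any z-stream suffix (e.g. '.3.2') is ignored."""
    version, build, dist = parse_release(release)
    if dist not in FIXED:
        raise ValueError("no fixed build known for %r" % dist)
    return (version, build) >= FIXED[dist]
```

On a cluster node, `has_fix(platform.release())` (after `import platform`) applies the check to the running kernel. A result of `False` only means the build number is lower than the one named in the errata; the errata itself remains the authoritative reference.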
We recommend running fsck.gfs2 on the gfs2 filesystem using the latest version from gfs2-utils. Make sure that the filesystem is unmounted on all cluster nodes before running it.
Root Cause
When an error occurs while writing to the gfs2 journal, there is no error checking to trigger a withdraw if the journal cannot be written to. A withdraw should occur if the storage device for the filesystem cannot be read from or written to (which includes writing to the journal log).
For example, an IO error occurs while writing to the journal:
Mar 2 14:50:23 node42 kernel: device-mapper: multipath: Failing path 8:80.
Mar 2 14:50:23 node42 kernel: device-mapper: multipath: Failing path 8:96.
Mar 2 14:50:23 node42 kernel: blk_update_request: I/O error, dev dm-2, sector 0
Mar 2 14:50:23 node42 multipathd: sdf: mark as failed
Mar 2 14:50:23 node42 multipathd: mpathb: remaining active paths: 1
Mar 2 14:50:24 node42 kernel: blk_update_request: I/O error, dev dm-2, sector 370800
The journal write fails with error -5 (EIO), but the filesystem is not withdrawn:
Mar 2 14:50:24 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: Error -5 writing to log
Mar 2 14:50:24 node42 kernel: blk_update_request: I/O error, dev dm-2, sector 5901834184
Mar 2 14:50:24 node42 kernel: Buffer I/O error on dev dm-4, logical block 737729017, lost async page write
Mar 2 14:50:24 node42 kernel: blk_update_request: I/O error, dev dm-2, sector 370808
Mar 2 14:50:24 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: Error -5 writing to log
Mar 2 14:50:24 node42 kernel: blk_update_request: I/O error, dev dm-2, sector 0
Mar 2 14:50:24 node42 kernel: blk_update_request: I/O error, dev dm-2, sector 370816
Mar 2 14:50:24 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: Error -5 writing to log
Eventually a withdraw is triggered the next time the filesystem is read from or written to:
Mar 2 15:09:44 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: fatal: invalid metadata block
Mar 2 15:09:44 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: bh = 151977123 (magic number)
Mar 2 15:09:44 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 437
Mar 2 15:09:44 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: about to withdraw this file system
Mar 2 15:09:44 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: telling LM to unmount
Mar 2 15:09:44 node42 kernel: GFS2: fsid=cluster1:mygfs2fs.1: withdrawn
The specific error reported when the withdraw eventually occurs can vary.
Diagnostic Steps
- Review the /var/log/messages file for a storage event affecting the gfs2 filesystem, followed by I/O errors writing to the log and a later withdraw.
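The log review above can be sketched as a small Python scan. This is an illustrative sketch only: the function name `scan` is hypothetical, and the two patterns are taken directly from the example messages in this article, so other kernel versions may word these messages differently.

```python
import re

# Patterns copied from the example log lines in this article.
LOG_ERROR = re.compile(r"GFS2: fsid=(\S+): Error -5 writing to log")
WITHDRAWN = re.compile(r"GFS2: fsid=(\S+): withdrawn")


def scan(lines):
    """Return {fsid: {"log_errors": count, "withdrawn": bool}} for every
    GFS2 filesystem that logged a journal write error or a withdraw."""
    result = {}
    for line in lines:
        m = LOG_ERROR.search(line)
        if m:
            entry = result.setdefault(
                m.group(1), {"log_errors": 0, "withdrawn": False})
            entry["log_errors"] += 1
            continue
        m = WITHDRAWN.search(line)
        if m:
            entry = result.setdefault(
                m.group(1), {"log_errors": 0, "withdrawn": False})
            entry["withdrawn"] = True
    return result
```

For example, `with open("/var/log/messages") as f: print(scan(f))` summarises the file. A filesystem with a non-zero `log_errors` count and `withdrawn` still `False` matches the behaviour described in this article: the journal write failed but no withdraw has happened yet.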