Unmounting a file system that is mounted on a subdirectory of a gfs2 filesystem will fail in RHEL 6 Resilient Storage clusters

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the Resilient Storage Add On
  • A file system of any type mounted on a subdirectory of a Global File System 2 (gfs2) mountpoint
  • kernel releases starting with 2.6.32-431.20.1.el6 up to (but not including) 2.6.32-504.1.3.el6
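
To check a node against the kernel range above, a minimal sketch follows. It assumes GNU `sort -V` is available for version ordering; the helper names `version_lt` and `kernel_affected` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: the range boundaries are copied from the Environment list;
# the function names are hypothetical helpers, not part of any Red Hat tool.

# version_lt A B: true when A sorts strictly before B under GNU "sort -V".
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# kernel_affected RELEASE: true when RELEASE is at or after the first
# affected release and before the first unaffected one.
kernel_affected() {
    ka_rel="${1%.el6*}"   # drop ".el6" plus any arch suffix such as ".x86_64"
    ! version_lt "$ka_rel" "2.6.32-431.20.1" &&
        version_lt "$ka_rel" "2.6.32-504.1.3"
}

if kernel_affected "$(uname -r)"; then
    echo "running kernel is in the affected range"
else
    echo "running kernel is outside the affected range"
fi
```

The comparison is purely numeric on the release string, so it is only meaningful for RHEL 6 kernels.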

Issue

  • Unmounting a file system that is mounted on a subdirectory of a GFS2 file system fails.
  • I have a file system mounted on a subdirectory of GFS2, and umount fails with "not found", but seems to leave the device mounted.
# mount | grep gfs2
/dev/dm-2 /mnt/a gfs2 rw,seclabel,relatime,hostdata=jid=0 0 0
/dev/dm-3 /mnt/a/b gfs2 rw,seclabel,relatime,hostdata=jid=0 0 0
# umount /mnt/a/b 
umount: /mnt/a/b: not found
# echo $?
1

Resolution

Workarounds:
  • Use a kernel release prior to 2.6.32-431.20.1.el6
  • If possible, do not mount any file systems on a subdirectory of a gfs2 file system
  • If the issue has already occurred, reboot the affected node to recover and make the file system usable again.
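
To find mounts that would be exposed to this problem, one possible approach is to scan the kernel mount table for mount points that sit below a gfs2 mount point. The sketch below assumes the usual `/proc/mounts` field order (device, mount point, type); `gfs2_submounts` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: gfs2_submounts is a hypothetical helper for illustration.

# gfs2_submounts TABLE: print "<submount> <gfs2 parent>" for every mount
# point in TABLE that lies below a gfs2 mount point.
gfs2_submounts() {
    awk '$3 == "gfs2" { parents[$2] = 1 }   # remember gfs2 mount points
         { points[NR] = $2 }                # remember every mount point
         END {
             for (i = 1; i <= NR; i++)
                 for (p in parents)
                     # a submount path starts with "<parent>/"
                     if (index(points[i], p "/") == 1)
                         print points[i], p
         }' "$1"
}

if [ -r /proc/mounts ]; then
    gfs2_submounts /proc/mounts
fi
```

With the mount table from the Issue section, this would print `/mnt/a/b /mnt/a`. No output means no file system is currently mounted below a gfs2 mount point.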

Root Cause

Red Hat released a fix for this issue in RHEL 6 Update 6 through Bugzilla #1145193, and is evaluating a fix for future releases of RHEL 6 in Bugzilla #1129712.

A change in the VFS in kernel-2.6.32-431.20.1.el6 introduced a bug that affects unmounting a file system mounted on a subdirectory of a gfs2 file system. The specialized way in which gfs2 handles dentry lookups during unmount can cause an unexpected failure that leaves the file system in a state that cannot be resolved without a reboot. The initial umount removes the entry from /etc/mtab, but then returns a failure when the lookup of the needed entry goes wrong. Any subsequent attempt to unmount the same device finds that it is not listed in /etc/mtab, even though it is still mounted from the kernel's perspective, so those attempts cannot clean up and release the device. The only known way to recover is to reboot.

In theory, other file system types that use their own hashing algorithm may be susceptible to the same problem. If you see similar symptoms on the affected kernels listed above but with a parent file system type other than gfs2, please contact Red Hat Global Support Services for assistance.

Diagnostic Steps

  • Is the file system that fails to unmount mounted on a subdirectory of a GFS2 file system?
  • After the failed umount, does the file system still appear in /proc/mounts but not in the output of the mount command? If so, it is likely the same issue.
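
The second check can be sketched as a comparison of the kernel's mount table against /etc/mtab, which is what the mount command reads on RHEL 6. This is an illustration only; `unlisted_mounts` is a hypothetical helper name, and the comparison is by mount point.

```shell
#!/bin/sh
# Sketch only: unlisted_mounts is a hypothetical helper for illustration.

# unlisted_mounts KERNEL_TABLE MTAB: print mount points present in
# KERNEL_TABLE (for example /proc/mounts) but missing from MTAB.
unlisted_mounts() {
    awk 'FILENAME == ARGV[1] { listed[$2] = 1; next }  # first file: MTAB
         !($2 in listed)     { print $2 }              # second file: kernel view
        ' "$2" "$1"
}

if [ -r /proc/mounts ] && [ -r /etc/mtab ]; then
    unlisted_mounts /proc/mounts /etc/mtab
fi
```

A mount point such as /mnt/a/b in the output after a failed umount matches the symptom described above. Other entries can appear for unrelated reasons, for example file systems mounted with `mount -n` or pseudo file systems set up during early boot, so the output should be read as a hint rather than proof.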

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.