On shutdown a cluster node with mounted GFS2 file systems fails to stop cman and leave cluster gracefully and is subsequently fenced
Environment
- Red Hat Enterprise Linux (RHEL) 5 with the Resilient Storage Add On
- GFS2 filesystems mounted from /etc/fstab
- /etc/init.d/gfs2 shutdown script fails to unmount GFS2 file systems
Issue
- On shutdown, a cluster node fails to stop cman and leave the cluster gracefully; the totem token is lost, and the other nodes fence the node.
Resolution
If a node has failed to unmount its GFS2 file systems and is hanging or failing to shut down as a result, it may be necessary to forcefully reboot the node.
If a node is being fenced due to failing to unmount GFS2 file systems, then:
- Identify any applications or use cases that may not properly cease access to the GFS2 file systems on shutdown, and either configure them as a cluster resource managed by rgmanager, or configure them to stop before the gfs2 service script stops on shutdown.
- Ensure that all GFS2 file systems are either listed in /etc/fstab, or are managed by clusterfs resources in /etc/cluster/cluster.conf.
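For reference, a clusterfs resource managed by rgmanager looks roughly like the following cluster.conf fragment. This is a sketch only: the resource, device, mount point, and service names here are hypothetical, and the surrounding cluster.conf elements are omitted.

```xml
<rm>
  <resources>
    <!-- hypothetical GFS2 file system; force_unmount kills processes
         using the mount point when the resource is stopped -->
    <clusterfs name="shared_gfs2" device="/dev/mapper/vg0-gfs2lv"
               mountpoint="/mnt/shared" fstype="gfs2" force_unmount="1"/>
  </resources>
  <service name="app_svc" autostart="1">
    <clusterfs ref="shared_gfs2"/>
  </service>
</rm>
```

Managing the file system this way lets rgmanager stop the application and unmount the GFS2 file system in order, before cman shuts down.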
Root Cause
- When the gfs2 service (/etc/rc.d/init.d/gfs2) is stopped, it does not guarantee that all GFS2 file systems have been unmounted.
- The gfs2 init script retries the unmount of GFS2 filesystems multiple times, but if it ultimately fails, the dlm lockspace may remain active and cman will also fail to shut down cleanly.
- If cman does not exit cleanly, the cluster node can power down without gracefully leaving the cluster, causing the other cluster nodes to believe it has failed and to fence it.
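The init script builds its unmount list with an awk filter over /proc/mounts (the same expression appears in the GFS2MTAB assignment in the script). A minimal sketch of that filter, run against a sample mounts file rather than the live /proc/mounts; the device and mount-point names are illustrative:

```shell
# Sample /proc/mounts content: a root ext3 mount plus two GFS2 mounts
cat > /tmp/mounts.sample <<'EOF'
/dev/sda1 / ext3 rw 0 0
/dev/mapper/vg0-gfs2lv /mnt/shared gfs2 rw,hostdata=jid=0 0 0
/dev/mapper/vg0-gfs2lv2 /mnt/data gfs2 rw,hostdata=jid=1 0 0
EOF

# Same filter the gfs2 init script uses: keep GFS2 mounts other than /
# prints /mnt/shared and /mnt/data
LC_ALL=C awk '!/^#/ && $3 == "gfs2" && $2 != "/" { print $2 }' /tmp/mounts.sample
```

While this list is non-empty, the script keeps retrying `umount -a -t gfs2`; any mount that never leaves the list is what later blocks cman from stopping.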
Diagnostic Steps
The following symptoms may be indicative of this issue occurring:
- Review /var/log/messages from the time of the shutdown to see whether any cman/openais messages are printed. In the example below there are no openais or cman messages about the cluster node shutting down:
May 23 00:10:26 node1 shutdown[9471]: shutting down for system halt
May 23 00:10:28 node1 modclusterd: shutdown succeeded
May 23 00:10:29 node1 rgmanager: [9500]: <notice> Shutting down Cluster Service Manager...
May 23 00:10:29 node1 clurgmgrd[8380]: <notice> Shutting down
May 23 00:10:39 node1 clurgmgrd[8380]: <notice> Disconnecting from CMAN
May 23 00:10:39 node1 clurgmgrd[8380]: <notice> Exiting
May 23 00:10:39 node1 rgmanager: [9500]: <notice> Cluster Service Manager is stopped.
May 23 00:10:39 node1 ricci: shutdown succeeded
May 23 00:10:39 node1 smartd[8538]: smartd received signal 15: Terminated
May 23 00:10:39 node1 smartd[8538]: smartd is exiting (exit status 0)
May 23 00:10:39 node1 avahi-daemon[8326]: Got SIGTERM, quitting.
May 23 00:10:39 node1 avahi-daemon[8326]: Leaving mDNS multicast group on interface bond0.IPv6 with address fe80::217:a4ff:fe77:1424.
May 23 00:10:39 node1 avahi-daemon[8326]: Leaving mDNS multicast group on interface bond0.IPv4 with address 172.26.8.60.
May 23 00:10:39 node1 oddjobd: oddjobd shutdown succeeded
May 23 00:10:40 node1 rhnsd[8283]: Exiting
May 23 00:10:40 node1 saslauthd[8505]: server_exit : master exited: 8505
May 23 00:10:41 node1 snmpd[8041]: Received TERM or STOP signal... shutting down...
May 23 00:10:41 node1 xinetd[8091]: Exiting...
May 23 00:11:18 node1 ntpd[8108]: ntpd exiting on signal 15
May 23 00:11:31 node1 hcid[7602]: Got disconnected from the system message bus
May 23 00:11:31 node1 auditd[6578]: The audit daemon is exiting.
May 23 00:11:31 node1 kernel: audit(1369293091.665:69): audit_pid=0 old=6578 by auid=4294967295
May 23 00:11:31 node1 pcscd: pcscdaemon.c:572:signal_trap() Preparing for suicide
May 23 00:11:31 node1 pcscd: hotplug_libusb.c:376:HPRescanUsbBus() Hotplug stopped
May 23 00:11:32 node1 pcscd: readerfactory.c:1379:RFCleanupReaders() entering cleaning function
May 23 00:11:32 node1 pcscd: pcscdaemon.c:532:at_exit() cleaning /var/run
May 23 00:11:32 node1 kernel: Kernel logging (proc) stopped.
May 23 00:11:32 node1 kernel: Kernel log daemon terminating.
May 23 00:11:34 node1 exiting on signal 15
May 23 00:19:44 node1 syslogd 1.4.1: restart.
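The absence of cluster-stack messages can be checked mechanically. A minimal sketch, run against a short extract modeled on the log above rather than the real /var/log/messages (the sample path and extract are illustrative):

```shell
# Extract modeled on the shutdown window shown above
cat > /tmp/messages.sample <<'EOF'
May 23 00:10:26 node1 shutdown[9471]: shutting down for system halt
May 23 00:10:39 node1 clurgmgrd[8380]: <notice> Disconnecting from CMAN
May 23 00:10:39 node1 ricci: shutdown succeeded
May 23 00:10:41 node1 xinetd[8091]: Exiting...
EOF

# A clean cluster shutdown logs lines from the openais/cman daemons;
# zero matches means the cluster stack never reported stopping.
grep -E 'openais\[|cman\[' /tmp/messages.sample \
    || echo "no openais/cman shutdown messages"
```

On an affected node, run the same grep over /var/log/messages for the shutdown window; rgmanager messages alone (as above) do not indicate that cman stopped.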
- However, cman is chkconfig'd on at boot time and is therefore expected to stop when shutting down.
# chkconfig --list cman
cman 0:off 1:off 2:on 3:on 4:on 5:on 6:off
- The other cluster nodes will believe the node shutting down has failed, because it stops passing the totem token, and will fence it (after it has halted):
May 23 00:10:39 node2 clurgmgrd[8376]: <notice> Member 1 shutting down
May 23 00:11:33 node2 qdiskd[7476]: <info> Node 1 shutdown
May 23 00:13:20 node2 openais[7447]: [TOTEM] The token was lost in the OPERATIONAL state.
- Using the following debugging patch (applied to /etc/rc.d/init.d/gfs2), try to determine which processes have open files on the GFS2 filesystems and are preventing them from unmounting:
- Copy the following patch into a file (this patch works with /etc/rc.d/init.d/gfs2 from gfs2-utils-0.1.62-35.el5):
--- /etc/rc.d/init.d/gfs2	2012-07-17 03:26:07.000000000 +1000
+++ /etc/rc.d/init.d/gfs2	2013-07-04 15:47:14.602290572 +1000
@@ -1,6 +1,6 @@
 #!/bin/bash
 #
-#
+# This is a debug version of /etc/init.d/gfs2.
 #
 # chkconfig: - 26 74
 # description: mount/unmount gfs2 filesystems configured in /etc/fstab
@@ -19,6 +19,71 @@
 GFS2FSTAB=$(LC_ALL=C awk '!/^#/ && $3 == "gfs2" && $4 !~ /noauto/ { print $2 }' /etc/fstab)
 GFS2MTAB=$(LC_ALL=C awk '!/^#/ && $3 == "gfs2" && $2 != "/" { print $2 }' /proc/mounts)
+
+function debug {
+    # This function just prints some data about a GFS2 filesystem to a file. The
+    # files will be in the /tmp directory. There should be 1 file for each retry
+    # of unmounting all the GFS2 fs. Collect all the files and the
+    # /var/log/messages file. Then tarball them up and attach to the ticket.
+
+    touch $1;
+    logger "Stopping all GFS2 file-systems. Retry: $3."
+    echo "Date: $(date)" > $1;
+    echo "Retry = $3" >> $1;
+    echo "Remaining GFS2 FS:" >> $1;
+    for gfs2Mount in $remaining
+    do
+        echo "    $gfs2Mount" >> $1;
+    done;
+
+    # The debugging information to write to the debug file.
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    clustat >> $1;
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    cman_tool services >> $1;
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo -e "mount -l output:\n" >> $1
+    mount -l >> $1;
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo -e "/proc/mounts:\n" >> $1
+    cat /proc/mounts >> $1;
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo -e "/etc/fstab:\n" >> $1
+    cat /etc/fstab >> $1
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo -e "/etc/mtab:\n" >> $1
+    cat /etc/mtab >> $1
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    dmsetup info -c >> $1
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    lsof >> $1;
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    ps -e -o pid,ppid,class,rtprio,ni,pri,psr,pcpu,%mem,stat,wchan:14,start,time,command >> $1
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    ls -laR /dev >> $1
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo -e "losetup -a (if empty then no loopback devices):\n" >> $1
+    losetup -a >> $1
+
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo -e "showmount -e (if empty then no nfs exports):\n" >> $1
+    showmount -e localhost >> $1
+
+    # End
+    echo -e "\n--------------------------------------------------------\n" >> $1;
+    echo "" >> $1
+}
+
 # See how we were called.
 case "$1" in
     start)
@@ -37,8 +102,10 @@ case "$1" in
     remaining=`LC_ALL=C awk '!/^#/ && $3 == "gfs2" && $2 != "/" {print $2}' /proc/mounts`
     while [ -n "$remaining" -a "$retry" -gt 0 ]
     do
+        # Adding Debugging here:
+        debug "/var/log/gfs2-debug".$(date +%s)."txt" $remaining $retry;
         action $"Unmounting GFS2 filesystems: " umount -a -t gfs2
-
+
         if [ $retry -eq 0 ]
         then
             action $"Unmounting GFS2 filesystems (lazy): " umount -l -a -t gfs2
- Make a copy of the existing /etc/rc.d/init.d/gfs2 init script:
# cp /etc/rc.d/init.d/gfs2 /root/gfs2.initscript
- Patch the /etc/rc.d/init.d/gfs2 file (the patch package needs to be installed):
# cd /etc/rc.d/init.d
# patch -p 3 < /tmp/gfs2.patch
- Now ensure gfs2 is chkconfig'd on, and reboot the server to generate the logs:
# chkconfig --list gfs2
gfs2            0:off   1:off   2:off   3:off   4:off   5:off   6:off
# shutdown -r now
- Upon booting back up, there should be a log of the system state prior to unmounting GFS2 filesystems in /var/log/gfs2-debug.<timestamp>.txt.
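Independent of the debug patch, you can check by hand which processes hold files open under a mount point by scanning /proc. A minimal sketch using only the shell and /proc; the temporary directory and background `tail` merely stand in for a GFS2 mount point and an application holding a file open:

```shell
# Stand-in for a GFS2 mount point: a temp directory with one file held open
dir=$(mktemp -d)
touch "$dir/held.log"
tail -f "$dir/held.log" & holder=$!
sleep 1

# Scan /proc/<pid>/fd for descriptors pointing under the directory; on a
# cluster node, substitute the real GFS2 mount point for "$dir".
for fd in /proc/[0-9]*/fd/*; do
    link=$(readlink "$fd" 2>/dev/null) || continue
    case "$link" in
        "$dir"/*) pid=${fd#/proc/}; echo "PID ${pid%%/*} has a file open under $dir" ;;
    esac
done

kill "$holder"
rm -rf "$dir"
```

Any PID this reports for a GFS2 mount point is a process that will prevent `umount -a -t gfs2` from succeeding during shutdown, and is a candidate for being stopped earlier in the shutdown order or managed by rgmanager.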