What is the lifetime of a glock or a DLM resource on a GFS2 filesystem?


Introduction


This article explains in simple terms how long GFS2 `glocks` and [DLM (Distributed Lock Manager)](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/High_Availability_Add-On_Overview/index.html#s1-dlm-model) resources are held, and demonstrates what happens in the lock tables when a file is created and deleted on a GFS2 filesystem. A GFS2 filesystem uses DLM to manage the locks between the cluster nodes, and [`glocks`](/articles/35653#Glocks) bridge DLM and caching into a single state machine.

The article does not try to explain every interaction between GFS2 and DLM. It provides an accounting of what happens to certain types of glocks and DLM resources when a file is created and deleted on a GFS2 filesystem.

Why is DLM required for a GFS2 filesystem?


For a GFS2 filesystem to be mounted on multiple cluster nodes at the same time, there has to be a mechanism that coordinates which cluster node has exclusive access to a resource on the filesystem. The locking protocol used by [DLM (Distributed Lock Manager)](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/High_Availability_Add-On_Overview/index.html#s1-dlm-model) coordinates the locks between the cluster nodes that have the GFS2 filesystem mounted. GFS2 uses [locks from DLM](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html#s2-nodeallocate-gfs2) to synchronize access to file system metadata on shared storage.

The first node to lock a resource becomes the “lock master” for that lock. A locked resource could be an actual object, such as a file, a data structure, a database, or an executable routine, but it does not have to correspond to one of these things. The other nodes may lock that resource, but they first have to ask permission remotely from the lock's master. Each node knows for which locks it is the lock master, and each node knows which node it has lent a lock to. Acquiring a lock on the master node is much faster than acquiring one on a non-master node, which has to stop and ask permission from the lock's master.

What is a glock?


A [`glock`](/articles/35653#Glocks) is what bridges the DLM locks with the GFS2 filesystem. More precisely, a [`glock`](/articles/35653#Glocks) is a *data structure which brings together the DLM and caching into a single state machine.* When a glock is created, a corresponding DLM lock for the same resource type will be created if one does not already exist.

How long are the locked resources held open?


A DLM locked resource will not be removed from the list of DLM resources **until the corresponding `glock` is removed** from the list of GFS2 glocks. The corresponding glock lifetime is determined by its type.

| Type number | Glock type | Use |
|---|---|---|
| 1 | Trans | Transaction Lock |
| 2 | Inode | Inode metadata and data |
| 3 | Resource group | Resource group metadata |
| 4 | Meta | The superblock |
| 5 | Iopen | Inode last closer detection |
| 6 | Flock | flock(2) syscall |
| 8 | Quota | Quota operations |
| 9 | Journal | Journal mutex |
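
When reading glock dumps it can help to translate a type number back into its name. The following is a minimal sketch of such a lookup; `glock_type_name` is a hypothetical helper written for this article and is not part of any GFS2 tooling.

```shell
# Hypothetical helper: map a glock type number to the name listed in the table.
glock_type_name() {
    case "$1" in
        1) echo "Trans" ;;
        2) echo "Inode" ;;
        3) echo "Resource group" ;;
        4) echo "Meta" ;;
        5) echo "Iopen" ;;
        6) echo "Flock" ;;
        8) echo "Quota" ;;
        9) echo "Journal" ;;
        *) echo "Unknown" ;;   # type 7 and anything else is not in the table
    esac
}

glock_type_name 5   # prints: Iopen
```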

The table is from the following article.

A type 2 (inode) glock exists until the GFS2 filesystem is unmounted or until memory pressure causes the kernel to ask GFS2 to reduce its memory footprint, at which point GFS2 removes a limited number of unused glocks. In most environments a type 2 (inode) glock therefore exists until the filesystem is unmounted. A typical type 5 (Iopen) glock is held as long as the resource exists on the GFS2 filesystem. Once the resource is removed from the filesystem, the type 5 (Iopen) glock is removed from the list of glocks and its corresponding DLM locked resource is removed as well.

Who is the DLM "lock master"?


One thing to note about the lifetime of locks is that a DLM locked resource exists only while there is a corresponding `glock`. Once that `glock` is removed, the corresponding DLM lock resource is removed. Some glocks are held for long periods of time, like the ones mentioned before: type 2 (inode) and type 5 (Iopen). The implication is that once a cluster node is designated the "lock master" of a resource, it will not relinquish that role until the glock is removed, which causes the DLM locked resource to be removed. This can be a problem if one cluster node accesses all (or most) of the resources on a GFS2 filesystem before any other cluster node, because that single node becomes the "lock master" of most of the filesystem. When a single cluster node is the "lock master" for all (or most) of a GFS2 filesystem, DLM no longer behaves as a distributed lock manager but as a centralized authority, because every request by another cluster node to access a resource is handled remotely by the lock master. This can happen the first time a [backup](/solutions/916233) is run or when some tool performs a [recursive scan](/solutions/57519) of the filesystem.

Dropping the cache will not release any of the "lock masters" since any existing glocks with corresponding DLM resources will not be released. For more information on how the cache works on GFS2, review this article: Is there more information on how GFS2 cache works on RHEL 6 and RHEL 7?

Example


This example will demonstrate what locks are created and removed when a file is created and removed on a GFS2 filesystem. There are two cluster nodes that will have a newly created GFS2 filesystem mounted. *Some of the commands will be done through ssh and will print the `hostname` to clarify which cluster node ran the command.* The example does not try to show how all the interaction between DLM and GFS2 works, but will demonstrate a simple scenario.
Create the GFS2 filesystem and mount the GFS2 filesystem and the debugfs filesystem on all nodes.
# mkfs.gfs2  -t rh6cluster:gfs2-1 -p lock_dlm -j 3 /dev/mapper/vgShared1-lvol1 -O;
# mount -t gfs2 -o noatime /dev/mapper/vgShared1-lvol1 /mnt/gfs2lvol1/;
# mount -t debugfs none /sys/kernel/debug;
Count the number of glocks and DLM “lock master”.
# ssh rh6node1 "hostname; grep 'Master Copy' /sys/kernel/debug/dlm/gfs2-1 | wc -l"; ssh rh6node2 "hostname; grep 'Master Copy' /sys/kernel/debug/dlm/gfs2-1 | wc -l";
rh6node1.example.com
27
rh6node2.example.com
5
# ssh rh6node1 "hostname; grep 'G\:' /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks | wc -l"; ssh rh6node2 "hostname; grep 'G\:' /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks | wc -l";
rh6node1.example.com
33
rh6node2.example.com
32
Copy a file to the GFS2 filesystem.
# ssh rh6node1 "cp /var/log/messages /mnt/gfs2lvol1/messages";
Count the number of glocks and DLM “lock master” after a file was created.

After the file was created, the number of “lock master” entries and `glocks` increased only on the rh6node1 cluster node.
# ssh rh6node1 "hostname; grep 'Master Copy' /sys/kernel/debug/dlm/gfs2-1 | wc -l"; ssh rh6node2 "hostname; grep 'Master Copy' /sys/kernel/debug/dlm/gfs2-1 | wc -l";
rh6node1.example.com
31
rh6node2.example.com
5
# ssh rh6node1 "hostname; grep 'G\:' /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks | wc -l"; ssh rh6node2 "hostname; grep 'G\:' /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks | wc -l";
rh6node1.example.com
35
rh6node2.example.com
32
Find the new locks created after the file is created.

Find the inode for that file and then convert the inode to hexadecimal.
# ssh rh6node1 "stat /mnt/gfs2lvol1/messages | grep Inode | tr -s ' ' | cut '-d ' -f 3;";
99315
# ssh rh6node1 "echo 'obase=16; 99315' | bc | tr '[:upper:]' '[:lower:]'";
183f3
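
The same decimal-to-hexadecimal conversion can be done with the shell's `printf`, which already emits lowercase digits and so avoids the `bc` and `tr` pipeline. This is a minimal sketch; `inode_to_hex` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print a decimal inode number in lowercase hexadecimal.
inode_to_hex() {
    printf '%x\n' "$1"
}

inode_to_hex 99315   # prints: 183f3
```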

Then search the DLM and GFS2 lock dumps for that hexadecimal value.

# ssh rh6node1 "hostname; grep 183f3 /sys/kernel/debug/dlm/gfs2-1 -A 1"; ssh rh6node2 "hostname; grep 183f3 /sys/kernel/debug/dlm/gfs2-1 -A 1";
rh6node1.example.com
Resource ffff88001d14c9c0 Name (len=24) "       2           183f3"
Master Copy
Resource ffff88003c2c4580 Name (len=24) "       5           183f3"
Master Copy
rh6node2.example.com

# ssh rh6node1 "hostname;grep 183f3 /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks"; ssh rh6node2 "hostname; grep 183f3 /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks";
rh6node1.example.com
G:  s:EX n:2/183f3 f:yfIqob t:EX d:EX/0 a:2 v:0 r:2 m:200
G:  s:SH n:5/183f3 f:Iqob t:SH d:EX/0 a:0 v:0 r:2 m:200
rh6node2.example.com

Notice that rh6node1 is the only cluster node that has a glock and a DLM resource for the inode 183f3. The cluster node rh6node1 has two types of glocks open, inode and Iopen, each with a corresponding DLM lock.
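
The two dumps name the same object in different ways: the glock dump uses `type/number`, while the DLM dump uses a 24-character padded name. The sketch below builds both strings from a glock type and a decimal inode number so they can be passed to `grep`. The helper names are hypothetical, and the DLM layout (an 8-character type field followed by a 16-character number field, both hexadecimal) is an assumption inferred from the `len=24` names shown in the output above.

```shell
# Hypothetical helpers; arguments are the glock type and the decimal inode number.

# Name as it appears after "n:" in the glock dump, e.g. 2/183f3.
glock_name() {
    printf '%x/%x\n' "$1" "$2"
}

# Name as it appears in the DLM dump. The 8 + 16 character layout is an
# assumption inferred from the len=24 resource names in the example output.
dlm_resource_name() {
    printf '%8x%16x\n' "$1" "$2"
}

glock_name 2 99315          # prints: 2/183f3
dlm_resource_name 5 99315   # prints a 24-character name ending in 183f3
```

Searching for the full DLM name rather than the bare hexadecimal value reduces the chance of matching an unrelated resource whose number merely contains the same digits.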

Delete the file on the GFS2 filesystem.
# ssh rh6node1 "rm /mnt/gfs2lvol1/messages";
Count the number of glocks and DLM “lock master” after a file was deleted.
# ssh rh6node1 "hostname; grep 'Master Copy' /sys/kernel/debug/dlm/gfs2-1 | wc -l"; ssh rh6node2 "hostname; grep 'Master Copy' /sys/kernel/debug/dlm/gfs2-1 | wc -l";
rh6node1.example.com
30
rh6node2.example.com
5

# ssh rh6node1 "hostname; grep 'G\:' /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks | wc -l"; ssh rh6node2 "hostname; grep 'G\:' /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks | wc -l";
rh6node1.example.com
34
rh6node2.example.com
32

If we search for the previous inode 183f3, we can see that only the type 2 (inode) glock and its DLM resource remain.

# ssh rh6node1 "hostname; grep 183f3 /sys/kernel/debug/dlm/gfs2-1 -A 1"; ssh rh6node2 "hostname; grep 183f3 /sys/kernel/debug/dlm/gfs2-1 -A 1";
rh6node1.example.com
Resource ffff88001d14c9c0 Name (len=24) "       2           183f3"
Master Copy
rh6node2.example.com

# ssh rh6node1 "hostname;grep 183f3 /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks"; ssh rh6node2 "hostname; grep 183f3 /sys/kernel/debug/gfs2/rh6cluster\:gfs2-1/glocks";
rh6node1.example.com
G:  s:EX n:2/183f3 f:yIqLb t:EX d:EX/0 a:0 v:0 r:1 m:200
rh6node2.example.com

A type 2 (inode) glock exists until the GFS2 filesystem is unmounted or until memory pressure causes the kernel to ask GFS2 to reduce its memory footprint, at which point GFS2 removes a limited number of unused glocks. The node holding the corresponding DLM lock remains the "lock master" until the glock is removed. Once the file is deleted, the type 5 (Iopen) glock is removed from the list of glocks and the corresponding DLM lock is then removed.
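
To see which glock types account for the totals counted in the example, the `G:` lines of a glock dump can be tallied by type. The following is a minimal sketch; `count_glocks_by_type` is a hypothetical helper, and the sample input is the pair of glock lines shown earlier for inode 183f3 rather than a full dump.

```shell
# Hypothetical helper: read a glock dump on stdin and print "type count" pairs.
count_glocks_by_type() {
    awk '$1 == "G:" {
             # Find the n:TYPE/NUMBER field and count by TYPE.
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^n:/) {
                     split(substr($i, 3), parts, "/")
                     count[parts[1]]++
                 }
         }
         END { for (t in count) print t, count[t] }' | sort -n
}

# Sample input taken from the example output in this article.
count_glocks_by_type <<'EOF'
G:  s:EX n:2/183f3 f:yfIqob t:EX d:EX/0 a:2 v:0 r:2 m:200
G:  s:SH n:5/183f3 f:Iqob t:SH d:EX/0 a:0 v:0 r:2 m:200
EOF
```

On a cluster node the function would instead read the dump file directly, for example `count_glocks_by_type < /sys/kernel/debug/gfs2/rh6cluster:gfs2-1/glocks`.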
