How do I use the top directory (+T) attribute to tune a GFS2 file-system in RHEL 6?
Environment
- Red Hat Enterprise Linux (RHEL) Server 6.5+ with the Resilient Storage Add On
- kernel-2.6.32-431.el6 or later
- GFS2
Issue
- How do I use the top directory (+T) attribute to tune a GFS2 file-system?
- Is there a way to tell GFS2 to allocate newly created directories in separate resource groups, to avoid contention between them?
Resolution
Update to kernel-2.6.32-431.el6 or later to use the Orlov block allocator for top-level directories in GFS2 file systems.
Sub-directories created in the top-level directory of a file system will automatically take advantage of the new allocator and be placed in a separate resource group.
For any directories that are not at the top-level of the file system, but whose sub-directories will not be related to each other, set the +T attribute using chattr:
# chattr +T /path/to/directory
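To confirm that the attribute took effect, the flags can be read back with lsattr. The following is a minimal sketch rather than a supported tool: the helper names flags_have_topdir and is_topdir are hypothetical, and it assumes that lsattr -d prints the flags as the first field of its output, with T marking the top-of-hierarchy attribute.

```shell
# Succeeds if a flags field, as printed by `lsattr -d`, contains 'T'.
# The match is case-sensitive, so the unrelated 't' flag is ignored.
flags_have_topdir() {
    case "$1" in
        *T*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Hypothetical helper: succeeds if the given directory carries +T.
# Assumes the first whitespace-separated field of the output is the flags.
is_topdir() {
    flags=$(lsattr -d "$1" 2>/dev/null | awk '{print $1}')
    flags_have_topdir "$flags"
}

# Example (path is a placeholder):
# is_topdir /path/to/directory && echo "+T is set"
```

If lsattr cannot read the directory, the flags field is empty and is_topdir reports the attribute as not set.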
Using Orlov block allocator on existing GFS2 filesystem
- Only newly created directories in top-level (or +T) directories will be allocated in a new resource group. Existing directories will not be retrospectively moved to another resource group.
- If an existing GFS2 file system is not configured for the Orlov block allocator and is then reconfigured to use it, performance will likely not be as good as on a new GFS2 file system set up to take advantage of the Orlov block allocator. Once the file system has been used, its metadata becomes dirty, which makes it difficult for a used GFS2 file system to take full advantage of the Orlov block allocator. We recommend that a new GFS2 file system be created and configured as recommended for the Orlov block allocator to get the ideal performance gains. For instance, set the +T attribute on the parent directory first, and then create all the sub-directories that will be used.
Root Cause
The Orlov block allocator provides better locality for files that are truly related to each other and likely to be accessed together. In addition, when a resource group is highly contended, unrelated directories are placed in a different resource group, which should improve performance.
By default, the Orlov block allocator is enabled on the root directory of a GFS2 file system and cannot be disabled. Just as on ext3, the Orlov block allocator works on the root directory and on any directory with the +T attribute set. Any sub-directory created in either kind of directory will be allocated in a separate, random resource group (the GFS2 equivalent of a block group).
If you are creating a set of directories, each of which will contain a job running on a different cluster node, set the +T attribute on the parent directory before creating the sub-directories. Each sub-directory will then land in a different resource group, which keeps resource group contention between cluster nodes to a minimum.
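The ordering described above (set +T on the parent first, then create the per-node sub-directories) can be sketched as a small shell function. This is an illustration under stated assumptions, not a Red Hat-provided script: the function name make_job_dirs, the CHATTR override variable, the mount point, and the node names are all hypothetical.

```shell
# Sketch: mark PARENT with +T, then create one sub-directory per node.
# Usage: make_job_dirs PARENT NODE...
# CHATTR may be overridden (for example CHATTR=true) to rehearse the
# steps on a file system that does not support the +T attribute.
make_job_dirs() {
    parent=$1
    shift
    mkdir -p "$parent" || return 1
    # Set +T before creating the sub-directories: directories that
    # already exist are not moved to another resource group.
    ${CHATTR:-chattr} +T "$parent" || return 1
    for node in "$@"; do
        mkdir "$parent/$node" || return 1
    done
}

# Example with a hypothetical mount point and node names:
# make_job_dirs /mnt/gfs2/jobs node1 node2 node3
```

The function stops at the first failure, so no sub-directory is created if the attribute could not be set on the parent.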
For information on this option please see the man page for chattr. Below is the documentation from that man page:
# man chattr
A directory with the ’T’ attribute will be deemed to be the top of directory hierarchies for the purposes of
the Orlov block allocator. This is a hint to the block allocator used by ext3 and ext4 that the
sub-directories under this directory are not related, and thus should be spread apart for allocation
purposes.
For example it is a very good idea to set the ’T’ attribute on the /home directory, so that /home/john and
/home/mary are placed into separate block groups. For directories where this attribute is not set, the
Orlov block allocator will try to group sub-directories closer together where possible.