How to recover from a situation where two directories have the same GFID?
Environment
- Red Hat Storage Server 3.x
Issue
- Certain glusterFS operations, such as a directory rename, can result in two directories having the same GFID.
- Why am I unable to list all the contents of the original directory from the mount point? The files continue to exist on the bricks but are not accessible from the mount point.
Resolution
dir1 and dir2 are considered to have the same GFID in this example.
If you observe this issue while using Geo-replication or Quota, contact your Red Hat Support Representative. The following recovery steps are only valid if you are not using Geo-replication or Quota.
- These steps must be performed after disconnecting all the clients from the volume. Do not perform any add-brick, remove-brick, rebalance, or snapshot operations while following these steps.
- Stop the volume by running:
# gluster volume stop <VOL_NAME>
- Identify the directories which have the same GFIDs by searching for the following message in the GlusterFS volume brick logs.
NOTE: This message will only be found in Red Hat Storage Server 3.0 brick logs.
mkdir (dir1): gfid (<xxx-xxx-xxx-xx>) is already associated with directory (<.glusterfs symbolic link path to dir2>)
Hence, both directories will share the same GFID, and this can lead to inconsistencies.
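Pulling the duplicate GFID out of a matching log line can be sketched as follows; the log directory shown in the usage comment and the extract_gfid helper name are assumptions, not Gluster tooling.

```shell
#!/bin/sh
# extract_gfid: hypothetical helper that pulls the GFID out of a brick
# log line of the form shown above.
extract_gfid() {
    sed -n 's/.*gfid (\([^)]*\)).*/\1/p'
}

# Typical (assumed) usage against the brick logs on a server:
#   grep -h "is already associated with directory" \
#       /var/log/glusterfs/bricks/*.log | extract_gfid | sort -u
```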
- Run the following command on a brick to get the absolute path of <.glusterfs symbolic link path to dir2>.
# cd <.glusterfs symbolic link path to dir2>
# pwd -P
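On systems with GNU coreutils, the same absolute path can be obtained in one step with readlink -f; resolve_target is an illustrative wrapper, not a Gluster tool.

```shell
#!/bin/sh
# resolve_target: resolve a .glusterfs symbolic link to the absolute
# path of the directory it points to, without changing directory.
resolve_target() {
    readlink -f "$1"
}
# Usage: resolve_target <.glusterfs symbolic link path to dir2>
```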
- Get the GFID value for the directories by executing the following command on each brick.
# getfattr -n trusted.gfid -e hex <path to dir1 on brick>
# getfattr -n trusted.gfid -e hex <path to dir2 on brick>
- Confirm that the GFIDs are the same for both dir1 and dir2 on all the bricks. Save the value of the returned GFID as it will be needed at a later point. If the GFID is not the same for both directories, please contact Red Hat Support.
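A minimal sketch for comparing the two values on one brick; same_gfid is a hypothetical helper, and getfattr comes from the attr package.

```shell
#!/bin/sh
# same_gfid: hypothetical helper returning success when two paths carry
# identical, non-empty trusted.gfid values.
same_gfid() {
    a=$(getfattr -n trusted.gfid -e hex "$1" 2>/dev/null \
        | sed -n 's/^trusted\.gfid=//p')
    b=$(getfattr -n trusted.gfid -e hex "$2" 2>/dev/null \
        | sed -n 's/^trusted\.gfid=//p')
    [ -n "$a" ] && [ "$a" = "$b" ]
}
# Usage on each brick:
#   same_gfid <path to dir1 on brick> <path to dir2 on brick> \
#       && echo "GFIDs match"
```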
- Delete the trusted.gfid extended attribute for dir1 and dir2 on all the bricks. Run the following command on each brick.
# setfattr -x trusted.gfid <path to dir1 on brick>
# setfattr -x trusted.gfid <path to dir2 on brick>
- Ensure that the attribute no longer exists for these directories by re-running the getfattr commands from the earlier step.
- Delete the .glusterfs symbolic link to dir1 and dir2 on all the bricks.
- Each directory on the brick is a target of a symbolic link in the .glusterfs hidden directory at the root of each brick. The path for the symlink file is created by taking the first two characters of the GFID as the first path component, then the next two characters as the next component, and then the complete GFID becomes the name of the symbolic link.
- For example, if the GFID returned is 0xf43df99cda994a9aab6f5bdea580cddc, the complete file path for the symbolic link will be /.glusterfs/f4/3d/f43df99c-da99-4a9a-ab6f-5bdea580cddc
- The link above points to dir1 or dir2:
# rm -f <path to the brick's export root>/.glusterfs/f4/3d/f43df99c-da99-4a9a-ab6f-5bdea580cddc
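The path construction described above can be sketched as a small helper; gfid_symlink_path is a hypothetical name, and the brick root is a placeholder.

```shell
#!/bin/sh
# gfid_symlink_path: build the .glusterfs symlink path for a GFID using
# the layout described above: first two characters, next two characters,
# then the full GFID as the link name.
gfid_symlink_path() {
    brick_root=$1
    gfid=$2     # dashed form, e.g. f43df99c-da99-4a9a-ab6f-5bdea580cddc
    p1=$(printf '%s' "$gfid" | cut -c1-2)
    p2=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick_root" "$p1" "$p2" "$gfid"
}
# Usage: gfid_symlink_path <path to the brick's export root> <GFID>
```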
- Start the volume by running:
# gluster volume start <VOL_NAME>
- Use a FUSE mount to mount the volume and run ls -l <path to dir1> <path to dir2> from the mount point. This generates new GFIDs for these directories. All the contents of the directories will be visible from the mount point.
- Recover the contents of these directories. It is recommended that you carefully compare the contents of the two directories to ensure data is not inadvertently overwritten or deleted during the recovery.
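One way to do that comparison, sketched with diff; compare_dirs is an illustrative helper, and the mount-point paths in the usage comment are placeholders.

```shell
#!/bin/sh
# compare_dirs: report files that exist only on one side or differ in
# content, so nothing is merged or deleted blindly during recovery.
compare_dirs() {
    diff -rq "$1" "$2"
}
# Usage from the FUSE mount point:
#   compare_dirs <path to dir1> <path to dir2>
```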
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.