How to replace a failed GlusterFS brick with a new brick having the same name as the old brick on the same GlusterFS node?
Environment
- Red Hat Storage Server 2.1
- Red Hat Storage Server 3.0
- Red Hat Storage Server 3.1
- Red Hat Storage Server 3.2
Issue
- How to replace a failed GlusterFS brick with a new brick having the same name as the old brick on the same GlusterFS node.
- Need to replace a failed brick with a new brick having the same name as the old brick on the same node.
- Steps to replace a failed brick with a new brick having the same name as the old brick on the same node.
Resolution
Note that RHGS 3.2 introduces a new command, reset-brick, that can be used to do the same thing. Please see:
11.5.5. Reconfiguring a Brick in a Volume
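As a sketch of how reset-brick is typically invoked (VOLNAME, HOSTNAME, and the brick path here are placeholders; substitute your own values and consult the guide section above for the authoritative procedure):

```shell
# Take the failed brick offline:
# gluster volume reset-brick VOLNAME HOSTNAME:/bricks/brick1 start

# (reformat and remount the new XFS filesystem at the same path)

# Bring the replacement brick online under the same name:
# gluster volume reset-brick VOLNAME HOSTNAME:/bricks/brick1 HOSTNAME:/bricks/brick1 commit force
```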
============================
For older releases:
- The steps below assume that the new brick's XFS partition is mounted at the same path as the old brick and that /etc/fstab has an entry for it.
- If the XFS filesystem is mounted on /bricks, create the brick directory inside /bricks after checking the paths of the other active bricks. For example, if the brick path is /bricks/brick1:

    # mkdir -p /bricks/brick1

- Create the .glusterfs directory:

    # mkdir -p /bricks/brick1/.glusterfs/00/00

- Create a symlink to the root of the brick inside /bricks/brick1/.glusterfs/00/00:

    # cd /bricks/brick1/.glusterfs/00/00
    # ln -s ../../.. 00000000-0000-0000-0000-000000000001

  ls -l should return this:

    lrwxrwxrwx 1 root root <date> <time> 00000000-0000-0000-0000-000000000001 -> ../../..

- /bricks/brick1 should have a replica pair brick on one of the other nodes of the trusted pool; check its details. Log in to that node, as later we will verify the data from that node against the new brick node.
- Also check the trusted.glusterfs.volume-id attribute on the replica pair brick node and note the volume-id. For example, if the replica pair brick is /bricks/brick2:

    # getfattr -d -m. -e hex /bricks/brick2
    trusted.glusterfs.volume-id= <volume id>

- Go back to the new brick node and set trusted.glusterfs.volume-id on /bricks/brick1:

    # setfattr -n trusted.glusterfs.volume-id -v <volume id> /bricks/brick1

- After setting the volume-id, verify it:

    # getfattr -d -m. -e hex /bricks/brick1
    trusted.glusterfs.volume-id= <volume id>

- If the volume is in the stopped state, start it:

    # gluster volume start VOLNAME

  Check that the new brick came online and has a PID and port number:

    # gluster volume status

  If the brick is not online, restart the volume:

    # gluster volume stop VOLNAME force
    # gluster volume start VOLNAME

- Run a full self-heal:

    # gluster volume heal VOLNAME full

- Compare the data on the replica brick /bricks/brick2 to the new brick /bricks/brick1; they should hold the same data.
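The filesystem-preparation steps above can be collected into a small sketch function. This is an illustration, not Red Hat's published tooling: the function name prepare_brick is hypothetical, and the brick path and volume-id are placeholders you must replace with values taken from your own cluster (the volume-id comes from getfattr on the surviving replica brick, as shown above).

```shell
#!/bin/sh
# prepare_brick: recreate the on-disk skeleton GlusterFS expects on a brick.
# Hypothetical helper; arguments are a brick path and (optionally) the
# volume-id copied from the replica pair brick.
prepare_brick() {
    brick=$1
    volume_id=$2          # may be empty; the setfattr step is then skipped

    # .glusterfs/00/00 must exist so the root GFID symlink can live there
    mkdir -p "$brick/.glusterfs/00/00"

    # the root GFID 00000000-...-0001 points three levels up, at the brick root
    ln -sfn ../../.. "$brick/.glusterfs/00/00/00000000-0000-0000-0000-000000000001"

    # stamp the volume-id copied from the replica pair (needs root and xattr
    # support on the brick filesystem)
    if [ -n "$volume_id" ]; then
        setfattr -n trusted.glusterfs.volume-id -v "$volume_id" "$brick"
    fi
}

# Example (hypothetical values):
#   prepare_brick /bricks/brick1 0x<volume id>
```

After running it, the volume still has to be started (or force-restarted) and a full self-heal triggered, exactly as in the steps above.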
Root Cause
- Because of a hardware issue, the brick became corrupted and was reformatted with an XFS filesystem. We now want to attach this newly XFS-formatted brick to the same GlusterFS volume, under the same name the old brick had, on the same GlusterFS node.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.