NFSv4 clientid expired suddenly because several NFS clients use the same hostname
Environment
- Red Hat Enterprise Linux 7.6 or later
- NFSv4.x
Issue
- The NFS server suddenly returned NFS4ERR_BADSESSION, although the GETATTR call just before it had succeeded.
221402 17:58:28.006327 192.168.122.73 → 192.168.122.72 NFS 258 V4 Call GETATTR FH: 0x9a4cedf5
221403 17:58:28.006409 192.168.122.72 → 192.168.122.73 NFS 310 V4 Reply (Call In 221402) GETATTR
221405 17:58:29.176280 192.168.122.73 → 192.168.122.72 NFS 246 V4 Call GETATTR FH: 0xa202f392
221406 17:58:29.176312 192.168.122.72 → 192.168.122.73 NFS 114 V4 Reply (Call In 221405) SEQUENCE Status: NFS4ERR_BADSESSION
221408 17:58:29.176602 192.168.122.73 → 192.168.122.72 NFS 170 V4 Call DESTROY_SESSION
221409 17:58:29.176622 192.168.122.72 → 192.168.122.73 NFS 114 V4 Reply (Call In 221408) DESTROY_SESSION Status: NFS4ERR_BADSESSION
221410 17:58:29.176769 192.168.122.73 → 192.168.122.72 NFS 266 V4 Call CREATE_SESSION
221411 17:58:29.176797 192.168.122.72 → 192.168.122.73 NFS 114 V4 Reply (Call In 221410) CREATE_SESSION Status: NFS4ERR_STALE_CLIENTID
221412 17:58:29.176991 192.168.122.73 → 192.168.122.72 NFS 306 V4 Call EXCHANGE_ID
221413 17:58:29.177018 192.168.122.72 → 192.168.122.73 NFS 170 V4 Reply (Call In 221412) EXCHANGE_ID
221414 17:58:29.177176 192.168.122.73 → 192.168.122.72 NFS 266 V4 Call CREATE_SESSION
221416 17:58:30.166895 192.168.122.72 → 192.168.122.73 NFS 194 V4 Reply (Call In 221414) CREATE_SESSION
221417 17:58:30.167092 192.168.122.73 → 192.168.122.72 NFS 202 V4 Call PUTROOTFH | GETATTR
221419 17:58:30.167135 192.168.122.72 → 192.168.122.73 NFS 182 V4 Reply (Call In 221417) PUTROOTFH | GETATTR
221420 17:58:30.167473 192.168.122.73 → 192.168.122.72 NFS 318 V4 Call OPEN DH: 0x9a4cedf5/
221421 17:58:30.167517 192.168.122.72 → 192.168.122.73 NFS 166 V4 Reply (Call In 221420) OPEN Status: NFS4ERR_NO_GRACE
221422 17:58:30.167758 192.168.122.73 → 192.168.122.72 NFS 194 V4 Call RECLAIM_COMPLETE
221424 17:58:30.243923 192.168.122.72 → 192.168.122.73 NFS 158 V4 Reply (Call In 221422) RECLAIM_COMPLETE
221425 17:58:30.244109 192.168.122.73 → 192.168.122.72 NFS 326 V4 Call OPEN DH: 0x9a4cedf5/
221427 17:58:30.244178 192.168.122.72 → 192.168.122.73 NFS 386 V4 Reply (Call In 221425) OPEN StateID: 0x0ad9
- CREATE_SESSION then failed with NFS4ERR_STALE_CLIENTID. The clientid had expired, so the client needed EXCHANGE_ID to obtain a new clientid.
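When reviewing a longer capture, it can help to pull out only the replies that indicate a lost session or clientid. The following is a minimal sketch that assumes the trace has been saved as text in the same one-frame-per-line format shown above. The function name find_clientid_errors is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: read a text trace on stdin (one frame per line) and
# print only the replies that report a lost session or an expired clientid.
find_clientid_errors() {
    grep -E 'Status: NFS4ERR_(BADSESSION|STALE_CLIENTID)'
}

# Demonstration using three reply lines copied from the trace above.
find_clientid_errors <<'EOF'
221403 17:58:28.006409 192.168.122.72 → 192.168.122.73 NFS 310 V4 Reply (Call In 221402) GETATTR
221406 17:58:29.176312 192.168.122.72 → 192.168.122.73 NFS 114 V4 Reply (Call In 221405) SEQUENCE Status: NFS4ERR_BADSESSION
221411 17:58:29.176797 192.168.122.72 → 192.168.122.73 NFS 114 V4 Reply (Call In 221410) CREATE_SESSION Status: NFS4ERR_STALE_CLIENTID
EOF
```

A NFS4ERR_BADSESSION reply followed shortly by NFS4ERR_STALE_CLIENTID, as in this trace, is the pattern described in this article.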
Resolution
You can use one of the following solutions.
For Bare metal/Virtual/Container systems
- Use unique hostnames for NFS clients.
- Use NFSv3.
- Configure the nfs4_unique_id parameter of the nfs kernel module on each NFS client.
# more /etc/machine-id
92dbe4656355499696d3a2d254c1426f
# echo "options nfs nfs4_unique_id=92dbe4656355499696d3a2d254c1426f" > /etc/modprobe.d/nfs.conf
# lsmod|grep nfs
# modprobe nfs
# more /sys/module/nfs/parameters/nfs4_unique_id
92dbe4656355499696d3a2d254c1426f
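The steps above can be sketched as a small helper that builds the option line from the machine ID instead of copying the value by hand. This is a minimal sketch, not a supported tool: the function name nfs_unique_id_option and its optional file argument are hypothetical, and it assumes the machine ID is the first line of the file.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe option line for nfs4_unique_id.
# $1 is the file holding the ID (defaults to /etc/machine-id).
nfs_unique_id_option() {
    id_file=${1:-/etc/machine-id}
    # Fail clearly if the file is missing or empty.
    if [ ! -s "$id_file" ]; then
        echo "cannot read $id_file" >&2
        return 1
    fi
    # Use the first line and strip any stray whitespace.
    id=$(head -n 1 "$id_file" | tr -d '[:space:]')
    printf 'options nfs nfs4_unique_id=%s\n' "$id"
}

# Example use (as root):
#   nfs_unique_id_option > /etc/modprobe.d/nfs.conf
```

As with the echo command above, redirecting with > replaces any existing content of /etc/modprobe.d/nfs.conf, so check that file for other options first.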
For Container systems
- Configure /sys/fs/nfs/net/nfs_client/identifier in each container (available in RHEL 8.3 or later kernels). For example, a uniquifier might be formed at boot using the container's internal identifier:
$ sha256sum /etc/machine-id | awk '{print $1}' \
> /sys/fs/nfs/net/nfs_client/identifier
- Note that the path /sys/fs/nfs/client/net/identifier listed in the kernel-doc is incorrect and will be fixed in a future release.
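Because the identifier file exists only on kernels that support it, a boot-time script may want to check for it before writing. The following is a minimal sketch under that assumption. The function names make_uniquifier and set_nfs_identifier are hypothetical, and the two optional arguments exist only so the logic can be tried against ordinary files.

```shell
#!/bin/sh
# Hypothetical helper: derive a hex uniquifier from the contents of a file.
make_uniquifier() {
    sha256sum "$1" | awk '{print $1}'
}

# Hypothetical helper: write the uniquifier only if the target file exists.
# $1 = source file (defaults to /etc/machine-id)
# $2 = identifier file (defaults to the sysfs path used in this article)
set_nfs_identifier() {
    src=${1:-/etc/machine-id}
    target=${2:-/sys/fs/nfs/net/nfs_client/identifier}
    if [ ! -e "$target" ]; then
        echo "$target not found; this kernel may not provide it" >&2
        return 1
    fi
    make_uniquifier "$src" > "$target"
}

# Example use inside the container (as root):
#   set_nfs_identifier
```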
Root Cause
- When several NFS clients use the same hostname, the default uniform client string may not be sufficiently
unique, so the NFS server cannot distinguish between the clients. When a second client presents the same
string, the NFS server concludes that the client has restarted and invalidates/expires the first clientid,
preventing the first client from communicating.
- This issue was tracked in the following bug.
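The collision can be illustrated with a simplified model of how a client string is composed. This is only a sketch: the exact string format is an assumption made for illustration, the function name client_string is hypothetical, and the second uniquifier value below is invented.

```shell
#!/bin/sh
# Simplified model (an assumption, not the kernel's exact code): the client
# string is built from the hostname, optionally prefixed by a uniquifier.
client_string() {
    host=$1
    uniquifier=${2:-}
    if [ -n "$uniquifier" ]; then
        printf 'Linux NFSv4.1 %s/%s\n' "$uniquifier" "$host"
    else
        printf 'Linux NFSv4.1 %s\n' "$host"
    fi
}

# Two clients that share a hostname and set no uniquifier build the same string.
a=$(client_string nfsclient)
b=$(client_string nfsclient)
if [ "$a" = "$b" ]; then
    echo "same string: the server cannot tell the clients apart"
fi

# With distinct uniquifiers the strings differ (second value is invented).
a=$(client_string nfsclient 92dbe4656355499696d3a2d254c1426f)
b=$(client_string nfsclient 0f3c1d5e7a9b4c2d8e6f1a3b5c7d9e0f)
if [ "$a" != "$b" ]; then
    echo "different strings: the server tracks two clients"
fi
```

In this model a uniquifier is enough to separate clients that keep the same hostname.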