NFSv4 clientid expired suddenly because several NFS clients use the same hostname

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 7.6 or later
  • NFSv4.x

Issue

  • The NFS server suddenly returned NFS4ERR_BADSESSION, although the GETATTR call immediately before it succeeded.
 221402 17:58:28.006327  192.168.122.73 → 192.168.122.72  NFS 258 V4 Call GETATTR FH: 0x9a4cedf5
 221403 17:58:28.006409  192.168.122.72 → 192.168.122.73  NFS 310 V4 Reply (Call In 221402) GETATTR
 221405 17:58:29.176280  192.168.122.73 → 192.168.122.72  NFS 246 V4 Call GETATTR FH: 0xa202f392
 221406 17:58:29.176312  192.168.122.72 → 192.168.122.73  NFS 114 V4 Reply (Call In 221405) SEQUENCE Status: NFS4ERR_BADSESSION
 221408 17:58:29.176602  192.168.122.73 → 192.168.122.72  NFS 170 V4 Call DESTROY_SESSION
 221409 17:58:29.176622  192.168.122.72 → 192.168.122.73  NFS 114 V4 Reply (Call In 221408) DESTROY_SESSION Status: NFS4ERR_BADSESSION
 221410 17:58:29.176769  192.168.122.73 → 192.168.122.72  NFS 266 V4 Call CREATE_SESSION
 221411 17:58:29.176797  192.168.122.72 → 192.168.122.73  NFS 114 V4 Reply (Call In 221410) CREATE_SESSION Status: NFS4ERR_STALE_CLIENTID
 221412 17:58:29.176991  192.168.122.73 → 192.168.122.72  NFS 306 V4 Call EXCHANGE_ID
 221413 17:58:29.177018  192.168.122.72 → 192.168.122.73  NFS 170 V4 Reply (Call In 221412) EXCHANGE_ID
 221414 17:58:29.177176  192.168.122.73 → 192.168.122.72  NFS 266 V4 Call CREATE_SESSION
 221416 17:58:30.166895  192.168.122.72 → 192.168.122.73  NFS 194 V4 Reply (Call In 221414) CREATE_SESSION
 221417 17:58:30.167092  192.168.122.73 → 192.168.122.72  NFS 202 V4 Call PUTROOTFH | GETATTR
 221419 17:58:30.167135  192.168.122.72 → 192.168.122.73  NFS 182 V4 Reply (Call In 221417) PUTROOTFH | GETATTR
 221420 17:58:30.167473  192.168.122.73 → 192.168.122.72  NFS 318 V4 Call OPEN DH: 0x9a4cedf5/
 221421 17:58:30.167517  192.168.122.72 → 192.168.122.73  NFS 166 V4 Reply (Call In 221420) OPEN Status: NFS4ERR_NO_GRACE
 221422 17:58:30.167758  192.168.122.73 → 192.168.122.72  NFS 194 V4 Call RECLAIM_COMPLETE
 221424 17:58:30.243923  192.168.122.72 → 192.168.122.73  NFS 158 V4 Reply (Call In 221422) RECLAIM_COMPLETE
 221425 17:58:30.244109  192.168.122.73 → 192.168.122.72  NFS 326 V4 Call OPEN DH: 0x9a4cedf5/
 221427 17:58:30.244178  192.168.122.72 → 192.168.122.73  NFS 386 V4 Reply (Call In 221425) OPEN StateID: 0x0ad9
  • CREATE_SESSION then failed with NFS4ERR_STALE_CLIENTID. The clientid had expired, so the client had to send EXCHANGE_ID to obtain a new clientid.
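
To spot this pattern in a longer capture, the text output can be scanned for the two status codes involved. The following is a minimal sketch, assuming the trace was saved as text with one packet per line, in the format shown above. The function name find_clientid_expiry and the file name trace.txt are hypothetical.

```shell
#!/bin/sh
# Sketch: list replies whose status suggests the server discarded the
# session or the clientid. Assumes one packet per line, as in the trace
# above. find_clientid_expiry is a hypothetical helper name.
find_clientid_expiry() {
    # $1: path to a text export of the capture
    grep -E 'Status: NFS4ERR_(BADSESSION|STALE_CLIENTID)' "$1"
}

# Example usage (trace.txt is a hypothetical file name):
# find_clientid_expiry trace.txt
```

Replies such as NFS4ERR_NO_GRACE are deliberately not matched, because in the trace above they follow from the clientid loss rather than indicate it.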

Resolution

You can use one of the following solutions.

For Bare metal/Virtual/Container systems

  • Use unique hostnames for NFS clients.
  • Use NFSv3.
  • Configure the nfs4_unique_id parameter of the nfs kernel module on each NFS client.
 # more /etc/machine-id
 92dbe4656355499696d3a2d254c1426f

 # echo "options nfs nfs4_unique_id=92dbe4656355499696d3a2d254c1426f" > /etc/modprobe.d/nfs.conf
 # lsmod | grep nfs
 # modprobe nfs
 # more /sys/module/nfs/parameters/nfs4_unique_id
 92dbe4656355499696d3a2d254c1426f
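
When the same steps have to be repeated on many clients, the option line can be generated from each client's own machine ID instead of being typed by hand. The following is a minimal sketch under that assumption. The helper name nfs_option_line is hypothetical and not part of any Red Hat tooling. Options in /etc/modprobe.d are applied when the module is loaded, so if lsmod shows that nfs is already loaded, the value takes effect the next time the module is loaded (for example, after a reboot).

```shell
#!/bin/sh
# Sketch: build the modprobe option line from a machine-id file.
# nfs_option_line is a hypothetical helper; the default path is
# /etc/machine-id, as in the commands above.
nfs_option_line() {
    id_file="${1:-/etc/machine-id}"
    # Strip the trailing newline and any other whitespace from the ID.
    id=$(tr -d '[:space:]' < "$id_file")
    if [ -z "$id" ]; then
        echo "empty or unreadable machine-id: $id_file" >&2
        return 1
    fi
    printf 'options nfs nfs4_unique_id=%s\n' "$id"
}

# Example usage, as root (overwrites the file, as the commands above do):
# line=$(nfs_option_line) && printf '%s\n' "$line" > /etc/modprobe.d/nfs.conf
```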

For Container systems

  • Configure /sys/fs/nfs/net/nfs_client/identifier in each container (available in RHEL 8.3 or later kernels). For example, a uniquifier might be formed at boot using the container's internal identifier:
$ sha256sum /etc/machine-id | awk '{print $1}' \
      > /sys/fs/nfs/net/nfs_client/identifier
  • Note that the path /sys/fs/nfs/client/net/identifier listed in the kernel-doc is incorrect and will be fixed in a future release.
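
The sha256sum command above can be wrapped so that it fails cleanly on kernels where the identifier file is not present. The following is a minimal sketch. The helper name set_nfs_identifier is hypothetical, and the two arguments exist only so that the paths can be overridden.

```shell
#!/bin/sh
# Sketch: write a hashed machine-id to the NFS client identifier file.
# set_nfs_identifier is a hypothetical helper name.
set_nfs_identifier() {
    id_file="${1:-/etc/machine-id}"
    target="${2:-/sys/fs/nfs/net/nfs_client/identifier}"
    # Stop early if the identifier file is absent or not writable,
    # for example on a kernel older than the ones noted above.
    if [ ! -w "$target" ]; then
        echo "$target is missing or not writable" >&2
        return 1
    fi
    # Same pipeline as above: keep only the hash, drop the file name.
    sha256sum "$id_file" | awk '{print $1}' > "$target"
}

# Example usage, run inside the container at boot:
# set_nfs_identifier
```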

Root Cause

  • The Linux NFSv4 client includes its hostname in the client identifier it sends to the server. When several clients share a hostname and no unique ID is configured, the server treats them as a single client, so an EXCHANGE_ID from one client can invalidate the clientid and sessions held by another.
