Ceph/ODF: MDS crashing (CLBO); the crash backtrace shows "_unlink" or "_unlink_local"
Environment
Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Red Hat Ceph Storage (RHCS) 7.x
Ceph File System (CephFS)
Issue
MDS crashing (CLBO); the crash backtrace shows "_unlink" or "_unlink_local".
The Ceph Metadata Server (MDS) daemon crashes frequently, and "unlink" appears in the backtrace of the crash:
/builddir/build/BUILD/ceph-14.2.11/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7f7848acc700 time 2022-02-23 08:48:31.877094
/builddir/build/BUILD/ceph-14.2.11/src/mds/Server.cc: 7023: FAILED ceph_assert(in->first <= straydn->first)
ceph version 14.2.11-208.el8cp (6738ba96f296a41c24357c12e8d594fbde457abc) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x7f78578a9308]
2: (()+0x275522) [0x7f78578a9522]
3: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0xfbc) [0x558e8d56654c]
4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0xd4c) [0x558e8d56b73c]
5: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xaab) [0x558e8d58122b]
6: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x402) [0x558e8d5819a2]
7: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x12a) [0x558e8d58e44a]
8: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0xa94) [0x558e8d4f7344]
9: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x80f) [0x558e8d4f975f]
10: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x16) [0x558e8d4f9d66]
11: (MDSContext::complete(int)+0x7f) [0x558e8d79b5df]
12: (MDSRank::_advance_queues()+0xac) [0x558e8d4f86ec]
13: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1ed) [0x558e8d4f913d]
14: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x16) [0x558e8d4f9d66]
15: (MDSContext::complete(int)+0x7f) [0x558e8d79b5df]
16: (MDSRank::_advance_queues()+0xac) [0x558e8d4f86ec]
17: (MDSRank::ProgressThread::entry()+0x45) [0x558e8d4f8e25]
18: (()+0x817a) [0x7f785568917a]
19: (clone()+0x43) [0x7f78541a0dc3]
debug 2022-02-23 08:48:31.877 7f7848acc700 -1 /builddir/build/BUILD/ceph-14.2.11/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7f7848acc700 time 2022-02-23 08:48:31.877094
/builddir/build/BUILD/ceph-14.2.11/src/mds/Server.cc: 7023: FAILED ceph_assert(in->first <= straydn->first)
To view current crashes, use the command ceph crash ls. To view more information about a specific crash, use ceph crash info <crashid>. For additional information, see the Ceph Crash Module documentation at docs.ceph.com.
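The crash-inspection step above can be sketched as a small filter over the ceph crash ls output that keeps only MDS crashes. Since no live cluster is available here, sample output is inlined (the crash IDs and the two-column layout are illustrative assumptions, not real cluster data):

```shell
#!/bin/sh
# Sample 'ceph crash ls' output (ID and entity columns; the IDs shown
# here are made up for illustration).
crashes='2022-02-23T08:48:31.877094Z_9c4f1a2b  mds.ocs-storagecluster-cephfilesystem-a
2022-02-22T10:11:12.000000Z_1d2e3f4a  osd.3'

# Keep only MDS crash IDs; on a live cluster each ID could then be
# passed to 'ceph crash info <crashid>' to inspect the backtrace.
mds_ids=$(echo "$crashes" | awk '$2 ~ /^mds\./ {print $1}')
echo "$mds_ids"
```

On a live cluster the same filter can be applied directly to the real command output, for example `ceph crash ls | awk '$2 ~ /^mds\./ {print $1}'`.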
Resolution
- To avoid further MDS corruption, rename the _deleting folder to a temporary name. See the Diagnostic Steps section for details.
- After the MDS has stabilized, Red Hat Engineering should be consulted to determine how to resolve the corruption.
Root Cause
While the MDS is removing deleted CSI volume data, it encounters MDS metadata corruption, causing the MDS pods to enter CrashLoopBackOff (CLBO).
Diagnostic Steps
- Scale down the Ceph-MGR deployment. Because CephFS volume deletions are processed through the Ceph-MGR daemons, the Ceph-MDS should stabilize within a few seconds.
$ oc scale deployment -n openshift-storage -l app=rook-ceph-mgr --replicas=0
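Before proceeding, it is worth confirming that the MGR pods have actually terminated. A minimal sketch of such a wait loop is shown below; the get_mgr_pods function is a stub standing in for `oc get pods -l app=rook-ceph-mgr --no-headers` so the sketch runs without a cluster:

```shell
#!/bin/sh
# Stub standing in for 'oc get pods -l app=rook-ceph-mgr --no-headers';
# it simulates the MGR pod disappearing after the second poll.
get_mgr_pods() {
    if [ "${POLL:-0}" -ge 2 ]; then
        :  # no pods left
    else
        echo "rook-ceph-mgr-a-5f6d8  1/1  Terminating  0  3d"
    fi
}

# Poll until no MGR pods remain.
POLL=0
while [ -n "$(get_mgr_pods)" ]; do
    POLL=$((POLL + 1))
    # sleep 5   # real polling interval on a live cluster
done
echo "mgr pods gone after $POLL polls"
```

On a real cluster, replace the stub with the actual oc command and keep the sleep in the loop body.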
- Debug into an OpenShift worker node and follow "Mount the ODF CephFS volume on an OpenShift Worker Node" to mount the CephFS volume.
- Navigate to the mount point where the CephFS volume is mounted and list its contents:
sh-5.1# cd /mnt/cephfs/
sh-5.1# ls -l
-rwxr-xr-x. 1 root root 0 Nov 22 10:17 _csi:csi-vol-d9400293-6a4e-11ed-9693-0a580a02181f.meta
-rwxr-xr-x. 1 root root 0 Nov 28 14:20 _csi:csi-vol-d97b6574-6f27-11ed-9693-0a580a02181f.meta
-rwxr-xr-x. 1 root root 0 Jan 19 12:15 _csi:csi-vol-f87eb94a-97f2-11ed-92d4-0a580a021821.meta
-rwxr-xr-x. 1 root root 0 Jan 19 12:15 _csi:csi-vol-fba1de4a-97f2-11ed-92d4-0a580a021821.meta
drwx------. 2 root root 1 Jan 19 12:23 _deleting
drwx------. 3 root root 1 Sep 16 13:25 _index
drwx------. 2 root root 0 Jan 19 09:48 _legacy
drwxr-xr-x. 44 root root 42 Jan 19 13:07 csi
- As a temporary workaround, rename the _deleting folder to _deleting.tmp:
sh-5.1# mv _deleting _deleting.tmp
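The workaround itself is a single mv on the mounted volume. The sketch below performs the same rename against a scratch directory so it can run anywhere; on the cluster the path would instead be the CephFS mount point (/mnt/cephfs in the listing above):

```shell
#!/bin/sh
# Stand-in for the CephFS mount point; on the cluster this would be
# /mnt/cephfs from the worker-node debug session.
mnt=$(mktemp -d)
mkdir "$mnt/_deleting"

# The workaround: move _deleting out of the way so the MDS no longer
# hits the corrupted entries while purging deleted volume data.
mv "$mnt/_deleting" "$mnt/_deleting.tmp"
ls "$mnt"
```

Note that this only sidesteps the purge of the corrupted entries; the renamed folder still holds the data and must be dealt with in consultation with Engineering, as stated in the Resolution.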
- Scale the Ceph-MGR deployment back up:
$ oc scale deployment -n openshift-storage -l app=rook-ceph-mgr --replicas=1
- Verify that the Ceph-MDS pods are in a Running state:
$ oc get pods -n openshift-storage -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-69db9946vqcmm 2/2 Running 0 13d
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-7f6bb8c9z5s8k 2/2 Running 0 13d
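The Running check above can also be automated by flagging any pod whose STATUS column differs from Running. Sample output from the step above is inlined so the sketch runs without a cluster; on a live system the real `oc get pods` output would be piped in instead:

```shell
#!/bin/sh
# Sample 'oc get pods -n openshift-storage -l app=rook-ceph-mds' output
# (taken from the listing above, minus the header line).
pods='rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-69db9946vqcmm  2/2  Running  0  13d
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-7f6bb8c9z5s8k  2/2  Running  0  13d'

# Collect the names of any pods that are not Running.
bad=$(echo "$pods" | awk '$3 != "Running" {print $1}')
if [ -z "$bad" ]; then
    echo "all MDS pods Running"
else
    echo "not Running: $bad"
fi
```

If any pod is still in CrashLoopBackOff at this point, revisit the earlier steps before scaling anything else.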
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.