CRI-O panics with "panic: close of closed channel" after attempting to stop a container in OpenShift Container Platform 4
Environment
- Red Hat OpenShift Container Platform (OCP) 4
Issue
- crio panics with "panic: close of closed channel" and the following stack trace:

Mar 14 15:23:27 worker-1 hyperkube[1692]: I0314 15:23:27.387723 1692 kubelet.go:1954] "SyncLoop REMOVE" source="api" pods=[debug/worker-1-debug]
Mar 14 15:23:27 worker-1 hyperkube[1692]: I0314 15:23:27.387792 1692 kubelet_pods.go:1285] "Killing unwanted pod" podName="worker-1-debug"
Mar 14 15:23:27 worker-1 hyperkube[1692]: I0314 15:23:27.387838 1692 kuberuntime_container.go:720] "Killing container with a grace period override" pod="debug/worker-1-debug" podUID=ed3fba93-12fd-40ba-af22-207bc2dd7ebd containerName="container-00" containerID="cri-o://5e4a5efe282d3f77fe472d8810fa9a8a61df545a6087a7e8ecaa9379b7f1fa5c" gracePeriod=2
Mar 14 15:23:27 worker-1 systemd[1]: crio-conmon-5e4a5efe282d3f77fe472d8810fa9a8a61df545a6087a7e8ecaa9379b7f1fa5c.scope: Consumed 55ms CPU time
Mar 14 15:23:27 worker-1 crio[1636]: panic: close of closed channel
Mar 14 15:23:27 worker-1 crio[1636]: goroutine 5778599 [running]:
Mar 14 15:23:27 worker-1 crio[1636]: panic(0x55c2b280a280, 0x55c2b2aa4f90)
Mar 14 15:23:27 worker-1 crio[1636]: /usr/lib/golang/src/runtime/panic.go:1065 +0x565 fp=0xc001827530 sp=0xc001827468 pc=0x55c2b098e8a5
Mar 14 15:23:27 worker-1 crio[1636]: runtime.closechan(0xc00187e300)
Mar 14 15:23:27 worker-1 crio[1636]: /usr/lib/golang/src/runtime/chan.go:363 +0x3f5 fp=0xc001827570 sp=0xc001827530 pc=0x55c2b095cbb5
Mar 14 15:23:27 worker-1 crio[1636]: github.com/cri-o/cri-o/internal/oci.(*runtimeOCI).StopContainer.func1(0xc001827678, 0xc0015e5080)
Mar 14 15:23:27 worker-1 crio[1636]: /builddir/build/BUILD/cri-o-c05847896bc721f6529b1ceb4bafaf6cfe523b5d/_output/src/github.com/cri-o/cri-o/internal/oci/runtime_oci.go:701 +0x49 fp=0xc001827588 sp=0xc001827570 pc=0x55c2b1fd0a29
Mar 14 15:23:27 worker-1 crio[1636]: github.com/cri-o/cri-o/internal/oci.(*runtimeOCI).StopContainer(0xc001ad3530, 0x55c2b2b19a20, 0xc001a90c00, 0xc0015e5080, 0x2, 0x55c2b2ac7d68, 0xc0000da780)
Mar 14 15:23:27 worker-1 crio[1636]: /builddir/build/BUILD/cri-o-c05847896bc721f6529b1ceb4bafaf6cfe523b5d/_output/src/github.com/cri-o/cri-o/internal/oci/runtime_oci.go:710 +0x788 fp=0xc001827650 sp=0xc001827588 pc=0x55c2b1fc3948
Mar 14 15:23:27 worker-1 crio[1636]: github.com/cri-o/cri-o/internal/oci.(*Runtime).StopContainer(0xc0005fe5d0, 0x55c2b2b19a20, 0xc001a90c00, 0xc0015e5080, 0x2, 0x0, 0x8000101)
Mar 14 15:23:27 worker-1 crio[1636]: /builddir/build/BUILD/cri-o-c05847896bc721f6529b1ceb4bafaf6cfe523b5d/_output/src/github.com/cri-o/cri-o/internal/oci/oci.go:323 +0x9d fp=0xc001827698 sp=0xc001827650 pc=0x55c2b1fba85d
Mar 14 15:23:27 worker-1 crio[1636]: github.com/cri-o/cri-o/internal/lib.(*ContainerServer).StopContainer(0xc0006be180, 0x55c2b2b19a20, 0xc001a90c00, 0xc0015e5080, 0x2, 0xc000daebd0, 0x0)
Mar 14 15:23:27 worker-1 crio[1636]: /builddir/build/BUILD/cri-o-c05847896bc721f6529b1ceb4bafaf6cfe523b5d/_output/src/github.com/cri-o/cri-o/internal/lib/stop.go:14 +0x79 fp=0xc001827748 sp=0xc001827698 pc=0x55c2b2009159
Mar 14 15:23:27 worker-1 crio[1636]: github.com/cri-o/cri-o/server.(*Server).StopContainer(0xc0001ba580, 0x55c2b2b19a20, 0xc001a90c00, 0xc001827840, 0x55c2b0962325, 0x55c2b2977e00)
Mar 14 15:23:27 worker-1 crio[1636]: /builddir/build/BUILD/cri-o-c05847896bc721f6529b1ceb4bafaf6cfe523b5d/_output/src/github.com/cri-o/cri-o/server/container_stop.go:34 +0x349 fp=0xc001827810 sp=0xc001827748 pc=0x55c2b2076b69
Mar 14 15:23:27 worker-1 crio[1636]: github.com/cri-o/cri-o/server/cri/v1alpha2.(*service).StopContainer(0xc00059c080, 0x55c2b2b19a20, 0xc001a90c00, 0xc00195c220, 0xc00059c080, 0x1, 0x1)
Mar 14 15:23:27 worker-1 crio[1636]: /builddir/build/BUILD/cri-o-c05847896bc721f6529b1ceb4bafaf6cfe523b5d/_output/src/github.com/cri-o/cri-o/server/cri/v1alpha2/rpc_stop_container.go:17 +0x85 fp=0xc001827868 sp=0xc001827810 pc=0x55c2b2184625
Resolution
- This problem was resolved in Red Hat OpenShift Container Platform 4.9.31 via RHBA-2022:1605 and in Red Hat OpenShift Container Platform 4.10.11 via RHBA-2022:1431. Update to one of these versions or later to prevent the issue from happening.
- As a consequence of this bug, coredns and/or keepalived static Pods may be in Pending state. Refer to "coredns and keepalived Pods in a non-ready state in RHOCP 4" for more details.
Root Cause
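crio would panic (a Go runtime panic, not a segmentation fault) when receiving multiple stop requests for the same container. As the runtime.closechan frame beneath StopContainer in the stack trace shows, the stop path attempted to close a Go channel that had already been closed.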
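
The failure mode can be reproduced in a few lines of Go, the language crio is written in. In Go, closing a channel that is already closed always panics with "close of closed channel". The sketch below is only an illustration under stated assumptions: the stopper type and the functions newStopper, unsafeStop and doubleCloseMessage are hypothetical names invented for this example, they are not CRI-O code, and the sync.Once guard shown is one common way to make a close idempotent, not necessarily how the fix in the errata was implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// unsafeStop closes ch without any guard. Calling it twice on the same
// channel panics with "close of closed channel".
func unsafeStop(ch chan struct{}) {
	close(ch)
}

// doubleCloseMessage closes one channel twice and returns the text of the
// recovered panic, or "" if no panic occurred.
func doubleCloseMessage() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	ch := make(chan struct{})
	unsafeStop(ch) // first stop request: fine
	unsafeStop(ch) // second stop request: panics
	return ""
}

// stopper is a hypothetical container-stop handle (not a CRI-O type).
// sync.Once guarantees the channel is closed at most once.
type stopper struct {
	once sync.Once
	done chan struct{}
}

func newStopper() *stopper {
	return &stopper{done: make(chan struct{})}
}

// Stop is safe to call any number of times, from any goroutine.
func (s *stopper) Stop() {
	s.once.Do(func() { close(s.done) })
}

// Stopped reports whether Stop has been called at least once.
func (s *stopper) Stopped() bool {
	select {
	case <-s.done:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println("unguarded double close:", doubleCloseMessage())

	s := newStopper()
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ { // simulate three stop requests for one container
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop()
		}()
	}
	wg.Wait()
	fmt.Println("guarded stop completed:", s.Stopped())
}
```

In the unguarded variant the second close is what takes the process down. In crio the panic is not recovered, so the whole daemon exits, which matches the crash seen in the journal above.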