docker containerd error message 'panic: close of nil channel' in RHEL 7
Environment
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux Atomic Host
- docker 1.13.1-53.git774336d.el7 to 1.13.1-68.gitdded712.el7
- docker-latest 1.13.1
Issue
- Why does docker containerd throw an error message on RHEL 7?
Feb 1 02:07:53 testlabdockerd[3616]: panic: close of nil channel
Feb 1 02:07:53 testlabdockerd[3616]: goroutine 34 [running]:
Feb 1 02:07:53 testlabdockerd[3616]: panic(0x887ae0, 0xc4203300c0)
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/panic.go:531 +0x1cf fp=0xc42030fe88 sp=0xc42030fdf0
Feb 1 02:07:53 testlabdockerd[3616]: runtime.closechan(0x0)
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/chan.go:322 +0x343 fp=0xc42030ff10 sp=0xc42030fe88
Feb 1 02:07:53 testlabdockerd[3616]: github.com/docker/containerd/supervisor.(*Supervisor).execExit.func1(0xc420336000, 0xc420202340, 0x0)
Feb 1 02:07:53 testlabdockerd[3616]: /builddir/build/BUILD/docker-9a813fad75217ff3a3c1e0c1ecf5a9dd9dfbccf1/_build/src/github.com/docker/containerd/supervisor/exit.go:92 +0x10c fp=0xc42030ffc8 sp=0xc42030ff10
Feb 1 02:07:53 testlabdockerd[3616]: runtime.goexit()
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42030ffd0 sp=0xc42030ffc8
Feb 1 02:07:53 testlabdockerd[3616]: created by github.com/docker/containerd/supervisor.(*Supervisor).execExit
Feb 1 02:07:53 testlabdockerd[3616]: /builddir/build/BUILD/docker-9a813fad75217ff3a3c1e0c1ecf5a9dd9dfbccf1/_build/src/github.com/docker/containerd/supervisor/exit.go:93 +0x198
Feb 1 02:07:53 testlabdockerd[3616]: goroutine 1 [chan receive]:
Feb 1 02:07:53 testlabdockerd[3616]: runtime.gopark(0x936188, 0xc420091258, 0x91c076, 0xc, 0xc420155817, 0x3)
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/proc.go:271 +0x13a fp=0xc4201557d8 sp=0xc4201557a8
Feb 1 02:07:53 testlabdockerd[3616]: runtime.goparkunlock(0xc420091258, 0x91c076, 0xc, 0xc420155817, 0x3)
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/proc.go:277 +0x5e fp=0xc420155818 sp=0xc4201557d8
Feb 1 02:07:53 testlabdockerd[3616]: runtime.chanrecv(0x858000, 0xc420091200, 0xc420155a08, 0xc4201eba01, 0xc4201ee250)
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/chan.go:513 +0x371 fp=0xc4201558b8 sp=0xc420155818
Feb 1 02:07:53 testlabdockerd[3616]: runtime.chanrecv2(0x858000, 0xc420091200, 0xc420155a08, 0x34)
Feb 1 02:07:53 testlabdockerd[3616]: /usr/lib/golang/src/runtime/chan.go:400 +0x35 fp=0xc4201558f0 sp=0xc4201558b8
Feb 1 02:07:53 testlabdockerd[3616]: main.daemon(0xc420119540, 0xd, 0x0)
Feb 1 02:07:53 testlabdockerd[3616]: /builddir/build/BUILD/docker-9a813fad75217ff3a3c1e0c1ecf5a9dd9dfbccf1/containerd-89a5d2ce19344c8c8bbfef03b43434f60a4afcc2/containerd/main.go:191 +0x583 fp=0xc420155a88 sp=0xc4201558f0
Feb 1 02:07:53 testlabdockerd[3616]: main.main.func2(0xc420119540)
Feb 1 02:07:54 testlabdockerd[3616]: /builddir/build/BUILD/docker-9a813fad75217ff3a3c1e0c1ecf5a9dd9dfbccf1/containerd-89a5d2ce19344c8c8bbfef0
- Docker went into a hang state with the following message:
Apr 09 14:51:19 testlabdockerd dockerd-current[1541]: time="2018-04-09T14:51:19.229223532-04:00" level=error msg="Handler for POST /v1.26/containers/601e0bf46120440ff58ac5416619b719a647cded17d15f19106dc890e72e732e/start returned error: grpc: the connection is unavailable"
Resolution
Upgrade to docker-1.13.1-74.git6e3bb8e.el7 or later.
Root Cause
An unexpected restart of containerd (docker-containerd-current) can leave the directory structure under /run/docker/libcontainerd/containerd/ in an inconsistent state. This happens when containerd crashes, is killed, or is restarted immediately after a docker exec process exits in a container, but before containerd has recorded the new state of that process in the /run/docker/libcontainerd/containerd/ directory tree. When containerd starts again and traverses the tree, it panics. The current workaround is to restart the docker service. To avoid triggering the bug at all, resolve whatever caused containerd to restart in the first place.
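The panic message itself comes from a Go runtime rule: closing a channel that was never initialized (a nil channel) always aborts with "close of nil channel", which is what the runtime.closechan(0x0) frame in the trace reports. The sketch below is not containerd code; it is a minimal, hypothetical reproduction of that runtime behavior:

```go
package main

import "fmt"

// demoNilClose closes a nil channel and recovers the resulting runtime
// panic, returning its message. In containerd the same panic occurred in
// supervisor/exit.go when a channel for an already-exited exec process
// was never set up after the restart.
func demoNilClose() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r) // capture the runtime's panic message
		}
	}()
	var ch chan struct{} // nil: declared but never created with make()
	close(ch)            // always panics: "close of nil channel"
	return "no panic"
}

func main() {
	fmt.Println(demoNilClose()) // prints "close of nil channel"
}
```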
In almost all instances where this bug is triggered, the containerd restart was caused by an overloaded system. The docker daemon (dockerd-current) communicates with containerd and, when it decides that containerd is not responding (or, in these cases, not responding fast enough), restarts it. If the restart happens at just the wrong moment the bug is triggered; otherwise the restart usually goes unnoticed by the sysadmin. Even with the fixed packages, the containerd restarts themselves will continue until the excessive load is resolved by the sysadmin.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.