rafthttp: the clock difference against peer is too high in OpenShift etcd
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4
- etcd
Issue
- Messages like the following are shown in the etcd logs:

      W | rafthttp: the clock difference against peer xxxxxxxxxxxxxxxx is too high [4m18.466926704s > 1s]
      W | rafthttp: the clock difference against peer xxxxxxxxxxxxxxxx is too high [4m18.463381838s > 1s]

- The liveness probes for the etcd pods fail.
Resolution
Use NTP servers to keep the clocks of all the nodes synchronized, as explained in "configuring NTP/chrony in OpenShift 4".
If Chrony is configured but is not synchronizing properly, refer to "chronyd is not synchronizing with NTP server" and "troubleshooting the NTP Chrony time service in Red Hat OpenShift Container Platform".
Root Cause
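There is a time difference between the master nodes, which causes the issue. etcd logs this warning when the clock difference against a peer exceeds the 1s threshold shown in the message.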
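To see at a glance which peers are affected, the measured difference can be pulled out of the warnings. The following is a minimal sketch, not an official tool: `clock_diffs` is a hypothetical helper name, and it assumes the log lines follow the format shown in the Issue section, where the bracketed value is the measured difference followed by the threshold.

```shell
# Hypothetical helper (not part of oc or etcd): reads etcd log lines on stdin
# and prints "<peer-id> <measured-difference>" for each clock-difference warning.
# Assumes the message format "... against peer <id> is too high [<diff> > <limit>]".
clock_diffs() {
  sed -n 's/.*the clock difference against peer \([^ ]*\) is too high \[\([^ ]*\) > [^]]*].*/\1 \2/p'
}

# Example usage (pod name taken from the diagnostic steps, adjust as needed):
#   oc logs etcd-master1.ocp.example.com -n openshift-etcd -c etcd | clock_diffs
```

Lines that do not contain the warning are ignored, so the helper can be fed the full log.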
Diagnostic Steps
- Check the logs of the etcd pods:

      $ oc logs etcd-master1.ocp.example.com -n openshift-etcd -c etcd | grep "the clock difference against peer"
      2021-09-24T06:39:16.408674158Z 2021-09-24 06:39:16.408617 W | rafthttp: the clock difference against peer xxxxxxxxxxxxxxxx is too high [4m18.466926704s > 1s]
      2021-09-24T06:39:16.465279570Z 2021-09-24 06:39:16.465225 W | rafthttp: the clock difference against peer xxxxxxxxxxxxxxxx is too high [4m18.463381838s > 1s]

- Check the time on each master node:

      $ for NODE in $(oc get nodes -l node-role.kubernetes.io/control-plane= -o name); do echo "-------------- $NODE ------------"; oc debug -q ${NODE} -- chroot /host bash -c "hostname; echo; timedatectl"; echo; done
      -------------- node/master-0.openshift.example.com ------------
      master-0.openshift.example.com

                     Local time: Fri 2021-09-24 13:55:21 UTC
                 Universal time: Fri 2021-09-24 13:55:21 UTC
                       RTC time: Fri 2021-09-24 13:55:38
                      Time zone: UTC (UTC, +0000)
      System clock synchronized: no
                    NTP service: active
                RTC in local TZ: no

      -------------- node/master-1.openshift.example.com ------------
      master-1.openshift.example.com

                     Local time: Fri 2021-09-24 13:55:43 UTC
                 Universal time: Fri 2021-09-24 13:55:43 UTC
                       RTC time: Fri 2021-09-24 13:56:01
                      Time zone: UTC (UTC, +0000)
      System clock synchronized: no
                    NTP service: active
                RTC in local TZ: no

      -------------- node/master-2.openshift.example.com ------------
      master-2.openshift.example.com

                     Local time: Fri 2021-09-24 13:52:04 UTC    ### high time difference with other nodes
                 Universal time: Fri 2021-09-24 13:52:04 UTC    ### high time difference with other nodes
                       RTC time: Fri 2021-09-24 13:56:39
                      Time zone: UTC (UTC, +0000)
      System clock synchronized: no
                    NTP service: active
                RTC in local TZ: no

  If oc debug node is not working, try with SSH via the IPs of the control plane nodes:

      $ oc get nodes -l node-role.kubernetes.io/control-plane= -o wide
      [...]
      $ export MASTER_IP='<IP-1> <IP-2> <IP-3> '
      $ for IP in $MASTER_IP; do ssh core@$IP -o "StrictHostKeyChecking=no" -C bash -c "hostname; echo; timedatectl"; echo; done
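In the output above, every node reports "System clock synchronized: no". That field can also be checked automatically instead of being read by eye. The following is a sketch under stated assumptions: `clock_is_synced` is a hypothetical helper name, and it assumes timedatectl prints the field exactly as shown above (other systemd versions may label it differently).

```shell
# Hypothetical helper (not part of oc or systemd): reads `timedatectl` output
# on stdin and succeeds only if "System clock synchronized" is "yes".
# Assumes the field label matches the output shown in this article.
clock_is_synced() {
  grep -Eq '^[[:space:]]*System clock synchronized:[[:space:]]*yes[[:space:]]*$'
}

# Example usage, reusing the loop from the diagnostic step:
#   for NODE in $(oc get nodes -l node-role.kubernetes.io/control-plane= -o name); do
#     oc debug -q ${NODE} -- chroot /host timedatectl | clock_is_synced \
#       || echo "$NODE: system clock not synchronized"
#   done
```

The helper only reports whether each node considers itself synchronized. It does not measure the offset between nodes, so the timestamps still need to be compared as described above.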