Fluentd is dropping logs -- Tuning Fluentd
Environment
- Red Hat OpenShift Container Platform
- 3.X
Issue
- Elasticsearch is not receiving all the logs from our pods
- I checked the Kibana dashboard and some entries are missing
- Fluentd appears to be losing messages targeted for Kibana
- We need a very large volume of application logs to be indexed in Kibana within a short period of time
- Only a portion of the messages are reaching Elasticsearch/Kibana
Resolution
- Check that Fluentd has enough resources allocated.
# oc edit daemonset logging-fluentd
. . .
        resources:
          limits:
            cpu: 800m
            memory: 1Gi
. . .
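As an alternative to editing the daemonset interactively, the limits can be raised in one command with a strategic merge patch. This is a sketch: the cpu/memory values are examples to size for your cluster, and it assumes the Fluentd container is named fluentd-elasticsearch, as in a default 3.x aggregated-logging deployment.

```shell
# Patch the Fluentd daemonset resource limits in place.
# Strategic merge patches match list entries by container name,
# so only the named container's limits are changed.
oc patch daemonset logging-fluentd -p \
  '{"spec":{"template":{"spec":{"containers":[{"name":"fluentd-elasticsearch","resources":{"limits":{"cpu":"1000m","memory":"2Gi"}}}]}}}}'
```

The daemonset controller then rolls the change out to the Fluentd pod on each node.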
- Fluentd can use up to 1000m CPU (it is a single-threaded process, so it cannot use more than one vCPU).
- Memory might also be a bottleneck, especially if increasing the CPU limit does not fix the issue.
- Usually, however, if Fluentd is running low on memory, the fluentd process will be OOM-killed.
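When a container is OOM-killed, Kubernetes records the termination reason in the pod status, so it can be confirmed directly. A minimal sketch, assuming $FLUENTPOD holds the name of one of the logging-fluentd pods and that Fluentd is the first (only) container in it:

```shell
# Read the reason the fluentd container was last terminated, if any.
reason=$(oc get pod "$FLUENTPOD" \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}')
if [ "$reason" = "OOMKilled" ]; then
  echo "fluentd was OOM-killed; the memory limit is too low"
fi
```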
- Check whether the process is being restarted with:
# oc exec $FLUENTPOD -- ps -ef
- If the PID for fluentd is in the hundreds or thousands after a couple of hours, or after a few stress tests, the process is probably restarting frequently, implying that not enough memory is allocated to the pods.
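The PID check above can be run across every Fluentd pod at once. This is a sketch, assuming the pods carry the component=fluentd label used by the default 3.x logging deployment:

```shell
# Print the fluentd PID in each logging-fluentd pod.
# A PID that keeps climbing into the hundreds or thousands
# suggests the process is being restarted frequently.
for pod in $(oc get pods -l component=fluentd -o name); do
  pid=$(oc exec "${pod#pod/}" -- ps -ef | awk '/fluentd/ {print $2; exit}')
  echo "${pod}: fluentd PID ${pid}"
done
```

Re-running this after a stress test and comparing the PIDs shows whether the process survived the load.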
- Update to the latest logging images to ensure all performance improvements are in place in the cluster. See the documentation for steps.
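Before updating, it can help to record which image the daemonset currently runs, for comparison against the latest available release. A minimal sketch:

```shell
# Show the image(s) the Fluentd daemonset is currently deployed with.
oc get daemonset logging-fluentd \
  -o jsonpath='{.spec.template.spec.containers[*].image}'
```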
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.