Negative effects of the RHEL default logging setup on performance and their mitigations

This document describes two negative effects that the default logging setup in RHEL has on system performance and outlines the recommended mitigations.

Memory and storage
Systemd-journald on RHEL is configured to auto-detect where to store journal files: "volatile" (in memory) or "persistent" (on disk). The /var/log/journal directory is not created by default, so the in-memory (tmpfs) location /run/log/journal is used. By default, up to 4 GB of log data can be stored there. As a consequence, heavy journal activity can consume up to 4 GB of memory through the /run file system. In memory-constrained environments, this can be significant.

To limit the amount of in-memory logs stored in /run/log/journal, set the following in the [Journal] section of /etc/systemd/journald.conf:

    RuntimeMaxUse=500M
    RuntimeMaxFileSize=10M

These settings limit journald to at most 500 MB instead of the default of 4 GB, which keeps the journal in /run from growing to 4 GB of memory. They also significantly lower the maximum size of individual journal files, so that less historical data is deleted each time an old file is removed during rotation.

Rate-limiting
Systemd-journald maintains its own rate-limit mechanism for journal entries, separate from the one maintained by rsyslog. By default, systemd-journald applies a per-service rate limit of 1,000 entries in a 30-second period.

If one service hits the limit, other services are not affected. The rsyslog rate limit for entries read from the journal (the imjournal module) is 20,000 messages in 600 seconds (10 minutes), which is the same average rate of about 33 messages per second as the systemd-journald default. If you change the systemd-journald parameters, be sure to change the rsyslog parameters as well.

A good rule of thumb is to ensure that the rsyslog rate limits always accommodate the systemd-journald rate limits. However, the rsyslog rate limit applies to all entries read from the journal combined, not to each service separately. So even though systemd-journald allows multiple services to log at the maximum rate at the same time, their combined output can trigger rsyslog's rate limit, ultimately preventing journal data from being written to disk by rsyslog.

For those concerned about capturing sufficient logging detail from a given service during bursts, raise the allowed logging rate from the default of 1,000 entries in 30 seconds (about 33 messages per second), for example:

    RateLimitInterval=1s
    RateLimitBurst=10000

Note that this limit is applied per service, so if one service is bursty, other services are unaffected. With the default interval of 30 seconds, a service that exhausts its limit during a burst can have its messages dropped for much of the remaining interval, so valuable logs can be lost. Remember to adapt the rsyslog limits too.
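
One way to adapt the rsyslog side is through the rate-limit parameters of the imjournal module. The fragment below is a sketch, not a recommended setting. It assumes the module() configuration syntax, that imjournal is loaded from /etc/rsyslog.conf, and that at most two services burst at the journald limit shown above at the same time. The numbers are illustrative only.

    # /etc/rsyslog.conf (assumed location of the imjournal module line)
    # Ratelimit.Interval is given in seconds; Ratelimit.Burst is the number
    # of messages allowed per interval, counted across all journal entries.
    module(load="imjournal"
           Ratelimit.Interval="1"
           Ratelimit.Burst="20000")

If the file already loads imjournal, add the two parameters to the existing module() line rather than loading the module a second time. Because the rsyslog limit is shared by all services, size the burst value for the number of services expected to log heavily at once.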
