How to tune log rate-limiting in Red Hat Enterprise Linux 7 and later versions?

Solution Verified - Updated

Environment

  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 9
  • systemd-journald
  • rsyslog

Issue

  • We want to log all messages to syslog. What should we add or change in rsyslog.conf to avoid losing messages?

  • Why am I getting the messages below on my RHEL server?

    Apr  7 11:50:35 localhost rsyslogd-2177: imjournal: begin to drop messages due to rate-limiting
    Apr  7 11:50:47 localhost rsyslogd-2177: imjournal: 406 messages lost due to rate-limiting
    Apr  7 12:00:38 localhost rsyslogd-2177: imjournal: begin to drop messages due to rate-limiting
    Apr  7 12:00:48 localhost rsyslogd-2177: imjournal: 391 messages lost due to rate-limiting
    Apr  7 12:19:21 localhost rsyslogd-2177: imjournal: begin to drop messages due to rate-limiting
    
  • How do I disable rate limiting using rsyslog on my RHEL server?

  • I disabled rate limiting in rsyslog.conf but I'm still missing messages. I see warnings like this in the journal:

    Oct 28 02:07:57 localhost systemd-journal[1864]: Suppressed 97 messages from /
    
  • I increased rate limiting in /etc/systemd/journald.conf but I'm still missing messages. I see warnings like this in the journal:

    Oct 28 02:07:57 localhost systemd-journal[1864]: /dev/kmsg buffer overrun, some messages lost.
    

Resolution

On a RHEL system, several locations can limit or reduce event logging to /var/log/messages. One of the following three messages is displayed when logged events are rate limited or lost:

  • systemd-journal[PID]: Suppressed N messages ...
    • Adjust settings within /etc/systemd/journald.conf and restart journald. See steps 1 and 2 below.
  • imjournal: begin to drop messages due to rate-limiting
    • Adjust settings within /etc/rsyslog.conf and restart rsyslog. See steps 3 and 4 below.
  • systemd-journal[PID]: /dev/kmsg buffer overrun, some messages lost.
    • Adjust the kernel log buffer length so that it can hold additional messages in its FIFO. See step 5.

Typically, all three locations need to be tuned to allow large volumes of logged events, for example when turning on all extended I/O logging within the SCSI portion of the kernel's I/O stack or within a driver.
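
As a rough aid for deciding which of the three locations to tune first, the sketch below counts each message type in log text read from standard input. The function name classify_loss_messages and the output labels are illustrative only, not part of any Red Hat tool, and the patterns are based solely on the sample messages shown in this article.

```shell
# classify_loss_messages: hypothetical helper, not a Red Hat tool.
# Reads log lines on stdin and counts the three loss indicators listed above.
classify_loss_messages() {
    awk '
        # journald rate limiting (steps 1 and 2)
        /systemd-journal.*Suppressed [0-9]+ messages/ { journald++ }
        # rsyslog imjournal rate limiting (steps 3 and 4)
        /imjournal: .*(begin to drop messages|messages lost) due to rate-limiting/ { imjournal++ }
        # kernel log buffer overrun (step 5)
        /\/dev\/kmsg buffer overrun/ { kmsg++ }
        END {
            printf "journald_suppressed=%d\n", journald
            printf "imjournal_ratelimited=%d\n", imjournal
            printf "kmsg_overrun=%d\n", kmsg
        }'
}

# Usage (as root):
#   journalctl -b | classify_loss_messages
#   classify_loss_messages < /var/log/messages
```

A non-zero count points to the matching step; several non-zero counts suggest that more than one location needs tuning.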

  1. Modify the RateLimitInterval (RHEL 7) or RateLimitIntervalSec (RHEL 8, 9) directive and/or the RateLimitBurst directive in /etc/systemd/journald.conf
    From the journald.conf man page of RHEL 7:

    RateLimitInterval=, RateLimitBurst=
        Configures the rate limiting that is applied to all messages generated on the system. If,
        in the time interval defined by RateLimitInterval=, more messages than specified in
        RateLimitBurst= are logged by a service, all further messages within the interval are
        dropped until the interval is over. A message about the number of dropped messages is
        generated. This rate limiting is applied per-service, so that two services which log do
        not interfere with each other's limits. Defaults to 1000 messages in 30s. The time
        specification for RateLimitInterval= may be specified in the following units: "s", "min",
        "h", "ms", "us". To turn off any kind of rate limiting, set either value to 0.
     
    • Lowering the interval from the 30s default (to e.g., 15s) will allow more messages through

    • Raising the burst from the default of 1000 (to e.g., 3000) will allow more messages through

    • Disabling rate-limiting altogether is not recommended
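
    One way to apply the journald values above is sketched here. The helper name set_journald_option is hypothetical; it assumes GNU sed (for -i) and a journald.conf whose only section is [Journal], as in the stock file. Back up the file before editing it.

```shell
# set_journald_option FILE KEY VALUE: hypothetical helper.
# Replaces an existing "KEY=" line (commented out or not) with "KEY=VALUE",
# or appends "KEY=VALUE" to the end of FILE when the key is not present.
set_journald_option() {
    jo_file=$1 jo_key=$2 jo_value=$3
    if grep -Eq "^[#[:space:]]*${jo_key}=" "$jo_file"; then
        sed -i -E "s|^[#[:space:]]*${jo_key}=.*|${jo_key}=${jo_value}|" "$jo_file"
    else
        printf '%s=%s\n' "$jo_key" "$jo_value" >> "$jo_file"
    fi
}

# Usage (as root), with the example values from the bullets above:
#   cp -p /etc/systemd/journald.conf /etc/systemd/journald.conf.bak
#   set_journald_option /etc/systemd/journald.conf RateLimitInterval 15s      # RHEL 7
#   set_journald_option /etc/systemd/journald.conf RateLimitIntervalSec 15s   # RHEL 8, 9
#   set_journald_option /etc/systemd/journald.conf RateLimitBurst 3000
```

    With these example values the limit becomes 3000 messages per 15 seconds for each service, compared with the default of 1000 messages per 30 seconds.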

  2. Restart systemd-journald

    # systemctl restart systemd-journald
    
  3. If rsyslogd is reporting imjournal messages like: "imjournal: begin to drop messages due to rate-limiting", modify $imjournalRatelimitInterval and/or $imjournalRatelimitBurst in /etc/rsyslog.conf
    From /usr/share/doc/rsyslog-doc-7.4.7/imjournal.html provided by the rsyslog-doc package:

    $imjournalRatelimitInterval (legacy directive) equivalent to:
     ratelimit.interval seconds (default: 600)
     Specifies the interval in seconds onto which rate-limiting is to be applied. If more than
     ratelimit.burst messages are read during that interval, further messages up to the end of
     the interval are discarded. The number of messages discarded is emitted at the end of the
     interval (if there were any discards). Setting this to value zero turns off ratelimiting.
     Note that it is not recommended to turn off ratelimiting, except that you know for sure
     journal database entries will never be corrupted. Without ratelimiting, a corrupted systemd
     journal database may cause a kind of denial of service (we are stressing this point as
     multiple users have reported us such problems with the journal database - information
     current as of June 2013).
     
     $imjournalRatelimitBurst (legacy directive) equivalent to:
     ratelimit.burst messages (default: 20000)
     Specifies the maximum number of messages that can be emitted within the ratelimit.interval
     interval. For further information, see description there.
     
    • Lowering the interval from the 600 second default (to e.g., 300 secs) will allow more messages through

    • Raising the burst from the default of 20000 (to e.g., 30000) will allow more messages through

    • As with systemd-journald, disabling rate-limiting altogether is not recommended. Note the following:
      Warning: Some versions of journald have problems with database corruption, which leads to the journal returning the same data endlessly in a tight loop. This can result in a large number of log messages getting duplicated inside the logs managed by rsyslog. It can also lead to denial-of-service if this results in 100% CPU or memory usage.

    • There are two ways to specify these module directives, but only one configuration syntax can be used at a time. If you choose the new-style syntax, make sure to comment out the options that use the legacy syntax.

      1. Assuming rsyslog.conf hasn't been modified, simply use the legacy configuration directives
        Example excerpt:

        # File to store the position in the journal
        $IMJournalStateFile imjournal.state
        $imjournalRatelimitInterval 300
        $imjournalRatelimitBurst 30000
        
      2. Otherwise, if rsyslog.conf has been modified to use the new-style module-loading syntax, keep using that syntax
        Example excerpt:

        module(load="imjournal" StateFile="/var/lib/rsyslog/imjournal.state" ratelimit.interval="300" ratelimit.burst="30000")
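
    Because only one syntax should carry the imjournal rate-limit settings, a quick check such as the following sketch can help before restarting rsyslog. The helper name imjournal_syntax_in_use is hypothetical, and it assumes the module() statement is written on a single line as in the excerpt above; a module() statement split across several lines would not be detected.

```shell
# imjournal_syntax_in_use FILE: hypothetical helper.
# Prints "legacy", "new-style", "both", or "none" depending on which syntax
# sets the imjournal rate limits in uncommented lines of FILE.
imjournal_syntax_in_use() {
    ij_legacy=$(grep -Eic '^[[:space:]]*\$imjournalRatelimit(Interval|Burst)[[:space:]]' "$1" || true)
    ij_modern=$(grep -Eic '^[[:space:]]*module\(.*load="imjournal".*ratelimit\.(interval|burst)=' "$1" || true)
    if [ "$ij_legacy" -gt 0 ] && [ "$ij_modern" -gt 0 ]; then
        echo both
    elif [ "$ij_legacy" -gt 0 ]; then
        echo legacy
    elif [ "$ij_modern" -gt 0 ]; then
        echo new-style
    else
        echo none
    fi
}

# Usage (as root):
#   imjournal_syntax_in_use /etc/rsyslog.conf
#   rsyslogd -N1        # syntax check of the rsyslog configuration
```

    A result of "both" means one set of options should be commented out, as described above.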
        
  4. Restart rsyslog

    # systemctl restart rsyslog
    
  5. Increase the size of the kernel's internal message FIFO by adding the following to the boot line and rebooting. This helps prevent /dev/kmsg buffer overruns. When other rate limiting is eliminated or reduced, the kernel's message FIFO often needs additional space to hold the extra events before they are ultimately output.

    log_buf_len=64M
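
    A minimal sketch of adding the parameter and confirming it after the reboot, assuming grubby manages the boot entries on the system. The helper names add_kernel_arg and kernel_arg_value are hypothetical. Note that the kernel expects the log_buf_len value to be a power of two, which 64M satisfies.

```shell
# add_kernel_arg ARG: hypothetical wrapper that appends ARG to the boot line
# of every installed kernel using grubby. Requires root and a reboot.
add_kernel_arg() {
    grubby --update-kernel=ALL --args="$1"
}

# kernel_arg_value NAME [FILE]: hypothetical helper that prints the value of
# NAME=value from a kernel command line (FILE defaults to /proc/cmdline).
# Prints nothing when the parameter is absent.
kernel_arg_value() {
    ka_name=$1 ka_file=${2:-/proc/cmdline}
    tr ' ' '\n' < "$ka_file" | sed -n "s/^${ka_name}=//p" | tail -n 1
}

# Usage (as root):
#   add_kernel_arg log_buf_len=64M
#   reboot
#   kernel_arg_value log_buf_len     # run after the reboot
```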
    
  6. [RHEL 8 & 9 Only] If messages are being written to /dev/kmsg by userspace applications, the following kernel attribute may need to be changed from its default value of ratelimit to on (no rate limit applied). This change can be made on the boot line, as shown here, or through the sysctl interface, where the setting is named kernel.printk_devkmsg.

    printk.devkmsg=on 
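
    For the sysctl route, the following sketch writes a persistent setting. The helper name write_devkmsg_sysctl and the file name 90-printk-devkmsg.conf are hypothetical choices. Per the kernel documentation, a value given on the boot line takes precedence and cannot be changed through sysctl until the next reboot, so use one method or the other.

```shell
# write_devkmsg_sysctl [DIR]: hypothetical helper that writes a sysctl drop-in
# turning off rate limiting for userspace writes to /dev/kmsg.
# DIR defaults to /etc/sysctl.d; the file name is an arbitrary choice.
write_devkmsg_sysctl() {
    ds_dir=${1:-/etc/sysctl.d}
    printf 'kernel.printk_devkmsg = on\n' > "$ds_dir/90-printk-devkmsg.conf"
}

# Usage (as root):
#   sysctl kernel.printk_devkmsg          # show the current mode
#   sysctl -w kernel.printk_devkmsg=on    # change it until the next reboot
#   write_devkmsg_sysctl                  # persist the setting
#   sysctl --system                       # reload all sysctl configuration files
```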
    

Diagnostic Steps

To test, run the following as root:

  1. Execute:

    # tail -F /var/log/messages
    
  2. From a second shell, execute:

    # journalctl -f
    
  3. From a third shell, execute:

    # seq 1 3000 | logger
    
  4. Raise/lower the second number in the seq command as needed for testing
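
To put a number on how many test messages arrived, the test can be repeated with a tag and the tagged lines counted in each destination. This is only a sketch: the tag ratelimit-test and the helper name count_delivered are made up for illustration, and the pattern assumes the usual "host tag: message" or "host tag[pid]: message" line layout.

```shell
# count_delivered TAG: hypothetical helper that counts the lines on stdin
# logged under syslog tag TAG, with or without a "[pid]" suffix.
count_delivered() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that harmless.
    grep -Ec -- " $1(\[[0-9]+\])?: " || true
}

# Usage (as root); use a fresh tag for each run so earlier runs are not counted:
#   seq 1 3000 | logger -t ratelimit-test
#   journalctl -b SYSLOG_IDENTIFIER=ratelimit-test | count_delivered ratelimit-test
#   count_delivered ratelimit-test < /var/log/messages
```

If the journal count is below the number sent, journald is the limiting stage; if only the /var/log/messages count is low, look at the rsyslog imjournal settings.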

