Satellite 6 performance issues due to puma default tuning
Environment
Red Hat Satellite 6.9 or newer
Issue
Various performance issues have been observed since upgrading Satellite to 6.9. The problems often occur during peak hours, under load that 6.8 handled smoothly. Typical problems include:
- slow responses or timeouts of client requests
- WebUI slow or not accessible
- high CPU usage (esp. of puma processes)
- possibly high memory usage of puma process(es)
Resolution
- Increase the puma tunables. Because of an installer bug, the foreman service must be restarted after the installer run. A rule of thumb for systems with 8 or more CPUs and at least 25G of free memory is the following:
satellite-installer --foreman-foreman-service-puma-threads-min=5 \
--foreman-foreman-service-puma-threads-max=5 \
--foreman-foreman-service-puma-workers=16
systemctl restart foreman.service
- For fewer than 8 CPUs or less than 25G of free memory, tune threads-min and threads-max the same way, but raise puma-workers only to the number of CPUs:
satellite-installer --foreman-foreman-service-puma-threads-min=5 \
--foreman-foreman-service-puma-threads-max=5 \
--foreman-foreman-service-puma-workers=$(nproc)
systemctl restart foreman.service
- Note that the default values (for Satellite 6.9.4 or older) are:
satellite-installer --foreman-foreman-service-puma-threads-min=0 \
--foreman-foreman-service-puma-threads-max=16 \
--foreman-foreman-service-puma-workers=2
systemctl restart foreman.service
Caution: the above tuning is known to provide much better performance than the default one. Red Hat is currently assessing various puma tuning values across tuning profiles, and the above recommendation will likely be improved further. Once that happens, this KCS will be updated accordingly.
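The sizing rule above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name recommend_puma_workers is hypothetical, MemAvailable from /proc/meminfo is used as a stand-in for "free memory", and the per-worker memory cap (about 1.6G per worker) is an extrapolation from the figure given under Root Cause, not an official formula. The script only prints a suggested value; it does not run satellite-installer.

```shell
#!/bin/bash
# Hypothetical helper illustrating the rule of thumb from this article.
# Usage: recommend_puma_workers CPUS FREE_MIB
recommend_puma_workers() {
    local cpus=$1 free_mib=$2

    # Article's rule: 8+ CPUs and at least 25G (25600 MiB) free -> 16 workers.
    if [ "$cpus" -ge 8 ] && [ "$free_mib" -ge 25600 ]; then
        echo 16
        return
    fi

    # Otherwise start from the number of CPUs, as the article recommends.
    local workers=$cpus

    # ASSUMPTION (not from the article): also cap by memory, budgeting
    # roughly 1.6G (about 1638 MiB) per worker.
    local mem_cap=$(( free_mib / 1638 ))
    if [ "$mem_cap" -lt "$workers" ]; then
        workers=$mem_cap
    fi

    # ASSUMPTION: never suggest fewer than one worker.
    if [ "$workers" -lt 1 ]; then
        workers=1
    fi
    echo "$workers"
}

# Read the current system values (Linux only).
cpus=$(nproc)
free_mib=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)
workers=$(recommend_puma_workers "$cpus" "${free_mib:-0}")

echo "Suggested: --foreman-foreman-service-puma-workers=$workers"
```

Review the printed value against the actual workload before passing it to satellite-installer, and restart foreman.service afterwards as described above.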
Root Cause
Too few puma workers are configured, which limits the throughput of requests to foreman. Based on the performance testing described in the Satellite 6.9 Performance Guide, a rule of thumb of 16 workers is worth using for any system with 8 cores or more.
Another aspect affecting the number of workers is available (or rather free) memory. One puma worker can consume about 1.6G of memory, so 16 workers need roughly 25G (16 x 1.6G = 25.6G). Therefore, on systems with less than 25G of free memory, the number of workers should be lowered from 16.
Also, increasing the minimal number of puma threads from the default 0 to 5 led to improved performance without any evident impact on memory or CPU usage. Satellite 6.9.6, scheduled for release in September/October of 2021, includes a fix for this, as outlined in the related bugzilla.
6.9.6 also includes a fix for a bug that occasionally produces "Http 502 Bad Gateway and 503" errors.
Note that there is a request to add such tuning to the Satellite installer tuning profiles.
Yet another contributing factor is a configuration/integration issue of Puma in Satellite that exhibits similar symptoms.
Diagnostic Steps
- there are just 2 "puma: cluster worker" processes, and they often exhibit high CPU usage:
$ grep puma ps ### grep inside a sosreport archive
foreman 12727 0.1 0.5 977604 511812 ? Ssl Jul14 1:18 puma 4.3.6 (tcp://127.0.0.1:3000) [foreman]
foreman 14019 39.2 1.0 1619512 1030208 ? Sl Jul14 454:28 puma: cluster worker 0: 12727 [foreman]
foreman 14024 39.5 1.0 1596236 1074012 ? Sl Jul14 457:51 puma: cluster worker 1: 12727 [foreman]
$
- /var/log/httpd/foreman-ssl_error_ssl.log is full of timeout errors like:
[Thu Jul 15 02:33:33.952830 2021] [proxy_http:error] [pid 30609] (70007)The timeout specified has expired: [client 1.2.3.4:5678] AH01102: error reading status line from remote server 127.0.0.1:3000
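The two checks above can be wrapped in small shell helpers that work on either a live system or files from a sosreport. This is a minimal sketch: the function names count_puma_workers and count_proxy_timeouts are hypothetical, and the patterns are taken from the sample output shown in this article, so they may need adjusting for other versions.

```shell
#!/bin/bash
# Hypothetical helpers; both read their input from stdin.

# Count puma cluster worker processes in "ps" output.
# The [w] bracket keeps grep from matching its own command line
# when the input comes from a live "ps aux" pipeline.
count_puma_workers() {
    grep -c 'puma: cluster [w]orker'
}

# Count proxy timeout errors (AH01102) in an httpd error log.
count_proxy_timeouts() {
    grep -c 'timeout specified has expired.*AH01102'
}

# Example usage (not executed here):
#   ps aux | count_puma_workers                                  # live system
#   count_puma_workers < ps                                      # inside a sosreport
#   count_proxy_timeouts < /var/log/httpd/foreman-ssl_error_ssl.log
```

A worker count of 2 together with a large number of timeout errors matches the symptoms described in this article.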