[Satellite6] slower performance followed by bigger foreman postgres DB size due to logs table
Environment
Red Hat Satellite 6
OpenSCAP scanning features of Red Hat Satellite 6
Issue
- generic slowness of Satellite/foreman
- foreman postgres DB consuming lot of disk space
- large numbers of rows (millions+) in the Foreman 'logs' table
- optionally,
postgres: foreman foremanprocesses consume a lot of memory - further check identified
logstable in foreman database consuming majority of disk space
Resolution
Utilize the cleanup utility provided by our built in rake task. Satellite automatically deletes some reports after 7 days via a cron:
# cat /etc/cron.d/foreman | grep reports
30 7 * * * foreman /usr/sbin/foreman-rake reports:expire >>/var/log/foreman/cron.log 2>&1
You can specify the both the age of the reports you wish to preserve with the 'days' parameter as well as the type of report. By default this cleanup routine has the following settings:
- Only ConfigReport types are cleaned up automatically as found in the foreman.cron file. ConfigReports are reports coming in from Puppet and Ansible
- It will delete reports older than 7 days.
For users of OpenSCAP, all reports in the database grow without bounds. You need to add an additional report_type parameter to the expire task, which is stored as a ForemanOpenscap::ArfReport.
/usr/sbin/foreman-rake reports:expire days=7 report_type=ForemanOpenscap::ArfReport
NOTE: This will delete all OpenSCAP reports older than 7 days. It may take a considerable amount of time to delete these reports from a Satellite with history of running OpenSCAP scans. For example on a lab Satellite of ours, it took approximately 10 minutes to remove 70,000 rows from the reports table and the associated 4,000,000 records from the logs table via the above expire task. Before issuing an ArfReport cleanup you should record the number of rows by checking with a query:
# su - postgres -c "psql foreman -c \"select to_char(count(*), 'FM9,999,999,999') as logscount from logs;\""
logscount
-----------
5,714,878
(1 row)
# su - postgres -c "psql foreman -c \"select to_char(count(*), 'FM9,999,999,999') as reportscount from reports;\""
reportscount
-----------
75,341
(1 row)
WARNING: We recommend performing a backup or virtual machine snapshot of your Satellite before performing a large cleanup operation. This will ensure that if you clean up more reports than desired you can recover to your previous conditions.
Once you are happy with the age of your reports we recommend you automate this cleanup with a daily cron entry containing the updated parameters as specified above.
You can verify the decrease in reports and log entries via the 2 SQL queries above.
After cleaning large amounts of reports and logs entries we recommend a re-index and vacuum of your Postgres foreman databases:
REINDEX:
We highly recommend stopping Satellite services as a safe wrapper of the reindex. Run satellite-maintain service stop --exclude=postgresql before and satellite-maintain service start after the below commands.
# su - postgres -c "psql foreman -c \"REINDEX DATABASE foreman;\""
NOTICE: table "pg_catalog.pg_class" was reindexed
NOTICE: table "pg_catalog.pg_statistic" was reindexed
NOTICE: table "pg_catalog.pg_type" was reindexed
NOTICE: table "public.ansible_roles" was reindexed
NOTICE: table "public.ar_internal_metadata" was reindexed
NOTICE: table "pg_catalog.pg_authid" was reindexed
NOTICE: table "pg_catalog.pg_proc" was reindexed
NOTICE: table "pg_catalog.pg_user_mapping" was reindexed
...
VACUUM:
VACUUM FULL requires up to 2x the disk space as utilized by /var/lib/pgsql (on RHEL8) or /var/opt/rh/rh-postgresql/lib/pgsql* (on RHEL7) to make a copy and then clean up the space consumed, so ensure you have free space on your partition where the pgsql directory resides.
# su - postgres -c "psql foreman -c \"VACUUM FULL;\""
RESULTS: We have seen speed up operations from performing the above cleanup routines bring the time to perform an OpenSCAP scan on a host down from 2-5 minutes to 6-10 seconds.
Root Cause
logs table in foreman postgres consumed majority of DB size, which have a negative impact to generic performance of Satellite - esp. for processing new reports or searching them.
Diagnostic Steps
logs table having hundreds of thousands of entries:
# su - postgres -c "psql foreman -c \"select to_char(count(*), 'FM9,999,999,999') as count from logs;\""
count
-----------
5,714,878
(1 row)
.. and consuming a lot of space (in sosreport, search for sos_commands/foreman/*db_table_size file or a similar one):
table_name | total | index | toast | table
---------------------------------------+---------+---------+------------+---------
logs | 36 GB | 26 GB | | 9401 MB
audits | 6529 MB | 3222 MB | 2976 kB | 3304 MB
..
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.