[Satellite 6] Satellite services unavailable after a failed offline backup

Solution Verified - Updated

Environment

  • Red Hat Satellite 6.x

Issue

  • After a failed offline backup of the Satellite server, Satellite services (including the web UI) are unavailable.
    The backup was started with the following command:

    # foreman-maintain backup offline --assumeyes /backups
    

Resolution

Workaround:


To restore the Satellite services after the failed backup, perform the following steps on the Satellite server:

  1. Start the Satellite services:

    # foreman-maintain service start

  2. Verify that all Satellite services have started:

    # foreman-maintain service status

  3. Disable maintenance mode:

    # foreman-maintain maintenance-mode stop

  4. Verify that maintenance mode has been disabled:

    # foreman-maintain maintenance-mode status
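
The four steps above can also be run as one sequence. The following is a minimal sketch, not a supported tool: the function name recover_after_failed_backup and the FM variable are hypothetical, and the sketch assumes that each foreman-maintain subcommand returns a non-zero exit status on failure.

```shell
# Sketch only: runs the four recovery steps in order and stops at the first failure.
# FM is a hypothetical override (useful for dry runs); it defaults to the real CLI.
FM="${FM:-foreman-maintain}"

recover_after_failed_backup() {
    # Step 1: start the Satellite services.
    "$FM" service start || { echo "service start failed" >&2; return 1; }

    # Step 2: verify that the services are running.
    # Assumption: a non-zero exit status means at least one service is not running.
    "$FM" service status || { echo "service status check failed" >&2; return 1; }

    # Step 3: disable maintenance mode.
    "$FM" maintenance-mode stop || { echo "could not disable maintenance mode" >&2; return 1; }

    # Step 4: report the resulting maintenance mode state.
    "$FM" maintenance-mode status
}
```

After sourcing the snippet as root on the Satellite server, call recover_after_failed_backup and review the output of both status commands before returning the server to use.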

For more solutions related to Red Hat Satellite 6.x backup issues, refer to the Consolidated Troubleshooting Article for Red Hat Satellite 6.x backup-related Issues.

Root Cause

  • The workflow for Red Hat Satellite offline backup is as follows:

    1. Enable maintenance mode.
    2. Stop all Satellite services.
    3. Copy config files and DB data.
    4. Start all Satellite services again.
    5. Disable maintenance mode.
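  • The issue addressed by this solution occurs when the offline backup is interrupted, or fails (for example, because the destination file system runs out of disk space), after the services have been stopped. In that case, foreman-maintain exits without performing steps 4 and 5, so the services remain stopped and maintenance mode remains enabled.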
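
Because a full destination file system is one of the triggers, a rough free-space check before starting the backup may help avoid the failure. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the estimate counts only the directories passed in, and it ignores database dumps, configuration files, and tar overhead, so the real requirement is higher.

```shell
# Sketch only: compare the size of the data directories with the free space
# on the backup destination. All function names here are hypothetical.

# Sum the on-disk size (KiB) of the given directories.
required_kb() {
    du -sk "$@" 2>/dev/null | awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Free space (KiB) on the file system that holds the given path.
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if the destination has more free space than the directories occupy.
has_enough_space() {
    local dest="$1"; shift
    local need avail
    need=$(required_kb "$@")
    avail=$(available_kb "$dest")
    echo "need ${need} KiB, available ${avail} KiB on ${dest}"
    [ "$avail" -gt "$need" ]
}
```

For example, with the paths that appear in the log excerpt of this solution:

    # has_enough_space /backups /var/lib/pulp /var/lib/mongodb && foreman-maintain backup offline --assumeyes /backups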

Diagnostic Steps

  • Check /var/log/foreman-maintain/foreman-maintain.log for entries similar to the following:

    I, [2021-02-13 19:15:43+0100 #2997]  INFO -- : --- Execution step 'Add maintenance_mode chain to iptables' finished ---       <==== Enable maintenance mode
    I, [2021-02-13 19:15:43+0100 #2997]  INFO -- : --- Execution step 'Stop applicable services' [service-stop] started ---       <==== stop all services
    .
    .
    .
    D, [2021-02-13 19:16:47+0100 #2997] DEBUG -- : Invoking tar from /var/lib/pulp
    D, [2021-02-13 19:16:47+0100 #2997] DEBUG -- : Running command tar --selinux --create --file=/backups/satellite-backup-2021-02-13-19-15-03/pulp_data.tar --exclude=var/lib/pulp/katello-export --listed-incremental=/backups/satellite-backup-2021-02-13-19-15-03/.pulp.snar --transform 's,^,var/lib/pulp/,S' -S * with stdin nil
    D, [2021-02-13 20:08:06+0100 #2997] DEBUG -- : output of the command:
    I, [2021-02-13 20:08:06+0100 #2997]  INFO -- : --- Execution step 'Backup Pulp data' finished ---
    I, [2021-02-13 20:08:06+0100 #2997]  INFO -- : --- Execution step 'Backup mongo offline' [backup-offline-mongo] started ---
    D, [2021-02-13 20:08:06+0100 #2997] DEBUG -- : Running command hostname -f with stdin nil
    D, [2021-02-13 20:08:06+0100 #2997] DEBUG -- : output of the command:
    I, [2021-02-13 20:08:06+0100 #2997]  INFO -- : Backup of Mongo DB at /var/lib/mongodb into /backups/satellite-backup-2021-02-13-19-15-03/mongo_data.tar
    D, [2021-02-13 20:08:06+0100 #2997] DEBUG -- : {:listed_incremental=>"/backups/satellite-backup-2021-02-13-19-15-03/.mongo.snar", :volume_size=>nil, :data_dir=>nil}
    D, [2021-02-13 20:08:06+0100 #2997] DEBUG -- : Invoking tar from /var/lib/mongodb
    D, [2021-02-13 20:08:06+0100 #2997] DEBUG -- : Running command tar --selinux --create --file=/backups/satellite-backup-2021-02-13-19-15-03/mongo_data.tar --exclude=mongod.lock --listed-incremental=/backups/satellite-backup-2021-02-13-19-15-03/.mongo.snar --transform 's,^,var/lib/mongodb/,S' -S * with stdin nil
    D, [2021-02-13 20:08:11+0100 #2997] DEBUG -- : output of the command:
    E, [2021-02-13 20:08:11+0100 #2997] ERROR -- : Failed executing tar --selinux --create --file=/backups/satellite-backup-2021-02-13-19-15-03/mongo_data.tar --exclude=mongod.lock --listed-incremental=/backups/satellite-backup-2021-02-13-19-15-03/.mongo.snar --transform 's,^,var/lib/mongodb/,S' -S *, exit status 2:                        <==== error creating the tarball with mongodb data
    I, [2021-02-13 20:08:11+0100 #2997]  INFO -- : --- Execution step 'Backup mongo offline' finished ---
    I, [2021-02-13 20:08:11+0100 #2997]  INFO -- : === Scenario 'Backup' finished ===
    D, [2021-02-13 20:08:11+0100 #2997] DEBUG -- : === Rescue scenario found. Executing ===
    E, [2021-02-13 20:08:14+0100 #2997] ERROR -- : The runner is already in quit state (RuntimeError)        <==== runtime error that aborts foreman-maintain
    I, [2021-02-13 20:08:14+0100 #2997]  INFO -- : foreman-maintain command finished with 1                  <==== foreman-maintain exits with status 1 (failure) without restarting the services or disabling maintenance mode
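
To locate these entries quickly in a long log, the relevant lines can be filtered with grep. The following is a minimal sketch: the function name show_backup_failure is hypothetical, and the patterns are taken from the excerpt above, so other failure modes may log different messages.

```shell
# Sketch only: print the log lines that indicate a failed foreman-maintain run.
# LOG defaults to the log file named in this solution.
LOG="${LOG:-/var/log/foreman-maintain/foreman-maintain.log}"

show_backup_failure() {
    local log="${1:-$LOG}"
    # Match ERROR entries, the rescue scenario marker, and any non-zero exit status.
    grep -E 'ERROR -- :|Rescue scenario found|foreman-maintain command finished with [1-9]' "$log"
}
```

Running show_backup_failure without arguments reads the default log file. A different file, such as a rotated log, can be passed as the first argument.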
    
