Candlepin timeout when upgrading to Red Hat Satellite 6.8

Solution Unverified - Updated

Environment

  • Red Hat Satellite 6.8

Issue

  • In large environments, the candlepin database update step of the upgrade from Satellite 6.7.z to 6.8.z can exceed the implicit timeout of 300 seconds, causing the upgrade to fail.

Resolution

Bug 1931508 has been opened to address this issue.

Workaround - In the meantime, increasing the timeout on the candlepin database update can give the upgrade enough time to complete the necessary steps.

  • If the upgrade fails, an explicit timeout can be configured in the /usr/share/foreman-installer/modules/candlepin/manifests/database/postgresql.pp file, once the upgrade has pulled down the updated version of the foreman-installer RPM that provides it. The resource is shown below with an explicit timeout of 3600 seconds (one hour) added; normally that line is absent:
    exec { 'cpdb update':
      path    => '/usr/share/candlepin:/bin',
      command => "cpdb --update \
                       --dbhost=${db_host} \
                       --dbport=${db_port} \
                       --database='${db_name}${ssl_options}' \
                       --user='${db_user}'  \
                       --password='${db_password}' \
                       >> ${log_dir}/cpdb.log \
                       2>&1 && touch /var/lib/candlepin/cpdb_update_done",
      creates => '/var/lib/candlepin/cpdb_update_done',
      timeout => 3600,
    }
  • A first failed attempt may leave a lock in the candlepin database, which must be removed before a re-run of the upgrade can succeed. Such a lock looks like this:
liquibase.exception.LockException: Could not acquire change log lock.  Currently locked by satellite-hostname  since 2/15/21 3:17 AM
    at liquibase.lockservice.StandardLockService.waitForLock(StandardLockService.java:168)
    at liquibase.Liquibase.update(Liquibase.java:189)
    at liquibase.Liquibase.update(Liquibase.java:181)
    at liquibase.integration.commandline.Main.doMigration(Main.java:880)
    at liquibase.integration.commandline.Main.main(Main.java:133)

If you encounter a lock, please open a Red Hat support case and mention this Knowledgebase solution.

For more KB articles and solutions related to Red Hat Satellite 6.x Candlepin issues, please refer to the Consolidated Troubleshooting Article for Red Hat Satellite 6.x Candlepin Issues.

Root Cause

The steps to update the candlepin database use a Puppet exec resource, whose timeout defaults to 300 seconds unless it is explicitly set to a different value or disabled by setting it to zero.

If the candlepin database update takes longer than that, the timeout is hit and the upgrade fails. Hitting this timeout during upgrades between other versions seems theoretically possible, though a long-running update appears more likely in the jump between 6.7 and 6.8.
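
To illustrate the timeout behaviour in isolation, here is a minimal sketch of a Puppet exec resource. The resource title and the sleep command are hypothetical and are not part of the Satellite installer; only the timeout attribute and its documented default of 300 seconds come from Puppet itself.

```puppet
# Hypothetical resource for illustration only; it does not exist in
# foreman-installer or in the candlepin module.
exec { 'long running example':
  path    => '/bin:/usr/bin',
  # Placeholder for any command that runs longer than five minutes.
  command => 'sleep 600',
  # If this attribute is omitted, Puppet stops the command after 300 seconds
  # and marks the resource as failed.
  # 3600 allows up to one hour; 0 would disable the timeout entirely.
  timeout => 3600,
}
```

With the timeout line removed, a command like this would be expected to fail after five minutes, which matches the failure mode described above for the cpdb update.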


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.