The virt-who service is unable to report back a hypervisor which was deleted from Satellite UI in Red Hat Satellite 6
Environment
- Red Hat Satellite 6
Issue
-
A content host is not shown under the correct hypervisor in the Satellite UI after it is migrated from another hypervisor.
-
After the affected hypervisor entry is deleted from Satellite, it remains missing from the Satellite UI even after virt-who runs successfully.
Resolution
-
Go through the Diagnostic Steps section to identify why the hypervisor data is not being updated in the UI.
-
If the following message appears in the candlepin.log file, the existing async job that it names is blocking the hypervisor update and needs to be removed.
    Unable to queue job: Hypervisor Heartbeat Update; blocked by the following existing jobs: 8af3b8eb794a03aa017a46b4b156554a
-
Before proceeding further, ensure that a good VM snapshot or backup of the Satellite server is present.
-
In this example, the job ID is 8af3b8eb794a03aa017a46b4b156554a. Look up the same job in the candlepin database.
    # su - postgres -c "psql candlepin -c \"select * from cp_async_jobs where id = '8af3b8eb794a03aa017a46b4b156554a';\""
                    id                |          created           |          updated           | version |            name             |           job_key            | job_group |       origin       | executor |   principal   |             owner_id             |            correlation_id            | previous_state | state | attempts | max_attempts | start_time | end_time | log_level | log_execution_details | job_result
    ----------------------------------+----------------------------+----------------------------+---------+-----------------------------+------------------------------+-----------+--------------------+----------+---------------+----------------------------------+--------------------------------------+----------------+-------+----------+--------------+------------+----------+-----------+-----------------------+------------
     8af3b8eb794a03aa017a46b4b156554a | 2021-06-25 22:04:44.886-07 | 2021-06-25 22:04:44.889-07 |       1 | Hypervisor Heartbeat Update | HypervisorHeartbeatUpdateJob |           | satellite6.xxx.com |          | foreman_admin | 8af3b8eb675257990167525829780001 | ba7f57fa-7ff5-4c98-a115-6effc47c3a00 |              0 |     3 |        0 |            1 |            |          |           | t                     |
    (1 row)
-
Delete that job manually from the candlepin database.
    # systemctl stop virt-who tomcat
    # su - postgres -c "psql candlepin"
    DELETE from cp_async_jobs where id = '8af3b8eb794a03aa017a46b4b156554a' ;
    select * from cp_async_jobs where id = '8af3b8eb794a03aa017a46b4b156554a' ;
    exit
    # systemctl start tomcat
    # sleep 20 && hammer ping
    # systemctl restart virt-who
-
Check the Satellite UI --> Hosts --> All Hosts page and confirm that the missing hypervisor is now reported.
-
To prevent this issue from occurring again, verify the size of the file /var/lib/candlepin/candlepin-crl.crl on the Red Hat Satellite server.
  - If the file size is in the gigabyte range, apply the workaround mentioned in Bugzilla 1927532.
-
This issue has been reported to the Engineering team and is tracked in Bugzilla 1999089. For further concerns, contact Red Hat Technical Support for assistance with the investigation.
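The size check on /var/lib/candlepin/candlepin-crl.crl described above could be scripted roughly as follows. This is only a sketch: the function name `crl_is_oversized` is hypothetical, and the 1 GiB default threshold is an assumed reading of "in GBs", not a value taken from the Bugzilla.

```shell
#!/usr/bin/env bash

# Hypothetical helper: returns 0 when FILE is at least THRESHOLD bytes,
# 1 when it is smaller, and 2 when FILE does not exist.
crl_is_oversized() {
    local file=$1
    local threshold=${2:-1073741824}   # assumed default cut-off: 1 GiB
    local size

    [ -f "$file" ] || return 2

    # wc -c prints the byte count; strip any padding some implementations add.
    size=$(wc -c < "$file" | tr -d '[:space:]')

    [ "$size" -ge "$threshold" ]
}

# Example usage on the Satellite server (left commented out on purpose):
# if crl_is_oversized /var/lib/candlepin/candlepin-crl.crl; then
#     echo "CRL file is 1 GiB or larger; review Bugzilla 1927532"
# fi
```

Returning distinct exit codes keeps "file is small" separate from "file is missing", which may matter if the check is run from a monitoring job.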
For more KB articles/solutions related to Virt-who and Virtual Datacenter (VDC) Subscriptions Issues, please refer to the Consolidated Troubleshooting Article for Virt-who and Virtual Datacenter (VDC) Subscriptions Issues
Diagnostic Steps
-
Identify whether virt-who is actually reporting the hypervisor name in the Host-Guest mapping.
    # systemctl stop virt-who
    # virt-who -p 2>/dev/null | python -m json.tool | awk -F':' '/"uuid"/{print $NF}'| tr -d '",'
-
Navigate to Monitor --> Tasks in the Satellite UI and ensure that a recent Hypervisors task was created and completed successfully.
-
Open the dynflow console of the latest Hypervisors task, go to the Run tab, expand the Actions::Katello::Host::HypervisorsUpdate step, and check whether the missing hypervisor is listed.
-
If everything looks fine so far, check for any unexpected messages about hypervisor update tasks or jobs.
    # grep -i hypervisor /var/log/candlepin/candlepin.log
    2021-07-07 12:34:22,611 [thread=http-bio-127.0.0.1-8443-exec-599] [req=72a1d5f2-3293-4ce1-87f0-b2d1bd846d66, org=, csid=6b288fa3-90b9-41c1-82d4-159cd801a431] INFO org.candlepin.common.filter.LoggingFilter - Request: verb=PUT, uri=/candlepin/hypervisors/VSP/heartbeat?reporter_id=satellite6.xxx.com-3952a16142f84bb596b8a230e26fdcbe
    2021-07-07 12:34:22,622 [thread=http-bio-127.0.0.1-8443-exec-599] [req=72a1d5f2-3293-4ce1-87f0-b2d1bd846d66, org=VSP, csid=6b288fa3-90b9-41c1-82d4-159cd801a431] INFO org.candlepin.async.JobManager - Unable to queue job: Hypervisor Heartbeat Update; blocked by the following existing jobs: 8af3b8eb794a03aa017a46b4b156554a
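To pull just the blocking job IDs out of the candlepin log instead of reading the grep output by eye, a small filter along these lines may help. It is a sketch under stated assumptions: the function name `blocking_job_ids` is hypothetical, job IDs are assumed to be 32 lowercase hexadecimal characters as in the example message, and multiple IDs on one line are assumed to be comma-separated (the source shows only a single ID).

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads candlepin log lines on stdin and prints the
# unique IDs of jobs named in "blocked by the following existing jobs" messages.
blocking_job_ids() {
    # 1. Keep only the "blocked by" lines and drop everything before the ID list.
    # 2. Split an (assumed) comma-separated list into one ID per line.
    # 3. Remove stray spaces and carriage returns.
    # 4. Keep only tokens that look like 32-character hex job IDs.
    # 5. De-duplicate, since the same message usually repeats on every heartbeat.
    sed -n 's/.*blocked by the following existing jobs: *//p' \
        | tr ',' '\n' \
        | tr -d ' \r' \
        | grep -E '^[0-9a-f]{32}$' \
        | sort -u
}

# Example usage on the Satellite server (left commented out on purpose):
# blocking_job_ids < /var/log/candlepin/candlepin.log
```

Each ID printed can then be looked up in the cp_async_jobs table as described in the Resolution section.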