How to capture a vmcore of an OpenStack instance?
Environment
- Red Hat OpenStack Platform
Issue
- How to capture a vmcore of a guest/instance running on the OpenStack platform?
- What data needs to be captured from a hung OpenStack instance for root cause analysis?
Resolution
NOTES:
- The captured dump file will be slightly larger than the total memory of the instance.
- The compute host therefore needs more free disk space than the total memory of the instance whose vmcore is required.
- The instance will freeze while the dump is being captured.
- Other instances on the same node might also freeze while the dump is being captured.
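The disk space requirement in the notes above can be checked before starting. The following is a minimal sketch, not part of the documented procedure: the function name has_room_for_dump and the 10% safety margin are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the available space (KiB) exceeds
# the instance memory (KiB) plus a safety margin.
# The 10% margin is an assumption, not a documented figure.
has_room_for_dump() {
    local mem_kib=$1 avail_kib=$2
    local needed_kib=$(( mem_kib + mem_kib / 10 ))
    [ "$avail_kib" -gt "$needed_kib" ]
}

# Example usage on the compute host (not executed here). Prefix virsh with
# the container wrapper for your version where required, and use /tmp
# instead of /var/log/containers/libvirt on version 10 and below:
#   mem_kib=$(virsh dominfo instance-00000003 | awk '/^Max memory/ {print $3}')
#   avail_kib=$(df -k --output=avail /var/log/containers/libvirt | tail -n 1)
#   has_room_for_dump "$mem_kib" "$avail_kib" && echo "enough free space"
```

The function only compares two numbers, so the values can also be read manually from virsh dominfo and df and passed in by hand.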
1) Identify the instance name in virsh terminology and the host on which the affected instance is running:
[stack@undercloud-0 ~]$ source stackrc
(undercloud) [stack@undercloud-0 ~]$ openstack server show <UUID_of_instance>
....
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0.test.local |
| OS-EXT-SRV-ATTR:instance_name | instance-00000003 |
....
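When this lookup is scripted, the two fields shown above can be pulled out of the table output. This is a sketch only: server_field is a hypothetical helper name, and it assumes the default table layout of openstack server show as in the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an `openstack server show` table on stdin and
# print the value column of the row whose field name equals $1.
server_field() {
    awk -F'|' -v key="$1" '
        {
            name = $2; value = $3
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim the field name
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim the value
            if (name == key) print value
        }'
}

# Example usage (not executed here):
#   openstack server show <UUID_of_instance> | server_field 'OS-EXT-SRV-ATTR:instance_name'
#   openstack server show <UUID_of_instance> | server_field 'OS-EXT-SRV-ATTR:hypervisor_hostname'
```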
2) Identify the IP address of compute-0:
(undercloud) [stack@undercloud-0 ~]$ openstack server list | grep compute-0
| 76a5026b-6e57-4dec-b68e-22f8bcd20c48 | compute-0 | ACTIVE | ctlplane=192.168.0.47 | overcloud-full | compute|
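The control plane address can be extracted from the row above with a short filter. This is a sketch under the assumption that the Networks column uses the ctlplane=<IP> form shown in the output; ctlplane_ip is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `openstack server list` rows on stdin and print
# the IPv4 address that follows "ctlplane=" on each matching row.
ctlplane_ip() {
    sed -n 's/.*ctlplane=\([0-9.]*\).*/\1/p'
}

# Example usage (not executed here):
#   openstack server list | grep compute-0 | ctlplane_ip
```

Note that grep compute-0 also matches names such as compute-01, so the filter may print more than one address on larger deployments.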
3) SSH into the compute node on which the instance is running:
(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin@192.168.0.47
....
[heat-admin@compute-0 ~]$ sudo -i
[root@compute-0 ~]#
4) Verify that the instance is present on the compute node:
For OpenStack version 17:
# podman exec -it nova_virtqemud virsh list --all | grep -i instance-00000003
For OpenStack version 15-16:
# podman exec -it nova_libvirt virsh list --all | grep -i instance-00000003
For OpenStack version 11-14:
# docker exec -it nova_libvirt virsh list --all | grep -i instance-00000003
For OpenStack version 10 and below:
# virsh list --all | grep -i instance-00000003
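The version-specific wrappers used in this step can be folded into one helper for scripted use. This is a sketch only: virsh_prefix is a hypothetical name, the major version number is assumed to be supplied by the operator, and versions above 17 are rejected because this article does not cover them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command prefix needed to reach virsh for a
# given OpenStack major version, following the mapping listed in this article.
virsh_prefix() {
    local version=$1
    if [ "$version" -eq 17 ]; then
        echo "podman exec -it nova_virtqemud virsh"
    elif [ "$version" -ge 15 ] && [ "$version" -le 16 ]; then
        echo "podman exec -it nova_libvirt virsh"
    elif [ "$version" -ge 11 ] && [ "$version" -le 14 ]; then
        echo "docker exec -it nova_libvirt virsh"
    elif [ "$version" -le 10 ]; then
        echo "virsh"
    else
        # Versions above 17 are not described in this article.
        echo "unsupported version: $version" >&2
        return 1
    fi
}

# Example usage as root on the compute node (not executed here):
#   $(virsh_prefix 16) list --all | grep -i instance-00000003
```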
5) Capture the memory dump of the instance:
For OpenStack version 17:
# podman exec -it nova_virtqemud virsh dump instance-00000003 /var/log/libvirt/dumpfile --memory-only --verbose
For OpenStack version 15-16:
# podman exec -it nova_libvirt virsh dump instance-00000003 /var/log/libvirt/dumpfile --memory-only --verbose
For OpenStack version 11-14:
# docker exec -it nova_libvirt virsh dump instance-00000003 /var/log/libvirt/dumpfile --memory-only --verbose
For OpenStack version 10 and below:
# virsh dump instance-00000003 /tmp/dumpfile --memory-only --verbose
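6) Verify that the dump file exists on the compute host:
For OpenStack version 11 and above (the dump written to /var/log/libvirt/ inside the container appears under /var/log/containers/libvirt/ on the host):
# ls -l /var/log/containers/libvirt/
For OpenStack version 10 and below:
# ls -l /tmp/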
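Beyond listing the directory, the size of the dump can be compared with the instance memory, since the notes state that the dump is slightly larger than the total memory. This is a sketch only: dump_looks_complete is a hypothetical helper, and a passing size check does not prove that the dump is usable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the file at $1 exists and its size is at
# least $2 KiB (the instance memory in KiB).
dump_looks_complete() {
    local path=$1 mem_kib=$2
    [ -f "$path" ] || return 1
    local size_kib=$(( $(wc -c < "$path") / 1024 ))
    [ "$size_kib" -ge "$mem_kib" ]
}

# Example usage on the compute host (not executed here):
#   dump_looks_complete /var/log/containers/libvirt/dumpfile 4194304 \
#       && echo "dump size is plausible"
```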
7) Move the captured dump file from the compute host to a location from which it can be provided to Red Hat.
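Because the dump file is large, it can help to record a checksum before moving it so that the copy can be verified afterwards. This is an optional sketch and not part of the documented procedure; write_checksum is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write <file>.sha256 next to the file at $1.
# The checksum is recorded against the bare file name so it can be verified
# from whatever directory the file is later copied into.
write_checksum() {
    local path=$1
    local dir name
    dir=$(dirname "$path")
    name=$(basename "$path")
    ( cd "$dir" && sha256sum "$name" > "$name.sha256" )
}

# Example usage (not executed here):
#   write_checksum /var/log/containers/libvirt/dumpfile
# After copying both files to the destination directory:
#   sha256sum -c dumpfile.sha256
```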