How to find errata or package artifact files under /var/lib/pulp on pulp-3?

Solution Verified - Updated

Environment

Red Hat Satellite / Capsule 6.10 or newer

Issue

When troubleshooting repository metadata or a corrupted or missing package, one needs to check the content of a repodata file (repomd.xml, updateinfo.xml, etc.) or the RPM package itself as stored on disk on the Satellite or Capsule. With pulp-2, the files were stored explicitly in the /var/lib/pulp/published/yum/master/yum_distributor/REPO/TIMESTAMP/repodata/ (or packages) directory, and were also reachable behind a symlink matching the URI path, like /var/lib/pulp/published/yum/https/repos/RedHat/Library/content/dist/rhel8/8/x86_64/baseos/os.

pulp-3 does not follow that approach and stores the content under the /var/lib/pulp/media/artifact directory. How can one identify all, or some specific, repository metadata files or a particular RPM there?

Resolution

To find metadata artifacts:

Find the filepath(s) using pulpcore-manager:

sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' pulpcore-manager shell

First import some modules/libraries by typing:

from pulp_rpm.app.models.repository import RpmDistribution, RpmPublication
from pulpcore.app.models.repository import RepositoryVersion

Now, identify the Publication object using one of the following: the publication path, the Publication UUID, or the repository UUID together with its version:

pub = RpmDistribution.objects.get(base_path='MyOrg/DEV/cv_rhel8/content/dist/rhel8/8/x86_64/appstream/os').publication
pub = RpmPublication.objects.get(pulp_id='a7996954-79fa-4334-a8a5-9fba0c2560cc')
pub = RepositoryVersion.objects.get(repository='88835194-54a7-4af4-a062-6f572edeec0b', number=1, complete=True).publication_set.first()

To find repomd.xml, updateinfo.xml, or modules.yaml:

pub.published_metadata.get(relative_path__contains='repomd.xml').contentartifact_set.first().artifact.file.path
pub.published_metadata.get(relative_path__contains='updateinfo.xml').contentartifact_set.first().artifact.file.path
pub.published_metadata.get(relative_path__contains='modules.yaml').contentartifact_set.first().artifact.file.path
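The .get() calls above raise an exception when a publication has no matching file (for example, a repository that publishes no modules.yaml). A minimal sketch of a more forgiving lookup follows; metadata_artifact_path is a hypothetical helper name, not part of Pulp, and it only uses the attributes already shown above:

```python
def metadata_artifact_path(publication, name_fragment):
    """Return the on-disk path of the first published metadata file whose
    relative_path contains name_fragment, or None when nothing matches.

    Hypothetical helper: 'publication' is expected to be the 'pub' object
    obtained earlier in the pulpcore-manager shell.
    """
    # filter().first() returns None instead of raising DoesNotExist.
    metadata = publication.published_metadata.filter(
        relative_path__contains=name_fragment
    ).first()
    if metadata is None:
        return None
    content_artifact = metadata.contentartifact_set.first()
    # Guard against a metadata unit without a stored artifact.
    if content_artifact is None or content_artifact.artifact is None:
        return None
    return content_artifact.artifact.file.path
```

After pasting the function into the same shell session, it could be called as metadata_artifact_path(pub, 'modules.yaml').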

To find all metadata files (press Enter on the empty line just after the print statement to run the loop):

>>> for pm in pub.published_metadata.all():
...   print("%s : %s" % (pm.relative_path, pm.contentartifact_set.first().artifact.file.path))
... 
repodata/cc167fbe2d28c26330ceff4b050e3f812ebeeac472f348c8ac93177150edee72-primary.xml.gz : /var/lib/pulp/media/artifact/cc/167fbe2d28c26330ceff4b050e3f812ebeeac472f348c8ac93177150edee72
repodata/2459403bb6bb9d79d6eff45e6cdfd37754e2cf8dcad49c28386d1dbdc0cbeb3f-filelists.xml.gz : /var/lib/pulp/media/artifact/24/59403bb6bb9d79d6eff45e6cdfd37754e2cf8dcad49c28386d1dbdc0cbeb3f
repodata/5350dc4ef00e813fb277db0ea2f76f8ee8b67995aad0f8666b9cd6a3435a160c-other.xml.gz : /var/lib/pulp/media/artifact/53/50dc4ef00e813fb277db0ea2f76f8ee8b67995aad0f8666b9cd6a3435a160c
repodata/d3aadb9c09d8de3d49a1bfbcd45111187ad75d6b95f80b1ed36956145fb73dd9-updateinfo.xml.gz : /var/lib/pulp/media/artifact/d3/aadb9c09d8de3d49a1bfbcd45111187ad75d6b95f80b1ed36956145fb73dd9
repodata/4bb82f8df1ac70093f3390ab63d1a68e6a273601f33965273011f2b36d6675c4-modules.yaml : /var/lib/pulp/media/artifact/4b/b82f8df1ac70093f3390ab63d1a68e6a273601f33965273011f2b36d6675c4
repodata/9ef05eb95808a147e5ea40417ac290a061a1ee1c759f67cfb95ccd6a1884448b-comps.xml : /var/lib/pulp/media/artifact/9e/f05eb95808a147e5ea40417ac290a061a1ee1c759f67cfb95ccd6a1884448b
repodata/1bb882be134b87c01a858a5a717c75bc16a2a804d58f5d50d396e2a229eea79f-992c1888-a328-4853-9422-27039fc7ab41 : /var/lib/pulp/media/artifact/1b/b882be134b87c01a858a5a717c75bc16a2a804d58f5d50d396e2a229eea79f
repodata/repomd.xml : /var/lib/pulp/media/artifact/4c/7b63570831c8207c174fe3097e37339a46d08b88d52c3ad523cd745cf363d3
>>> 
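In the listing above, each artifact path appears to be built from the file's SHA-256 checksum: the first two hex characters form the subdirectory and the remaining 62 form the file name (compare the checksum prefix of primary.xml.gz with its artifact path). Assuming that layout holds, a suspected corrupted file can be checked by recomputing its checksum. The sketch below is plain Python run outside the pulpcore-manager shell; the function names are illustrative only:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_matches_its_path(path):
    """Return True when the file content hashes to the digest encoded in
    its path, assuming the layout .../artifact/<2 hex chars>/<62 hex chars>.
    """
    artifact = Path(path)
    expected = artifact.parent.name + artifact.name
    return sha256_of_file(artifact) == expected
```

For example, artifact_matches_its_path('/var/lib/pulp/media/artifact/4c/7b63570831c8207c174fe3097e37339a46d08b88d52c3ad523cd745cf363d3') would return False if that repomd.xml artifact had been modified or truncated on disk. Reading the files typically requires running as root or as the pulp user.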

To find a published artifact of an RPM:

First, find the Publication object pub as before. Then, for a given RPM filename, execute:

>>> pub.published_artifact.filter(relative_path__contains='tcpdump-4.9.3-2.el8.x86_64.rpm').first().content_artifact.artifact.file.path
'/var/lib/pulp/media/artifact/40/b60e76df77a025a3391129cc997a536d6fe220a05d636f31507a1bd92f2363'

Or, more generally, find the artifact directly among ContentArtifact objects:

from pulpcore.app.models.content import ContentArtifact
ContentArtifact.objects.get(relative_path__contains='tcpdump-4.9.3-2.el8.x86_64.rpm').artifact.file.path
'/var/lib/pulp/media/artifact/40/b60e76df77a025a3391129cc997a536d6fe220a05d636f31507a1bd92f2363'
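The .get() call above expects exactly one match. It raises an exception when no ContentArtifact or more than one matches the filename, and the artifact can be empty when the package has not been downloaded yet (for example, with an on-demand download policy). A hedged sketch that lists every match instead follows; rpm_artifact_paths is a hypothetical helper name, not part of Pulp:

```python
def rpm_artifact_paths(content_artifacts, filename):
    """Return (relative_path, file path or None) for every ContentArtifact
    whose relative_path contains filename.

    Hypothetical helper: 'content_artifacts' is expected to be
    ContentArtifact.objects from the pulpcore-manager shell.
    """
    results = []
    for content_artifact in content_artifacts.filter(
        relative_path__contains=filename
    ):
        artifact = content_artifact.artifact
        # artifact is None when the file is not stored on disk (yet).
        path = artifact.file.path if artifact is not None else None
        results.append((content_artifact.relative_path, path))
    return results
```

In the shell, after importing ContentArtifact as shown above, this could be called as rpm_artifact_paths(ContentArtifact.objects, 'tcpdump-4.9.3-2.el8.x86_64.rpm'). An entry with None as its path indicates that no file for that package is present under /var/lib/pulp/media/artifact.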

To see more examples:

To see, for example, how to prepare a direct SQL query for the same, see this hackmd.io snippet.

To see more examples of pulpcore-manager usage, see this KCS article.

For more KB articles/solutions related to Red Hat Satellite 6.x Pulp 3.0 Issues, please refer to the Consolidated Troubleshooting Article for Red Hat Satellite 6.x Pulp 3.0-related Issues


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.