Failed to push image with error filesystem: mkdir /registry/docker: permission denied in OpenShift 4

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Internal Image Registry
  • NFS

Issue

  • Errors like the following appear when pushing an image to the internal Image Registry:

    Error: Error copying image to the remote destination: Error writing blob: Error initiating layer upload to /v2/registry/namespace/blobs/uploads/ in default-route-openshift-image-registry.apps.example.com: received unexpected HTTP status: 500 Internal Server Error
    
  • The registry logs show messages like the following:

    level=error msg="error putting into main store: filesystem: mkdir /registry/docker: permission denied" go.version=go1.10.8 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=00000000-0000-0000-0000-000000000000 http.request.method=GET http.request.remoteaddr="10.0.1.1:56789" http.request.uri="/v2/openshift/httpd/manifests/sha256:e67868a558cfe45441e1c326de6d67596fc61b65aa183066fbadfe53c20fb415" http.request.useragent=Go-http-client/1.1 openshift.auth.user="system:serviceaccount:clustervalidation:builder"
    
    level=error msg="response completed with error" err.code=unknown err.detail="filesystem: mkdir /registry/docker: permission denied" err.message="unknown error" go.version="go1.22.5 (Red Hat 1.22.5-1.el9) X:strictfipsruntime" http.request.host=default-route-openshift-image-registry.apps.example.com http.request.id=00000000-0000-0000-0000-000000000000 http.request.method=POST http.request.remoteaddr=10.0.1.1 http.request.uri=/v2/default/httpd/blobs/uploads/ http.request.useragent="containers/5.29.5 (github.com/containers/image)" http.response.contenttype=application/json http.response.duration=15ms http.response.status=500 http.response.written=156 openshift.auth.user="kube:admin" vars.name=default/httpd
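
To confirm quickly which path the registry cannot create, the log lines can be filtered for this error. The following is a minimal sketch, not part of the original solution: the function name denied_paths is hypothetical, and the usage comment assumes the registry runs as the image-registry deployment in the openshift-image-registry namespace.

```shell
#!/bin/sh
# Read registry log lines on stdin and print each distinct path named in a
# "filesystem: mkdir <path>: permission denied" error.
# denied_paths is a hypothetical helper name used for illustration only.
denied_paths() {
    sed -n 's/.*filesystem: mkdir \([^:]*\): permission denied.*/\1/p' | sort -u
}

# Possible usage (assumes the deployment is named image-registry):
#   oc logs deployment/image-registry -n openshift-image-registry | denied_paths
```

If the only path printed is under /registry, the problem is limited to the registry storage mount, which is what the rest of this solution addresses.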
    

Resolution

The error message in the registry log, mkdir /registry/docker: permission denied, points to a permission problem at the storage level.

To change the permissions, follow these steps:

  • First, get the node where the registry is running using:

    $ oc get pods -o wide -n openshift-image-registry | grep "^image-registry"
    
  • Get the container id using:

    $ oc describe pod <image-registry-pod> -n openshift-image-registry | grep "Container ID:"
    
  • Connect to the node using:

    $ oc debug node/<worker_node>
    
  • Find the process ID (PID) of the registry, replacing the container ID in the command below with the one obtained earlier:

    # chroot /host bash
    sh-4.4# pstree -p $(ps aux | awk "/e0c9616ca108c9eb4f26fde0ba4fbb25204a5733cacbfc7879bc3d941f3bd396/ {print \$2; exit}")
    conmon(5965)-+-dockerregistry(5979)-+-{dockerregistry}(6163)
                 |                      |-{dockerregistry}(6164)
                 |                      |-{dockerregistry}(6165)
                 |                      |-{dockerregistry}(6166)
                 |                      |-{dockerregistry}(6167)
                 |                      |-{dockerregistry}(6169)
                 |                      |-{dockerregistry}(6170)
                 |                      |-{dockerregistry}(6171)
                 |                      |-{dockerregistry}(6172)
                 |                      |-{dockerregistry}(7087)
                 |                      |-{dockerregistry}(7088)
                 |                      |-{dockerregistry}(15102)
                 |                      |-{dockerregistry}(15103)
                 |                      `-{dockerregistry}(107863)
                 `-{conmon}(5967)
    
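  • Take the first occurrence of the dockerregistry process (5979 in this case), enter its namespaces, and give the group that the registry pod runs with (1000020000 in this example) ownership of and write access to /registry: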

    sh-4.4# nsenter -a -t 5979
    
    sh-4.2# ls -laZd /registry
    drwxr-xr-x. root root system_u:object_r:nfs_t:s0       /registry
    
    sh-4.2# chgrp 1000020000 /registry
    
    sh-4.2# chmod 2770 /registry
    
    sh-4.2# ls -laZd /registry
    drwxrws---. root 1000020000 system_u:object_r:nfs_t:s0       /registry
    
  • It might also be necessary to change the SELinux context if the one listed is not correct:

    sh-4.2# chcon system_u:object_r:nfs_t:s0 /registry
    
    sh-4.2# ls -laZd /registry
    drwxrws---. root 1000020000 system_u:object_r:nfs_t:s0       /registry
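
The node and container ID lookups at the start of the procedure above can also be scripted. The following is a minimal sketch rather than a supported tool: the function names bare_container_id and registry_pod_location are hypothetical, and it assumes that the registry container is the first entry in the pod's containerStatuses and that the ID is reported with a runtime prefix such as cri-o://.

```shell
#!/bin/sh
# Strip the runtime prefix (for example "cri-o://") from a container ID,
# leaving the bare ID that can be matched against the process list.
# A value without a prefix is printed unchanged.
bare_container_id() {
    printf '%s\n' "${1#*://}"
}

# Print the node name and bare container ID of a registry pod.
# Argument: the pod name. The namespace is assumed to be
# openshift-image-registry, and the registry container is assumed to be
# the first entry in containerStatuses.
registry_pod_location() {
    pod=$1
    ns=openshift-image-registry
    node=$(oc get pod "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}')
    cid=$(oc get pod "$pod" -n "$ns" -o jsonpath='{.status.containerStatuses[0].containerID}')
    printf 'node=%s container=%s\n' "$node" "$(bare_container_id "$cid")"
}

# Possible usage:
#   registry_pod_location <image-registry-pod>
```

The printed node name is the one to pass to oc debug node/<worker_node>, and the container ID is the value to search for in the process list on that node.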
    

Root Cause

In this case, the registry had never worked, and it looks like the NFS share was created with an incompatible set of permissions. Note that using NFS for the internal Image Registry is not recommended due to known issues, as explained in "Is NFS supported for OpenShift cluster internal components in Production?".

Diagnostic Steps

A closer look at the filesystem using oc rsh <registry_pod_name> sheds more light:

$ oc rsh image-registry-69db47b568-kwbsh -n openshift-image-registry
sh-4.2# ls -laZd /registry
drwxr-xr-x. root root system_u:object_r:nfs_t:s0       /registry
sh-4.2# capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=1000020000(1000020000)
gid=0(root)
groups=1000020000(???)

In this example, only the root user can write to the directory: neither the user of the registry pod (1000020000) nor its supplemental group (1000020000 in this case) has write access.
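
The same check can be expressed as a small test on the directory's group owner and mode bits. This is a minimal sketch under stated assumptions: the function name group_can_write is hypothetical, it relies on GNU stat being available where it runs, and it only inspects group ownership and the group write bit. It does not evaluate SELinux labels or NFS export options.

```shell
#!/bin/sh
# Succeed when directory $1 is owned by group ID $2 and its mode grants
# that group write access. Returns 2 if stat fails.
# group_can_write is a hypothetical helper; it assumes GNU stat (coreutils).
group_can_write() {
    dir=$1
    gid=$2
    owner_gid=$(stat -c '%g' "$dir") || return 2
    mode=$(stat -c '%a' "$dir") || return 2
    # The directory must belong to the expected group.
    [ "$owner_gid" = "$gid" ] || return 1
    # Interpret the mode as octal and test the group write bit (020).
    [ $(( (0$mode >> 3) & 2 )) -ne 0 ]
}

# Possible usage, with the group ID from the example above:
#   group_can_write /registry 1000020000 && echo "group can write"
```

With the listing shown above (root:root, mode 755), the check fails for group 1000020000. With group 1000020000 and mode 2770, as set in the Resolution section, it succeeds.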
