Using autofs in Docker containers and the "Too many levels of symbolic links" message

Problem Description

When using autofs in recent RHEL releases, the error "Too many levels of symbolic links" (ELOOP, error number 40) can occur unexpectedly when automounts are attempted.

Cause

This can occur for a number of different reasons but they all have the same underlying cause. It is due to the way the autofs kernel module checks if the last component of a path is a mount point when multiple mount namespaces are in use.

Previously it wasn't possible to check whether the last component of a path was a mount point in the current namespace, only if it was a mount point in any namespace. This, together with cloning of mounts into a new mount namespace whose mounts are "propagation private", leads to autofs incorrectly deciding a mount is already present and results in no callback being made to the automount(8) daemon to perform the mount.

The most common ways this happens are through the systemd PrivateTmp option in service units and through unshare(1). Both can create mount namespaces that are "propagation private", so mount changes made elsewhere are not propagated into these namespaces and the check in the autofs kernel module becomes unreliable.
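To see which propagation type a given mount has in the current namespace, the optional fields of /proc/self/mountinfo can be inspected. The following is a minimal sketch, not part of autofs: the function name mount_propagation is hypothetical, and it assumes the documented mountinfo layout in which optional fields such as shared:N and master:N sit between field 6 and the "-" separator. A mount with none of these tags is private.

```shell
# Print the propagation type of MOUNTPOINT as recorded in a mountinfo file.
# Usage: mount_propagation MOUNTPOINT [MOUNTINFO_FILE]
# (hypothetical helper; defaults to the current process's mount namespace)
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            # Optional fields run from field 7 up to the "-" separator.
            shared = 0; slave = 0; unbindable = 0
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/)    shared = 1
                if ($i ~ /^master:/)    slave = 1
                if ($i == "unbindable") unbindable = 1
            }
            # If several mounts are stacked on mp, the last line wins.
            if (shared && slave) result = "shared,slave"
            else if (shared)     result = "shared"
            else if (slave)      result = "slave"
            else if (unbindable) result = "unbindable"
            else                 result = "private"
        }
        END { if (result == "") exit 1; print result }
    ' "${2:-/proc/self/mountinfo}"
}
```

Running mount_propagation /test (with /test standing in for an autofs top level mount) in the root namespace and again inside a service or container namespace shows whether the second namespace can still receive mount changes. Paths containing backslashes or spaces are not handled by this sketch.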

Consequences

When this situation arises, the kernel path lookup will continually retry the mount, finally leading to an ELOOP error message.

This makes the automount unusable: it can no longer be mounted.

One example of this behaviour is when an autofs mount has been made and a service that uses systemd PrivateTmp is restarted. The mounted automount is cloned to the PrivateTmp mount namespace and is later expired in the root namespace where autofs is running. At this point, the mount can no longer be mounted in the root namespace because it appears to the kernel to be already mounted and an ELOOP error is returned.
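When investigating a case like this, it can help to know which service units enable PrivateTmp. The sketch below simply searches unit files for the setting; the function name private_tmp_units is hypothetical, the default directories are the usual RHEL unit locations, and only the common lowercase boolean spellings are matched. It does not evaluate drop-in overrides, so treat the result as a starting point rather than a definitive list.

```shell
# List unit files that enable PrivateTmp.
# Usage: private_tmp_units [DIR...]
# (hypothetical helper; searches the standard unit directories by default)
private_tmp_units() {
    [ "$#" -gt 0 ] || set -- /etc/systemd/system /usr/lib/systemd/system
    # -r: recurse, -l: print matching file names only, -E: extended regex
    grep -rlE '^[[:space:]]*PrivateTmp[[:space:]]*=[[:space:]]*(yes|true|on|1)[[:space:]]*$' "$@" 2>/dev/null
}
```

The effective value for a single unit can also be confirmed with systemctl show -p PrivateTmp followed by the unit name.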

Solution

To make autofs more resilient to cases where a mount namespace that includes autofs mounts has been cloned to a "propagation private" mount namespace, a namespace-aware mounted check has been implemented. This change is included in the RHEL-7.4 kernel from revision 3.10.0-658.el7 onwards (and in upstream kernel 4.10 and later).
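A quick way to tell whether a running kernel is at or beyond the versions named above is to compare the release string from uname -r. This is a rough sketch under stated assumptions: the function name has_ns_aware_check is hypothetical, RHEL 7 kernels are assumed to follow the 3.10.0-N.el7 naming pattern, and any other kernel is judged only by its major.minor version against 4.10. It does not account for backports to other kernel streams.

```shell
# Return 0 if the kernel release is 3.10.0-658.el7 or later (RHEL 7),
# or 4.10 or later (any other kernel); return non-zero otherwise.
# Usage: has_ns_aware_check [RELEASE]   (defaults to `uname -r`)
has_ns_aware_check() {
    rel="${1:-$(uname -r)}"
    case "$rel" in
        3.10.0-*.el7*)
            build="${rel#3.10.0-}"   # e.g. 693.el7.x86_64
            build="${build%%.*}"     # e.g. 693
            [ "$build" -ge 658 ] 2>/dev/null
            ;;
        *)
            major="${rel%%.*}"            # text before the first dot
            rest="${rel#*.}"              # text after the first dot
            minor="${rest%%[!0-9]*}"      # leading digits of the remainder
            [ "$major" -gt 4 ] 2>/dev/null ||
                { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 10 ]; } 2>/dev/null
            ;;
    esac
}
```

For example, has_ns_aware_check && echo "namespace-aware check expected" reports on the running kernel.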

This doesn't completely resolve the ELOOP problem because there are still ways this error can happen for some usage patterns.

For example, when a Docker container is passed an autofs indirect mount as a Docker volume parameter and the container mounts are "propagation private", an ELOOP error will be returned on automount attempts. That's because the container mounts are "propagation private" and don't receive automount mount changes. In this case the error is the correct response. It is not ideal, since no symlink is being followed, but it does reflect that the path lookup is in a loop.

Using autofs within Docker containers

Because of the ELOOP problem, using autofs with Docker containers can be a little confusing, so here are some examples of what should work.

Using autofs entirely within a Docker container should work, however there are a couple of things that need to be considered:

When autofs is used entirely within a Docker container, the container needs to be run with privilege (i.e. the --privileged option needs to be added to the docker run command line) to perform NFS mounts.

Bind mounting an autofs mount into a container that runs its own independent autofs daemon should not be done because it may conflict with the autofs daemon running in the originating namespace. This isn't recommended and isn't supported by Red Hat.

Running autofs in the root namespace to provide automounting for Docker containers by binding autofs top level mounts into containers with the Docker volume option should function mostly as expected for indirect mounts.

There are some cases that can cause unexpected behaviour:

The most common problem seen is when the Docker implementation doesn't set the mount propagation to what is needed. For autofs to work sensibly, Docker mount propagation needs to be set to "slave", but the default has changed often in recent Docker releases. If the Docker implementation doesn't set its mounts "propagation slave", it should be possible to work around it by appending :slave to the Docker volume option parameter (e.g. docker run -it --rm -v /test:/test:slave fedora-autofs:v1 bash).
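The propagation mode Docker actually applied to each volume of a container can be read from docker inspect, which reports a Propagation value per mount. The sketch below filters that output for mounts that would not receive automount changes. The function name non_receiving_mounts is hypothetical, and treating the shared variants as acceptable alongside slave is an assumption made here, not something this article states.

```shell
# Read "DESTINATION PROPAGATION" lines on stdin and print each destination
# whose propagation is not slave/rslave/shared/rshared.
# (hypothetical helper; destinations containing spaces are not handled)
non_receiving_mounts() {
    awk '
        NF == 0 { next }                           # skip blank lines
        $2 !~ /^r?(slave|shared)$/ { print $1 }    # empty or private modes
    '
}

# Example use, with "mycontainer" standing in for a real container name:
#   docker inspect -f '{{range .Mounts}}{{println .Destination .Propagation}}{{end}}' mycontainer \
#       | non_receiving_mounts
```

Any destination printed for a volume backed by an autofs mount is a candidate for the :slave workaround described above.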

Indirect autofs mounts appear to work as expected in cases that have been checked.

However, if Docker does not set its mounts as "propagation slave", attempting to list the top level directory of an autofs mount in a container results in a permission denied error. This appears to be due to Docker not labelling the mount point properly, causing SELinux to prevent the directory listing. The only way to work around this is to run the container with privilege (e.g. docker run --privileged -it --rm -v /test:/test:slave fedora-autofs:v1 bash).

However, even without the privilege option, automounting within the top level directory still functions as expected.

Trying to bind mount autofs direct mounts into a Docker container using the Docker volume option can result in unexpected behaviour too.

The bind mounting done by Docker causes the direct mount to be mounted when the container starts and it will stay mounted while the container is running. With the changes included in RHEL-7.4 these direct mounts will expire in the root namespace and can be automounted again on access in the root namespace. Nevertheless, this behaviour is not how direct mounts are meant to be used.
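To check which autofs mounts are visible in a namespace, and whether each is a direct or indirect mount, the super options in /proc/self/mountinfo can be examined, since autofs records the mount type there. This is a minimal sketch with a hypothetical function name, autofs_mounts, and it assumes the standard mountinfo layout (filesystem type, source and super options following the "-" separator).

```shell
# Print "MOUNTPOINT TYPE" for every autofs mount in a mountinfo file,
# where TYPE is direct, indirect or offset (or "unknown" if not listed).
# Usage: autofs_mounts [MOUNTINFO_FILE]
autofs_mounts() {
    awk '
        {
            # Find the "-" separator; the filesystem type follows it.
            sep = 0
            for (i = 7; i <= NF; i++) if ($i == "-") { sep = i; break }
            if (!sep || $(sep + 1) != "autofs") next
            kind = "unknown"
            # Super options are the third field after the separator.
            n = split($(sep + 3), opts, ",")
            for (j = 1; j <= n; j++)
                if (opts[j] == "direct" || opts[j] == "indirect" || opts[j] == "offset")
                    kind = opts[j]
            print $5, kind
        }
    ' "${1:-/proc/self/mountinfo}"
}
```

Run inside a container, this shows which autofs triggers were bound in. A direct mount that is currently mounted also has a second mountinfo entry at the same mount point with the real filesystem type.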

There is also the problem of specifying autofs direct mounts to be used within the container. The Docker volume option is impractical except where only a few direct mounts are to be used, and the Docker --volumes-from option doesn't appear to work, so a data provider container can't be used for this either.

There are probably more cases that haven't been seen yet. These are the ones that have been investigated so far, and the kernel changes made are those that best minimize undesired behaviour without adversely impacting existing behaviour.
