Mitigation of directory traversal attack in the Python tarfile library (CVE-2007-4559)

1. Background information

Code in Red Hat Enterprise Linux (RHEL) that uses Python's tarfile module unsafely is vulnerable to CVE-2007-4559, a directory traversal attack through the TarFile.extract and TarFile.extractall methods.

When extracting a tar archive, the Python tarfile module previously used the information in the archive as-is, allowing, for example, extraction outside the destination directory using absolute paths or paths with /../ components.
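To illustrate the problem, the following minimal sketch lists archive members whose names would resolve outside a destination directory. The helper names escaping_members and build_demo_archive are hypothetical, and the check is deliberately simplified (it ignores links, for example); it is not the logic used by the tarfile module itself.

```python
import io
import os
import tarfile


def escaping_members(archive, destination):
    """Return names of members whose path would land outside destination."""
    dest = os.path.realpath(destination)
    escaping = []
    for member in archive.getmembers():
        # os.path.join discards dest when member.name is an absolute path.
        target = os.path.realpath(os.path.join(dest, member.name))
        if os.path.commonpath([dest, target]) != dest:
            escaping.append(member.name)
    return escaping


def build_demo_archive():
    """Build an in-memory archive with one safe and two unsafe member names."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in ("docs/readme.txt", "../outside.txt", "/etc/demo.conf"):
            data = b"demo"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer
```

Only the member names are inspected here; nothing is written to disk, so the sketch can be run safely against an archive before deciding how to extract it.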

The Python documentation warns against extracting untrusted archives, but this warning is not sufficient to prevent vulnerable code.

2. Upstream resolution

Mitigation of CVE-2007-4559 for Red Hat systems is based on the upstream resolution described in PEP 706. This makes the resolution compatible with planned future versions of all distributions of Python (provided by Red Hat or by third parties), and thus compatible with the wider ecosystem of Python libraries. However, Red Hat may introduce the mitigation sooner than other vendors, and backport it to older versions of Python. Current versions of third-party libraries and applications may be incompatible and may need updates or changes in configuration.

New versions of Python add a filter argument to the tarfile extraction functions. The argument allows turning tar features off for increased safety (including blocking the CVE-2007-4559 directory traversal attack).

The filter argument can take the following values:

  • 'fully_trusted' - to get the previous behavior of following the tar archive literally
  • 'tar' - to block common exploits by:
    • stripping initial slashes (/) from path names
    • failing to extract files outside the destination directory
    • clearing group/other write permissions and SUID, SGID, sticky mode bits
  • 'data' - to block more *nix-specific features. In addition to 'tar' this will:
    • fail to extract links (symbolic or hard) to absolute paths
    • fail to extract links pointing outside the destination directory
    • fail to extract device files and pipes
    • set permissions to the default for directories, and for other files:
      • set owner read and write permissions
      • clear executable permission for group and other if it is not set for the owner
      • clear write permission for group and other if it is not set for the owner
    • ignore user and group information (extracting as the current user)
  • A custom function - see PEP 706 for more information

2.1 Upstream resolution timeline

In Python 3.12 and 3.13, calling extraction functions without the new argument will emit a deprecation warning.

Starting with Python 3.14, the safer 'data' filter will become the default.

Security updates for older versions (possibly 3.7 through 3.11) will backport the filter argument, but not the deprecation warning or the default behavior change.

3. Red Hat Enterprise Linux resolution

In RHEL, the following versions of Python will:

  • Add the filter argument described above
  • Make the 'tar' filter and runtime warning the default behavior
  • Provide additional configuration options described below.

Product   Python version   Changed in
RHEL 8    python36         RHSA-2023:7151
RHEL 8    python38         RHSA-2023:7050
RHEL 8    python39         RHSA-2023:7034
RHEL 8    python3.11       RHSA-2023:7024
RHEL 9    python3.9        RHSA-2023:6659
RHEL 9    python3.11       RHSA-2023:6494

3.1 Warning when your application has been affected

In the updated Python packages, the tarfile.TarFile.extractall and tarfile.TarFile.extract methods issue the following warning by default:

The default behavior of tarfile has been changed
to disallow common exploits (including CVE-2007-4559).
By default, absolute/parent paths are disallowed and
some mode bits are cleared. See https://access.redhat.com/articles/7004769
for more details.

The shutil.unpack_archive function will emit the same warning when used on a tar archive.

To suppress the warning, use one of the approaches described in sections 3.2 and 3.3.

3.2 Specifying the behavior when calling the affected extraction functions

With the updated Python packages, when calling the tarfile.TarFile.extractall and tarfile.TarFile.extract methods or the shutil.unpack_archive function, you can explicitly set the filter parameter to one of the following strings described in the upstream resolution section: 'fully_trusted', 'tar', or 'data'.

For example:

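import tarfile

# Untrusted input: use the strictest built-in filter.
with tarfile.open("./user-uploaded-archive.tar") as archive:
    archive.extractall("/srv/uploads/1234", filter='data')

# Trusted input that relies on tar features: keep the previous behavior.
with tarfile.open("./trusted-system-backup.tar") as archive:
    archive.extractall("/mnt/provision_root/", filter='fully_trusted')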

3.3 Configuring the default behavior for existing applications

For cases where it is not feasible to modify application code, Python packages where the default behavior has been changed provide several ways to configure the default and to suppress the warning.

All of these approaches enable you to choose from the three options described in the upstream resolution section: 'fully_trusted', 'tar', and 'data'.

NOTE: When Red Hat releases Python 3.12 or later, only configuration in Python will be available. The configuration file and environment variable approaches will be available only for Python versions where the default behavior has been changed up to Python 3.11.

3.3.1. Configuration file

Create the /etc/python/tarfile.cfg file with the following content. Note that you might have to create the /etc/python/ directory first.

[tarfile]

PYTHON_TARFILE_EXTRACTION_FILTER=data

This will globally configure all the affected Python versions on your system to the selected choice: data (shown in the example), tar, or fully_trusted (see the upstream resolution for details about each value).
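
If you want to check what the file currently selects, a small script can read it. The following is a rough sketch that assumes the file uses the INI layout shown above; the function name read_configured_filter is hypothetical, and this is not necessarily how Python itself reads the file.

```python
import configparser

VALID_FILTERS = ("fully_trusted", "tar", "data")


def read_configured_filter(path="/etc/python/tarfile.cfg"):
    """Return the filter name set in the configuration file, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        return None  # the file is missing or could not be read
    value = parser.get(
        "tarfile", "PYTHON_TARFILE_EXTRACTION_FILTER", fallback=None
    )
    if value is not None and value not in VALID_FILTERS:
        raise ValueError(f"unexpected filter name in {path}: {value!r}")
    return value
```

Raising an error on an unknown value is a design choice of this sketch, intended to surface typos such as 'Data' early.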

3.3.2. Environment variable

You can configure the filter using an environment variable, for example:

PYTHON_TARFILE_EXTRACTION_FILTER='data' python your_application.py

Alternatively, you can export the variable so that it is visible to any new process in your environment:

export PYTHON_TARFILE_EXTRACTION_FILTER='data'
...
python your_application.py

You can use any of the three options: 'data' (shown in the example), 'tar', or 'fully_trusted' (see the upstream resolution for details about each value). The environment variable takes priority over the configuration file.
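
When an application is started from another Python program rather than from a shell, the variable can be set for the child process only. The following is a minimal sketch under that assumption; the helper name run_with_extraction_filter is hypothetical.

```python
import os
import subprocess


def run_with_extraction_filter(command, filter_name="data"):
    """Run command with PYTHON_TARFILE_EXTRACTION_FILTER set for the child only."""
    if filter_name not in ("fully_trusted", "tar", "data"):
        raise ValueError(f"unsupported filter name: {filter_name!r}")
    # Copy the current environment so the parent process stays unchanged.
    env = dict(os.environ)
    env["PYTHON_TARFILE_EXTRACTION_FILTER"] = filter_name
    return subprocess.run(command, env=env, check=True)
```

For example, run_with_extraction_filter(["python", "your_application.py"]) corresponds to the one-line shell form shown above.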

3.3.3. Configuration in Python

You can configure the default filter either in the code of your Python application or globally in the sitecustomize.py file in your site-packages directory. Include this code:

import tarfile

tarfile.TarFile.extraction_filter = staticmethod(tarfile.data_filter)

You can use any of the three options: tarfile.data_filter (shown in the example), tarfile.tar_filter, or tarfile.fully_trusted_filter (see the upstream resolution for details about each value). Configuration in Python takes priority over both the environment variable and the configuration file.

IMPORTANT: The tarfile.data_filter attribute is available only in the updated versions of Python. With other Python versions, the code will fail. You can write similar code in a backwards-compatible way; see PEP 706 for examples.
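
A backwards-compatible variant can be sketched along the lines of the feature-detection pattern in PEP 706: check for tarfile.data_filter before relying on it. The function names below are hypothetical, and the fallback behavior (warn, then extract without a filter) is one possible choice rather than a recommendation.

```python
import tarfile
import warnings


def set_default_filter_if_supported():
    """Make 'data' the default filter when this Python supports filters."""
    if hasattr(tarfile, "data_filter"):
        tarfile.TarFile.extraction_filter = staticmethod(tarfile.data_filter)
        return True
    return False


def extract_with_data_filter(archive, destination):
    """Extract with the 'data' filter when available, otherwise warn first."""
    if hasattr(tarfile, "data_filter"):
        archive.extractall(destination, filter="data")
    else:
        # Older Python without the fix: extraction is not filtered.
        warnings.warn("Extracting may be unsafe; consider updating Python")
        archive.extractall(destination)
```

The first function suits sitecustomize.py or application start-up code; the second suits individual call sites that must run on both updated and older Python versions.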

4. Fedora resolution

Fedora will follow the upstream resolution.
