Mitigation for CVE-2023-2088 on OSP16 & OSP17

Solution Verified - Updated

Environment

OpenStack deployments that use the iSCSI or FC transport protocols for volumes may be affected by this issue. Other protocols, such as RBD/Ceph, NFS, and NVMe-oF, are not affected.

Not all storage systems using iSCSI and FC are affected, because exposure depends on the Cinder driver and on behavior specific to the storage array. For example, iSCSI Cinder drivers that use a per-volume target instead of a per-host target (what we call a shared target) are not affected. Some storage arrays also send a Power-on or device reset Unit Attention event that triggers actions on the Linux kernel side that prevent the issue from happening; this signal has been seen on an HPE 3PAR FC system.

Issue

Unauthorized access to a volume can occur when an iSCSI connection that uses shared targets, or an FC connection, is severed from a host because a volume was unmapped on the storage system, and the device is later reused for another volume on the same host.

This data leak can be triggered by two different situations, as reported in CVE-2023-2088:

Accidental case
If there is a problem with network connectivity during a normal detach operation, OpenStack may fail to clean the connection up properly. Instead of force-detaching the compute node device, Nova ignores the error, since the instance has already been deleted. Due to this incomplete operation OpenStack may end up selecting the wrong multipath device when connecting another volume to an instance.

Intentional case
A regular user can create an instance with a volume, and then delete the volume attachment directly in Cinder, which neglects to notify Nova. The compute node SCSI plumbing (over iSCSI/FC) will continue trying to connect to the original host/port/LUN, not knowing the attachment has been deleted. If a subsequent volume attachment re-uses the host/port/LUN for a different instance and volume, the original instance will gain access to it once the SCSI plumbing reconnects.

Resolution

Updating an OpenStack deployment can take a long time. It requires a proper maintenance window and may also require validating the release before deployment, so operators may prefer to apply tactical configuration changes to their cloud to prevent harmful actions while they go through their standard process.

This short-term mitigation is also useful until the official Cinder fix for the attack case is available in the CDN. Refer to the Red Hat CVE page to find out more about the release status of the fix.

The mitigation that protects OSP16 and OSP17 deployments from the intentional attack case is a policy that restricts the operations normal users can run. We still recommend updating to an OSP release with the Cinder fixes whenever possible.

Note: Glance is only affected by this bug when it uses Cinder as a Glance backend. With the standard OSP configuration the quick fix proposed here works, but it will not work if Glance is changed to use the user token instead of a dedicated project to store the images.

We recommend that customers engage with Red Hat to apply this solution, as these are rough instructions.

These are the instructions:

  1. Install the files from this article on the director node, and give execution privilege to the shell scripts:

    • ensure-service-roles.sh
    • generate-cve-policy-overrides.sh
  2. Execute the ensure-service-roles.sh script.

  3. Execute the generate-cve-policy-overrides.sh script, which generates a new cve-policy-overrides.yaml file.
    NOTE: Pay particular attention to the note the script displays about merging any existing policy overrides into the cve-policy-overrides.yaml file.

  4. Re-execute the overcloud deployment command, with the cve-policy-overrides.yaml environment file appended.

    (undercloud) $ openstack overcloud deploy --templates …\
    -e <existing deployment environment files> \
    -e cve-policy-overrides.yaml
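
Steps 1 to 3 can be sketched as a small shell helper. This is a minimal sketch, not part of the scripts attached to this article: the function name prepare_cve_scripts and its directory argument are hypothetical, and it assumes both scripts have already been copied to the director node. It only marks the scripts executable and reports a missing file; running them and re-deploying remain manual steps.

```shell
#!/bin/sh
# Hypothetical helper: make the two mitigation scripts executable.
# Assumes the scripts were already copied into the directory given as $1
# (defaults to the current directory).
prepare_cve_scripts() {
    dir="${1:-.}"
    for script in ensure-service-roles.sh generate-cve-policy-overrides.sh; do
        if [ ! -f "$dir/$script" ]; then
            echo "missing: $dir/$script" >&2
            return 1
        fi
        chmod +x "$dir/$script"
    done
}

# Example usage on the director node (commented out so the sketch has no
# side effects when sourced):
#   prepare_cve_scripts "$HOME" &&
#       "$HOME/ensure-service-roles.sh" &&
#       "$HOME/generate-cve-policy-overrides.sh"
```

Failing early on a missing file avoids re-running the overcloud deployment with a policy override file that was never generated.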
    

For those wanting a better understanding of what the above steps do, here is a brief description of what is being automated via Director:

  • Ensure that the nova user has the service role, which should have already been configured by default.

  • Ensure that the glance user has the service role, which may have already been configured.

  • Deploy the Cinder policy rules that allow only services to perform attachment deletion, connection termination, and detach operations, and that prevent anyone from using force_detach (so that human operators do not unintentionally shoot themselves in the foot). The automated steps above fill in the <nova_service_uuid> placeholder:

    "is_service": "role:service or service_user_id:<nova_service_uuid>"
    "volume:attachment_delete": "rule:admin_or_owner and rule:is_service"
    "volume_extension:volume_actions:terminate_connection": "rule:admin_or_owner and rule:is_service"
    "volume_extension:volume_actions:detach": "rule:admin_or_owner and rule:is_service"
    "volume_extension:volume_admin_actions:force_detach": "!"
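
The automated steps can be approximated by hand as follows. This is a rough sketch under stated assumptions, not the content of the article's scripts: the function names are hypothetical, the service users are assumed to belong to a project named service, and the rules are printed as plain key/value lines rather than in whatever format the generated cve-policy-overrides.yaml uses.

```shell
#!/bin/sh
# Hypothetical check: succeed if user $1 holds the "service" role.
# Assumes overcloud credentials are loaded and that service users belong to
# a project named "service" (an assumption; adjust for your deployment).
user_has_service_role() {
    openstack role assignment list --user "$1" --project service \
        --names -f value -c Role | grep -qx 'service'
}

# Hypothetical renderer: print the policy rules from this article with the
# <nova_service_uuid> placeholder replaced by $1.
render_cve_policy() {
    cat <<EOF
"is_service": "role:service or service_user_id:$1"
"volume:attachment_delete": "rule:admin_or_owner and rule:is_service"
"volume_extension:volume_actions:terminate_connection": "rule:admin_or_owner and rule:is_service"
"volume_extension:volume_actions:detach": "rule:admin_or_owner and rule:is_service"
"volume_extension:volume_admin_actions:force_detach": "!"
EOF
}

# Example usage (commented out; requires a reachable overcloud):
#   user_has_service_role nova   || echo "nova lacks the service role"
#   user_has_service_role glance || echo "glance lacks the service role"
#   render_cve_policy "$(openstack user show nova -f value -c id)"
```

Checking the role assignments first matters because the rules reject any caller that does not match is_service, so a service user without the role would lose the ability to detach volumes.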
    

Attention

  • Limitations: This short-term mitigation is not as fine-grained as the code changes in Cinder, because it has no way to distinguish between dangerous and safe requests. That is why the recommendation is to update the OSP version, which not only adds an improved fix against the attack but also improves protection against the accidental case.
  • These policy changes should be rolled back once the full fix has been applied, to benefit from the finer-grained Cinder fix.
  • No hotfixes are planned for OSP releases. A new release containing the fixes will be available for OSP16.1, OSP16.2, and OSP17.x.

Code changes are necessary on Compute and Controller nodes to prevent the accidental case. Refer to the Red Hat CVE page to see when these are available on the CDN.

Root Cause

The underlying issue is the way SCSI-based transport protocols work on Linux: devices under /dev/ are not automatically removed when the connection to the storage array LUN is severed or the volumes are unexported/unmapped.
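
To see which host-side devices fall into this category, the persistent names under /dev/disk/by-path can be listed. This is a diagnostic sketch, not part of the mitigation: the function name is hypothetical, and it assumes the usual udev by-path naming, in which iSCSI entries contain -iscsi- and FC entries contain -fc-. An entry that is still listed after its volume has been unmapped on the array would be a leftover device of the kind described above.

```shell
#!/bin/sh
# Hypothetical helper: print by-path device links that use iSCSI or FC.
# $1 is the directory to scan (defaults to /dev/disk/by-path).
list_scsi_transport_devices() {
    dir="${1:-/dev/disk/by-path}"
    for link in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty, but keep
        # dangling symlinks, which are of particular interest here.
        [ -e "$link" ] || [ -L "$link" ] || continue
        case "${link##*/}" in
            *-iscsi-*|*-fc-*) printf '%s\n' "${link##*/}" ;;
        esac
    done
}

# Example usage on a compute node:
#   list_scsi_transport_devices
```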


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.