8.1 Release Notes

Red Hat Ceph Storage 8

Release notes for Red Hat Ceph Storage 8.1

Red Hat Ceph Storage Documentation Team

Abstract

These release notes describe the major features, enhancements, known issues, and bug fixes implemented for the Red Hat Ceph Storage 8.1 product release.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.

Providing feedback on Red Hat Ceph Storage documentation

We appreciate your feedback on our documentation. To help us improve, click Share Feedback, complete the form, and submit it.

Chapter 1. Introduction

Red Hat Ceph Storage is a massively scalable, open, software-defined storage platform that combines the most stable version of the Ceph storage system with a Ceph management platform, deployment utilities, and support services.

The Red Hat Ceph Storage documentation is available at https://docs.redhat.com/en/documentation/red_hat_ceph_storage/8.

Chapter 2. Acknowledgments

Red Hat Ceph Storage version 8.1 contains many contributions from the Red Hat Ceph Storage team. In addition, the Ceph project is seeing amazing growth in the quality and quantity of contributions from individuals and organizations in the Ceph community. We would like to thank all members of the Red Hat Ceph Storage team, all of the individual contributors in the Ceph community, and the contributions from organizations including, but not limited to:

  • Intel®
  • Fujitsu®
  • UnitedStack
  • Yahoo™
  • Ubuntu Kylin
  • Mellanox®
  • CERN™
  • Deutsche Telekom
  • Mirantis®
  • SanDisk™
  • SUSE®

Chapter 3. New features

This section lists all major updates, enhancements, and new features introduced in this release of Red Hat Ceph Storage.

3.1. The Cephadm utility

Added automation for the Ceph Object Gateway multi-site setup

With this enhancement, zone group host names can now be set using the Ceph Object Gateway realm bootstrap command. Set zonegroup_hostnames by using the specification file that is provided to the ceph rgw realm bootstrap command.

This feature adds another setup option through the initial specification file that is passed to the bootstrap command, instead of requiring additional steps.

Add the zonegroup_hostnames section to the spec section of the Ceph Object Gateway specification that is passed to the realm bootstrap command. When the section is added, Cephadm automatically adds these specified host names to the zone group that is defined in the specification after the Ceph Object Gateway module finishes creating the realm, zone group, or zone.

The following provides an example of the zonegroup_hostnames section to be added to the specification file:

zonegroup_hostnames:
- host1
- host2
Note

Adding the zone group host names can take a few minutes, depending on other Cephadm module workload activity at the time.
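
For context, the following sketch shows where the zonegroup_hostnames section sits within a full specification passed to the realm bootstrap command. The realm, zone group, zone, and host names are placeholders, and the surrounding fields are assumptions based on a typical Ceph Object Gateway specification:

```yaml
# Hypothetical spec for `ceph rgw realm bootstrap -i <file>`.
# All names below are illustrative placeholders.
rgw_realm: myrealm
rgw_zonegroup: myzonegroup
rgw_zone: myzone
placement:
  hosts:
    - host1
    - host2
spec:
  # Host names that Cephadm adds to the zone group automatically
  # after the realm, zone group, and zone are created.
  zonegroup_hostnames:
    - host1
    - host2
```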

Bugzilla:2241321

New automatic application of updated SSL certificates during Ceph rgw service updates

Previously, when updating SSL certificates for Ceph Object Gateway in the service specification, the changes did not take effect until the daemons were manually restarted. This manual step hindered automation and could leave services temporarily running with outdated certificates.

With this enhancement, SSL certificate updates in the Ceph Object Gateway specification automatically trigger the necessary daemon restarts as part of the service update process. As a result, the feature helps ensure that new certificates are applied immediately and improves automation and operational reliability.

Bugzilla:2344352

New ceph orch device replace HOST DEVICE_PATH command to simplify OSD device replacement

Previously, replacing a shared DB device was tedious and error-prone. Cephadm also often redeployed OSDs too quickly after destruction, before the physical device was replaced.

With this enhancement, users can now safely replace devices without race conditions or manual cleanup steps.
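
As a sketch, assuming a host named host1 and a device at /dev/sdb (both placeholders), the replacement flow might look like:

```shell
# Mark the device on host1 as scheduled for replacement.
# Cephadm destroys the affected OSD(s) and waits for the
# physical swap instead of immediately redeploying.
ceph orch device replace host1 /dev/sdb

# After the physical device is swapped in, Cephadm detects
# the new device and redeploys the OSD(s).
```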

Bugzilla:2256116

Improved core dump handling in cephadm systemd units

Previously, core dumps were not generated or were truncated when services crashed, especially in hard-to-reproduce cases, resulting in the loss of valuable debugging information.

With this enhancement, cephadm now sets LimitCORE=infinity in its systemd unit file template and configures the ProcessSizeMax and ExternalSizeMax settings for coredumpctl, provided that the mgr/cephadm/set_coredump_overrides setting is enabled. The maximum size for core dumps is controlled by the mgr/cephadm/coredump_max_size setting. As a result, services now generate complete core dumps, improving the ability to debug crash issues.
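
The settings named above can be toggled with ceph config; the size value below is illustrative only:

```shell
# Enable cephadm's coredumpctl overrides (LimitCORE=infinity,
# ProcessSizeMax, ExternalSizeMax) on managed hosts.
ceph config set mgr mgr/cephadm/set_coredump_overrides true

# Cap the maximum core dump size; the value is an example.
ceph config set mgr mgr/cephadm/coredump_max_size 32G
```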

Bugzilla:2303745

New custom logrotate configurations available for Cephadm to deploy to each host

With this enhancement, users can now set custom logrotate configurations for both cephadm.log and the daemon logs that cephadm deploys to each host.

ceph orch write-custom-logrotate TYPE -i LOGROTATE_FILE

Replace TYPE with either cephadm or cluster, depending on whether you are overwriting the logrotate file for the cephadm.log or the cluster logs. Replace LOGROTATE_FILE with the file that contains the logrotate configuration you want written out.
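
For example, to push locally edited copies of both configurations (the file names are placeholders):

```shell
# Write a custom logrotate configuration for cephadm.log
# to every host in the cluster.
ceph orch write-custom-logrotate cephadm -i ./cephadm-logrotate.conf

# Overwrite the logrotate configuration for the cluster daemon logs.
ceph orch write-custom-logrotate cluster -i ./cluster-logrotate.conf
```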

Note

Start from an existing logrotate config deployed by cephadm and then edit it from there.

The following is the default cephadm.log logrotate configuration file:

# created by cephadm
/var/log/ceph/cephadm.log {
    rotate 7
    daily
    compress
    missingok
    notifempty
    su root root
}

The following is an example of the cluster logrotate configuration file:

# created by cephadm
/var/log/ceph/eb082d44-4225-11f0-9e4b-525400eee38f/*.log {
    rotate 7
    daily
    compress
    sharedscripts
    postrotate
        killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror tcmu-runner || pkill -1 -x 'ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw|rbd-mirror|cephfs-mirror|tcmu-runner' || true
    endscript
    missingok
    notifempty
    su root root
}

Both cephadm and cluster files can be found on a host in the cluster at /etc/logrotate.d/cephadm and /etc/logrotate.d/ceph-FSID.

Note

If either of these files has been previously edited, the edited version may still exist, and cephadm does not automatically overwrite these configuration files. To overwrite these files, use the ceph orch write-custom-logrotate command.

Cephadm can regenerate the default configurations if you remove them and trigger a redeploy of a daemon on that host. For example, for host1 with the crash.host1 daemon deployed, run the following command:

ceph orch daemon redeploy crash.host1

In this example, if the two logrotate configurations are not present, cephadm writes them out with the current Ceph version defaults.

Bugzilla:2090881

New support for topological labeling on hosts

This enhancement expands cephadm’s capabilities by introducing topological key/value properties for hosts. Administrators can now group hosts by meaningful, configurable labels, enabling more efficient rolling upgrades. Instead of issuing multiple commands for each service group (for example, distinct RGW services by rack), upgrades can iterate through a list of topological labels, streamlining multi-rack operations. Additionally, these new properties open the door for enhanced RADOS read affinity by leveraging improved CRUSH location settings.

Bugzilla:2353013

3.2. Ceph Metrics

New metric allows quick detection of Ceph daemon problems

This enhancement provides the new ceph_daemon_socket_up metric for each Ceph daemon running in the same host as the ceph exporter. The ceph_daemon_socket_up metric provides the health status of a Ceph daemon based on its ability to respond through the admin socket, where a value of 1 indicates a healthy state and 0 indicates an unhealthy state. The metric serves as a tool for quickly detecting problems in any of the main Ceph daemons.

Note

This metric does not provide indicators for the ceph mgr and ceph exporter daemons.
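
As an illustration, a Prometheus alerting rule could be built on this metric. The rule name, duration, and the ceph_daemon label are assumptions, not part of this release:

```yaml
groups:
  - name: ceph-daemon-health
    rules:
      - alert: CephDaemonSocketDown   # hypothetical rule name
        # A value of 0 means the daemon is not responding
        # on its admin socket.
        expr: ceph_daemon_socket_up == 0
        for: 5m
        labels:
          severity: warning
```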

Bugzilla:2146728

3.3. Ceph Dashboard

New bucket shard count displayed

Previously, shard counts were not displayed, limiting visibility into bucket configurations.

With this enhancement, the user can see the number of shards for every bucket in the Object > Buckets list.

Bugzilla:2129325

Ceph Dashboard now supports managing Storage Classes through the UI

Previously, users could not configure or manage Storage Classes through the Dashboard. Although Life Cycle (LC) policies introduced in 8.0 allowed data tiering between Storage Classes, the UI lacked the ability to define or manage the classes themselves.

With this enhancement, users can configure and manage Storage Classes, including cloud-S3 class types, directly from the Dashboard. The enhancement also introduces templates for easier setup of common storage class configurations.

Bugzilla:2350291

KMIP is now added to the list of KMS providers under the Objects > Configuration section of the Dashboard

Previously, the Ceph dashboard supported only Vault as a KMS provider for managing encryption keys.

With this enhancement, KMIP is added to the list of KMS providers under the Objects > Configuration section of the dashboard. The dashboard now supports both Vault and KMIP as KMS providers for managing encryption keys.

Bugzilla:2305658

Ceph Dashboard now requires users to type the resource name to confirm deletion of critical resources

Previously, users could delete one or more critical resources (such as images, snapshots, subvolumes, subvolume groups, pools, hosts, OSDs, buckets, and file systems) by simply selecting a checkbox. This made accidental deletions more likely.

With this enhancement, the Dashboard prompts users to manually type the resource name in a confirmation textbox before deletion. Additionally, users can now delete only one critical resource at a time, reducing the risk of unintentional data loss.

Bugzilla:2350295

3.4. Ceph File System

cephfs-mirror daemon only transfers changed blocks in a file

Previously, the cephfs-mirror daemon transferred whole files, which was inefficient for large files.

With this enhancement, the cephfs-mirror daemon uses the blockdiff API in the MDS to only transfer changed blocks in a file. As a result, sync performance is significantly improved in some circumstances, especially for large files.

Bugzilla:2317735

Metadata and data pool names can now be used for creating the volume

With this enhancement, the ceph fs volume create command allows users to pass metadata and data pool names to be used for creating the volume. If either name is not passed, or if either specified pool is not empty, the command stops.

Bugzilla:2355686

CephFS now supports hierarchical case-insensitive or normalized directory entry naming

With this enhancement, CephFS now supports performant case-insensitive file access protocols. As a result, CephFS performance is competitive with other case-insensitive native file systems.

Bugzilla:2350186

FSCrypt encryption is now supported within user space CephFS

With this enhancement, FSCrypt encryption is supported, allowing other software stacks to enable encryption. As a result, encryption can now be enabled and used within CephFS.

Bugzilla:2358435

New support for retrieving the path of a subvolume snapshot

With this enhancement, users can now obtain the path of a subvolume snapshot by using the new ceph fs subvolume snapshot getpath command. If the snapshot does not exist, the command returns an ENOENT error.
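
For example, for a volume fs1 with subvolume sv1 and snapshot snap1 (all placeholder names):

```shell
# Print the path of the snapshot; returns ENOENT
# if the snapshot does not exist.
ceph fs subvolume snapshot getpath fs1 sv1 snap1
```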

Bugzilla:2354017

New support for disabling always-on manager modules and plugins

This enhancement allows administrators to force-disable always-on modules and plugins in the Ceph MGR. Force disabling can help prevent flooding by module commands when the corresponding Ceph service is down or degraded.

Bugzilla:2280032

quota.max_bytes is now set in more understandable size values

Previously, the quota.max_bytes value was set in bytes, resulting in very large values that were hard to set or change.

With this enhancement, the quota.max_bytes values can now be set with human-friendly values, such as K/Ki, M/Mi, G/Gi, or T/Ti. For example, 10Gi or 100K.
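
CephFS quotas are set through the ceph.quota.max_bytes extended attribute; a sketch using a placeholder mount path:

```shell
# Set a 10 GiB quota using a human-friendly size value.
setfattr -n ceph.quota.max_bytes -v 10Gi /mnt/cephfs/mydir

# Verify the quota; the stored value is reported in bytes.
getfattr -n ceph.quota.max_bytes /mnt/cephfs/mydir
```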

Bugzilla:2345288

3.5. Ceph Volume

New support for TPM 2.0 for encrypted OSDs

With this enhancement, users can now enroll a Trusted Platform Module (TPM) 2.0 token during OSD preparation to store Linux Unified Key Setup (LUKS) securely. As a result, key management is now improved by leveraging hardware-backed security.

Bugzilla:2304317

Improved stability for DB partitions

With this enhancement, users can create a dedicated DB partition, even in a colocated OSD deployment scenario. Isolating RocksDB helps improve stability and prevents fragmentation-related issues.

Bugzilla:2319755

3.6. Ceph Object Gateway

Sites can now configure Ceph Object Gateway error handling for existing bucket creation

Previously, Ceph Object Gateway (RGW) returned a success response when creating a bucket that already existed in the same zone, even if no new bucket was created. This caused confusion in automated workflows.

With this enhancement, sites can now configure RGW to return an error instead of success when attempting to create a bucket that already exists in the zone.

If the configuration option rgw_bucket_exist_override is set to true, RGW returns a 409 BucketAlreadyExists error for duplicate bucket creation requests. By default, this option is set to false.
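
As a sketch, the option can be enabled for Ceph Object Gateway daemons with ceph config:

```shell
# Return 409 BucketAlreadyExists instead of success for
# duplicate bucket creation requests in the same zone.
ceph config set client.rgw rgw_bucket_exist_override true
```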

Bugzilla:2336983

New cloud restore support for Glacier/Tape endpoints to retrieve objects

This enhancement introduces the new cloud-glacier-s3 tier-type to extend S3 endpoint support for Glacier/Tape.

For more information, see Policy Based Data Archival and Retrieval to S3 compatible platforms.

Bugzilla:2358617, Bugzilla:2345486

Dynamic bucket resharding now has the ability to reduce the number of shards

Previously, when the number of objects in a bucket decreased for an extended period of time, the number of shards was not reduced automatically.

With this enhancement, over time the number of bucket index shards for a bucket will better correspond to the number of objects in the bucket.

Bugzilla:2135354

New support for restoration of versioned objects transitioned to Cloud

With this enhancement, versioned objects can now be restored from the Cloud back into the Ceph Object Gateway cluster.

For more information, see Restoring objects from S3 cloud-tier storage.

Bugzilla:2312931

Creation dates are now added as part of user keys

With this enhancement, when keys are added to a user, a creation stamp is attached to them. As a result, keys are removed in the proper order when credentials are rotated.

Bugzilla:2316598

HeadBucket requests are now less resource intensive

Previously, all HeadBucket requests required querying all the shards to assemble statistics, which made the requests resource-intensive operations.

With this enhancement, the HeadBucket API reports the X-RGW-Bytes-Used and X-RGW-Object-Count headers only when the read-stats query string is explicitly included in the API request. As a result, HeadBucket requests are now less resource intensive, while the statistics are still returned when explicitly requested.
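
For example, using curl against a placeholder endpoint and bucket name (authentication and request signing are omitted for brevity):

```shell
# Plain HeadBucket: cheap, no usage headers returned.
curl -I "https://rgw.example.com/mybucket"

# HeadBucket with read-stats: returns X-RGW-Bytes-Used and
# X-RGW-Object-Count, at the cost of querying all shards.
curl -I "https://rgw.example.com/mybucket?read-stats"
```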

Bugzilla:2325408

A clientID can now be removed from an OpenID Connect provider registered with Ceph Object Gateway

Previously, a clientID could be added to an OpenID Connect provider, but removal was not supported.

With this enhancement, a REST API was added to remove an existing clientID from an OpenID Connect provider.

Bugzilla:2322664

Administrators can now delete bucket index entries with a missing head object

Previously, using the radosgw-admin object rm command did not remove a bucket index entry whose head object was missing. Instead of removing the entry, an error message was emitted.

With this enhancement, bucket index entries with a missing head object can now be removed with the --yes-i-really-mean-it flag.
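
For example, with placeholder bucket and object names:

```shell
# Remove a bucket index entry whose head object is missing.
radosgw-admin object rm --bucket=mybucket --object=myobject \
    --yes-i-really-mean-it
```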

Bugzilla:2341761

AssumeRoleWithWebIdentity now supports validating JWT signatures by using JWK

Previously, AssumeRoleWithWebIdentity supported JSON Web Token (JWT) signature validation using only x5c.

With this enhancement, AssumeRoleWithWebIdentity validates JWT signatures by using a JSON Web Key (JWK) with modulus and exponent (n+e). As a result, an OpenID Connect (OIDC) IdP issuing a JWK with n+e can now integrate with Ceph Object Gateway.

Bugzilla:2346769

Cloud-transitioned objects can now be restored to a selected storage class

Previously, objects transitioned to the cloud were restored only to the STANDARD storage class. This was a limitation and could affect the data usage of the cluster.

With this enhancement, the new tier-config restore-storage-class option is introduced. Administrators can now choose the storage class to which the objects are restored, providing more flexibility.

For more information, see Restoring objects from S3 cloud-tier storage.

Bugzilla:2345488

New support for PUT bucket notifications from other tenant users

With this enhancement, there is added support for cross-tenant topic management, allowing PUT bucket notifications from other tenant users. Cross-tenant management includes creating, deleting, and modifying topics.

Bugzilla:2238814

Support for user accounts through Identity and Access Management (IAM)

User accounts through IAM were previously available as a limited release. This enhancement provides full availability for new and existing customers in production environments.

With this release, Ceph Object Gateway supports user accounts as an optional feature to enable the self-service management of users, groups, and roles similar to those in AWS Identity and Access Management (IAM).

For more information, see Identity and Access Management (IAM).

3.7. RADOS

Ceph now optimizes OMAP listing at the OSD level

With this enhancement, OMAP listing at the Ceph OSD level is optimized.

Bugzilla:2307146

PG scrub performance improved by removing an unnecessary object ID repair check

Previously, every PG scrub triggered the repair_oinfo_oid() function, which addressed rare object ID mismatches caused by a historical filesystem bug. This added overhead even when the conditions did not apply.

With this enhancement, the unnecessary repair_oinfo_oid() check is removed from the scrub path, improving PG scrub performance.

Bugzilla:2356515

pg-upmap-primary mappings can now be removed from the OSDmap

With this enhancement, the new ceph osd rm-pg-upmap-primary-all command is introduced. The command allows users to clear all pg-upmap-primary mappings in the OSDmap at any time.

Use it to remove all pg-upmap-primary mappings with a single command. The command can also be used to remove any invalid mappings, when required.

Important

Use the command carefully, as it directly modifies primary PG mappings and can impact read performance.

Bugzilla:2349077

Cluster log level verbosity for external entities can now be controlled

Previously, debug verbosity logs were sent to all external logging systems regardless of their level settings. As a result, the /var/ filesystem would rapidly fill up.

With this enhancement, the new mon_cluster_log_level command option is introduced and the previous mon_cluster_log_file_level and mon_cluster_log_to_syslog_level command options have been removed.

Important

From this release, use only the new generic mon_cluster_log_level command option to control the cluster log level verbosity for the cluster log file and all external entities.
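
For example, to limit cluster log verbosity to the info level (the level value shown is illustrative):

```shell
# Controls verbosity for the cluster log file and all
# external logging entities, such as syslog.
ceph config set mon mon_cluster_log_level info
```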

Bugzilla:2320860

Ceph now reports BlueStore fragmentation through the health warning subsystem

Previously, Ceph only logged BlueStore fragmentation issues in low-visibility log entries, making them easy to overlook.

With this enhancement, Ceph surfaces fragmentation issues directly in the health status, enabling faster detection and easier troubleshooting.

Bugzilla:2350214

Advance notifications are now provided for free disk space fragmentation

Previously, when free space on the disk was significantly fragmented, searching for free space took longer and potentially impacted performance. While this did not cause immediate problems, the impact only emerged at a very late stage, when free disk space was very low.

With this enhancement, the disk allocator is queried for the current fragmentation, by using the config.bluestore_fragmentation_check_period option. The default check period is every 3600 seconds (1 hour). The fragmentation value is then emitted to the respective OSD log, on level 0. If the value exceeds the free fragmentation threshold, config.bluestore_warn_on_free_fragmentation, with a default value of 0.8, a health warning for the OSD is emitted.

As a result, fragmented free disk space is no longer a hidden risk, because warnings are emitted with advance notice.

For more information, see Health messages of a Ceph cluster.

New support for 2-site stretch cluster (stretch-mode)

This enhancement enables a two-site stretch cluster deployment, allowing users to extend Ceph’s failure domain from the OSD level to the data-center or zone level. In this configuration, OSDs and Monitors can be deployed across two data sites, while a third site (monitor-only) acts as a tie-breaker for MON quorum during site failure. This architecture enhances fault tolerance by enabling automatic failover, preventing split-brain scenarios, and supporting recovery to ensure continued cluster availability and data integrity, even during a full-site outage.

Reduced fast storage requirements with RocksDB compression enabled

With this enhancement, when RocksDB compression is enabled, Ceph Object Gateway has a reduced block.db reserved size. The reserved space requirement is reduced from 4% to 2.5%. RocksDB compression is enabled by default.

3.8. RBD Mirroring

RBD now supports mirroring between default and non-default namespaces

With this enhancement, Ceph Block Device introduces a new init-only mode for the rbd mirror pool enable command. This command provides the ability to configure a pool for mirroring and disable mirroring on the default namespace. However, mirroring can still be configured for other namespaces. This feature allows a non-default namespace in the pool to be mirrored to the default namespace in a pool of the same name in the remote cluster.

Bugzilla:2327267

New consistency group snapshot mirroring (CGSM)

Previously, disaster recovery relied on single-image mirroring between clusters. This approach supported isolated images but did not meet the needs of applications that depend on multiple volumes. For example, in a libvirt VM with several disks, each disk serves a different role. Restoring all volumes to a consistent, same-point-in-time state was challenging.

With this enhancement, consistency group mirroring in snapshot mode is now available. CGSM mirrors a group of images or volumes as a consistent set, ensuring data uniformity during recovery. The feature introduces various operations, including enabling, disabling, promoting, demoting, resyncing, snapshotting, and scheduling, which support more robust relocation, failover, and failback processes.

Bugzilla:2089305

Chapter 4. Bug fixes

This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.

4.1. The Cephadm utility

The haproxy daemon no longer fails deployment when using the haproxy_qat_support setting in the ingress specification

Previously, the haproxy_qat_support setting was present but not functional in the ingress specification. The setting was added to allow haproxy to offload encryption operations on machines with QAT hardware, intending to improve performance. The added function did not work as intended, due to an incomplete code update. If the haproxy_qat_support setting was used, the haproxy daemon failed to deploy.

With this fix, the haproxy_qat_support setting works as intended and does not fail the haproxy daemon during deployment.

Bugzilla:2308344

PROMETHEUS_API_HOST gets set during Cephadm Prometheus deployment

Previously, PROMETHEUS_API_HOST would not always get set when Cephadm initially deployed Prometheus. This issue was seen most commonly when bootstrapping a cluster with --skip-monitoring-stack and then deploying Prometheus at a later time. As a result, monitoring information could be unavailable.

With this fix, PROMETHEUS_API_HOST gets set during Cephadm Prometheus deployment and monitoring information is available, as expected.

Bugzilla:2315072

4.2. Ceph Dashboard

Creating a sub-user from the dashboard now works as expected

Previously, a Python coding error prevented the dashboard and API from creating sub-users.

With this fix, the corrected code allows the dashboard and API to create sub-users successfully.

Bugzilla:2325221

Editing RGW-related configurations from the dashboard is now supported

Previously, the dashboard relied on incorrect flag data to determine whether a configuration was editable, preventing users from modifying RGW-related settings that were editable via the CLI.

With this fix, the dashboard now allows editing all configurations starting with rgw, ensuring consistency with CLI capabilities.

Bugzilla:2308641

4.3. Ceph File System

Invalid headers during a no longer cause a segmentation fault during journal import

Previously, the cephfs-journal-tool did not check for invalid headers during a journal import operation, which could cause a segmentation fault.

With this fix, headers are checked when running the journal import command and segmentation faults no longer occur with missing headers.

Bugzilla:2303640

cephfs-data-scan during disaster recovery now completes as expected

Previously, in some cases, cephfs-data-scan ran during disaster recovery but did not recreate missing directory fragments from the backtraces, or created duplicate links. As a result, directories were inaccessible or crashed the MDS.

With this fix, cephfs-data-scan now properly recreates missing directory fragments and corrects the duplicate links, as expected.

Bugzilla:2343968

Inode invalidation operations now complete faster

Previously, an extra reference to an inode was taken in some cases that was never released. As a result, operations requiring inode invalidation were delayed until a timeout elapsed, making them very slow.

With this fix, the extra reference is avoided, allowing these operations to complete much faster without unnecessary delays.

Bugzilla:2355691

Space larger than the NFS export disk size can no longer be allocated

Previously, an empty file could be created without storage blocks being allocated, by using a command such as fallocate. These empty files could cause write operations to fail when writing to the desired file region.

With this fix, the fallocate command fails on the NFS mount point with an "Operation not supported" error and no empty files are created without storage blocks allocated.

Bugzilla:2301434

Proxy daemon logs now update immediately

Previously, log messages from the proxy daemon were buffered by the glibc library, causing delays in log file updates. As a result, in the event of a crash, some log entries could be lost, making troubleshooting and debugging more difficult.

With this fix, messages are now written directly to the log file, bypassing glibc buffering, ensuring that logs are immediately visible.

Bugzilla:2357488

Async write deadlock fixed under OSD full conditions

Previously, when asynchronous writes were ongoing and the OSD became full, the client received a notification to cancel the writes. The cancellation method and the callback invoked after the write was canceled both attempted to acquire the same lock. As a result, this led to a deadlock, causing the client to hang indefinitely during an OSD full scenario.

With this fix, the deadlock in the client code has been resolved. Consequently, asynchronous writes during an OSD full scenario no longer cause the client to hang.

Bugzilla:2291163

Expanded removexattr support for CephFS virtual extended attributes

Previously, removexattr was not supported on all appropriate Ceph virtual extended attributes, resulting in attempts to remove an extended attribute failing with a "No such attribute" error.

With this fix, support for removexattr has been expanded to cover all pertinent CephFS virtual extended attributes. You can now properly use removexattr to remove attributes. You can also remove the layout on the root inode. Removing the layout restores the configuration to the default layout.

Bugzilla:2297166

MDS and FS IDs are now verified during health warning checks for fail commands

Previously, the MDS and FS IDs were not checked when executing the ceph mds fail and ceph fs fail commands. As a result, these commands would fail with a "permission denied" error for healthy MDS or FS instances when another instance in the cluster exhibited health warnings.

With this fix, the system now validates the MDS and FS IDs during the health warning check. This change ensures that the ceph mds fail and ceph fs fail commands succeed for healthy instances, even if other MDS or FS instances in the cluster have health warnings.

Bugzilla:2328008

Error mapping now displays specific error message

Previously, an incorrect mapping of the error code to the user message resulted in a generic message being displayed. As a result, users did not see the specific details of the error encountered.

With this fix, the mapping has been corrected to show an error-specific message, ensuring that users receive detailed feedback for the error.

Bugzilla:2359598

fscrypt now decrypts long file names

Previously, the alternate name, which holds the raw encrypted version of the file name, was not provided in all decryption cases. As a result, long file names were not being decrypted correctly, and incomplete directory entry data was produced.

With this fix, the alternate name is provided during decryption, so fscrypt can now decrypt long file names properly.

Bugzilla:2362278

Snapshot names are now stored in plain text

Previously, snapshots could be created regardless of whether the fscrypt key was present. When a snapshot was created using the mgr subvolume snapshot create command without the key, the snapshot name was not encrypted during creation. As a result, subsequent attempts to decrypt the plain text name produced unreadable output.

With this fix, snapshot names are stored as plain text without encryption. This change helps ensure that snapshot names remain readable, whether the fscrypt key is present or not.

Bugzilla:2362859

4.4. Ceph Object Gateway

Multipart uploads using AWS CLI no longer cause Ceph Object Gateway to crash

Previously, during a multipart upload using AWS CLI, RGW crashed due to checksum algorithms and reporting behavior introduced in AWS S3 and AWS SDKs, specifically the new CRC64NVME checksum algorithm.

With this fix, Ceph Object Gateway safely handles the unknown checksum string. As a result, AWS CLI no longer causes multipart uploads to crash Ceph Object Gateways.

Bugzilla:2352427

Deleted objects no longer appear in the bucket listing

Previously, a race between CompleteMultipart and AbortMultipart uploads could lead to inconsistent results. As a result, objects could appear in the bucket listing, even when they were no longer present.

With this fix, a serializer is now used in AbortMultipart uploads, and properly deleted objects no longer appear in a bucket listing.

Bugzilla:2331908

Ceph Object Gateway no longer crashes during object deletion

Previously, in some cases, an uninitialized check_objv parameter variable could lead to accessing an invalid memory address in the object delete path. As a result, there was a segmentation fault.

With this fix, the check_objv parameter is always initialized and the object deletion completes as expected.

Bugzilla:2350607

Tail objects no longer wrongly deleted with copy-object

Previously, there was a reference count invariant on tail objects that was not maintained when an object was copied to itself. This caused the existing object to be changed, rather than copied. As a result, references to tail objects were being decremented. When the refcount on tail objects dropped to 0, they were deleted during the next garbage collection (GC) cycle.

With this fix, the refcount on tail objects is no longer decremented when completing a copy-to-self.

Bugzilla:2356678

AssumeRoleWithWebIdentity operations now fail as expected when incorrect thumbprints are added

Previously, due to a boolean flag being incorrectly set in the code, the AssumeRoleWithWebIdentity operation succeeded even when an incorrect thumbprint was registered in the CreateOIDCProvider call. As a result, AssumeRoleWithWebIdentity was able to succeed when it should have failed.

With this fix, the boolean flag is not set when no correct thumbprints are found registered in the CreateOIDCProvider call. As a result, if the end user does not provide a correct thumbprint in the CreateOIDCProvider call, the AssumeRoleWithWebIdentity operation now fails as expected.

Bugzilla:2324227

Ceph Object Gateway can now delete objects when RADOS is at maximum pool capacity

Previously, when a RADOS pool was near its maximum quota, the Ceph Object Gateway was not able to delete objects.

With this fix, Ceph Object Gateway can delete objects even when RADOS has reached its maximum pool threshold.

Bugzilla:2342928

User Put Object permissions are now recognized on copied buckets

Previously, bucket policies of a copy source bucket with access permissions for Put Object were not recognized on the copied bucket. As a result, when accessing the copied bucket, an Access Denied error was emitted.

With this fix, copy source bucket policies are loaded during permission evaluation of Put Object, and user access on the copied bucket is recognized, as expected.

Bugzilla:2348708

Large queries on Parquet objects no longer emit an Out of memory error

Previously, in some cases, when a query was processed on a Parquet object, that object was read in large chunks. This caused the Ceph Object Gateway to load a larger buffer into the memory, which was too big for low-end machines. The memory would especially be affected when Ceph Object Gateway was co-located with OSD processes, which consume a large amount of memory. With the Out of memory error, the OS killed the Ceph Object Gateway process.

With this fix, there is an updated limit for the reader-buffer size for reading column chunks. The default size is now 16 MB, and the size can be changed through the Ceph Object Gateway configuration file.

Bugzilla:2365146

radosgw-admin no longer crashes on non-positive values

Previously, when running a radosgw-admin bucket reshard command, using a non-positive --num-shards value, such as a zero or a negative number, would cause radosgw-admin to crash.

With this fix, the --num-shards value is checked and an error message is emitted if a non-positive value is provided. As a result, radosgw-admin reshard commands run as expected and can no longer cause a crash.
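As a sketch (the bucket name is illustrative), the reshard command now rejects non-positive shard counts instead of crashing:

```shell
# Valid reshard request (bucket name is illustrative):
radosgw-admin bucket reshard --bucket=mybucket --num-shards=31

# A zero or negative value is now rejected with an error message
# instead of crashing radosgw-admin:
radosgw-admin bucket reshard --bucket=mybucket --num-shards=0
```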

Bugzilla:2312578

Ceph Object Gateway no longer fails during signature validation

Previously, if the JSON Web Token (JWT) was not signed using the first x5c certificate, the signature validation failed.

With this fix, the correct certificate is chosen for signature validation, even if it is not the first certificate. As a result, the signature validation completes as expected.

Bugzilla:2242261

Objects are now removed as per the lifecycle rules set when bucket versioning is suspended

Previously, due to an error in the lifecycle code, the lifecycle process did not remove the objects if the bucket versioning was in the suspended state. As a result, the objects were still seen in the bucket listing.

With this fix, the lifecycle code is fixed and now the lifecycle process removes objects as per the rules set and the objects are no longer listed in the bucket listing.

Bugzilla:2319199

Multipart uploads can now add object tags

Previously, the Ceph Object Gateway S3 multipart upload object tags were not recognized when sent by the client. As a result, clients were not able to successfully apply object tags during initial object creation during a multipart upload.

With this fix, object tags are collected and stored. As a result, object tags can now be added and are recognized during multipart uploads.

Bugzilla:2323604

STS implementation now supports encryption keys larger than 1024 bytes

Previously, Ceph Object Gateway STS implementation did not support encryption keys larger than 1024 bytes.

With this fix, encryption keys larger than 1024 bytes are supported, as expected.

Bugzilla:2237854

Bucket logging configurations no longer allow setting the same source and target buckets

Previously, there was no check in place when setting a bucket logging configuration, verifying that the source and target buckets were different.

With this fix, bucket logging configuration settings are rejected when the source and destination are the same, as expected.
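A minimal sketch with the AWS CLI (the bucket names and endpoint URL are illustrative); pointing a bucket's logging configuration at itself is now rejected:

```shell
# Accepted: source and target buckets differ (names are illustrative):
aws --endpoint-url http://rgw.example.com:8080 s3api put-bucket-logging \
  --bucket source-bucket \
  --bucket-logging-status '{"LoggingEnabled":{"TargetBucket":"log-bucket","TargetPrefix":"logs/"}}'

# Rejected: source and target are the same bucket:
aws --endpoint-url http://rgw.example.com:8080 s3api put-bucket-logging \
  --bucket source-bucket \
  --bucket-logging-status '{"LoggingEnabled":{"TargetBucket":"source-bucket","TargetPrefix":"logs/"}}'
```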

Bugzilla:2321568

Ceph Object Gateway no longer crashes due to mishandled Kafka error messages

Previously, error conditions with the Kafka message broker were not handled correctly. As a result, in some cases, Ceph Object Gateway would crash.

With this fix, Kafka error messages are handled correctly and do not cause Ceph Object Gateway crashes.

Bugzilla:2327774, Bugzilla:2343980

ACL bucket operations now work as expected

Previously, a local variable 'uri' shadowed a member variable with the same name. As a result, a subset of bucket ACL operations would fail.

With this fix, the duplicate local variable has been removed, and bucket ACL operations now work as expected.

Bugzilla:2338149

Target buckets now need a bucket policy for users to write logs to them

Previously, no permission checks were run on the target bucket for bucket logging. As a result, any user could write logs to a target bucket, without needing specific permissions.

With this fix, a bucket policy must be added on a target to allow specific users to write logs to them.

Bugzilla:2345305

S3 requests are no longer rejected when local is listed before external in the authentication order

Previously, S3 requests were rejected when a request was not authenticated successfully by the local authentication engine. As a result, S3 requests using OpenStack Keystone EC2 credentials failed to authenticate with Ceph Object Gateway when the authentication order had local before external.

With this fix, S3 requests signed using OpenStack Keystone EC2 credentials successfully authenticate with Ceph Object Gateway, even when the authentication order has local listed before external.

Bugzilla:2316975

Ceph Object Gateway internal HTTP headers are no longer sent while transitioning the object to Cloud

Previously, some Ceph Object Gateway internal HTTP header values were sent to the Cloud endpoint when transitioning the object to Cloud. As a result, some S3 cloud services did not recognize the headers and failed the transition or restore operations for the objects.

With this fix, internal HTTP headers are not sent to Cloud and transitioning to Cloud works as expected.

Bugzilla:2344731

The radosgw-admin bucket logging flush command now works as expected

Previously, running the radosgw-admin bucket logging flush command returned the next log object name rather than the flushed one. As a result, the user did not know the name of the log object that was flushed without listing the log bucket.

With this fix, the correct name of the object that was flushed is now returned as expected.

Bugzilla:2344993

Upgraded clusters now fetch notification_v2 topics correctly

Previously, when a cluster was upgraded, bucket notifications were migrated to notification_v2, but the topics in notification_v2 were not retrieved as expected.

With this fix, notification_v2 topics are retrieved as expected after a cluster upgrade.

Bugzilla:2355272

olh get now completes as expected

Previously, a 2023 fix for a versioning-related bug caused an internal code path to reference an incorrect attribute name for the object logical head (OLH). As a result, an error was emitted when running the radosgw-admin olh get command.

With this fix, the internal attribute name has been corrected, ensuring proper functionality.

Bugzilla:2338402

Swift container listings now report object last modified time

Previously, the Ceph Object Gateway Swift container listing implementation was missing the logic to send the last_modified JSON field. As a result, Swift container listings did not report the last modified time of objects.

With this fix, the last_modified JSON field has been added to the Swift container listing response, ensuring that object modification times are correctly reported.

Bugzilla:2343732

Ceph Object Gateway now recognizes additional checksums from their checksum-type specific headers and trailers

Previously, the aws-sdk-go-v2 checksum behavior differed from other SDKs, as it did not send either x-amz-checksum-algorithm or x-amz-sdk-checksum and never included x-amz-decoded-content-length, despite AWS documentation requiring it. As a result, additional checksums were not recognized when sent, and some AWS-chunked requests failed an assertion check for decoded content length with an InvalidArgument error.

With this fix, Ceph Object Gateway can now recognize additional checksums from their checksum-type specific header or trailer. Ceph Object Gateway no longer tests and asserts for decoded content length, as it is unnecessary due to chunk signature calculations.

Bugzilla:2367319

Shadow users for the AssumeRoleWithWebIdentity call are now created within the oidc namespace

Previously, an incorrect method was used to load the bucket stats, which caused the shadow users for AssumeRoleWithWebIdentity call to not be created within the oidc namespace. As a result, users were not able to differentiate between the shadow users and local rgw users.

With this fix, bucket stats are now loaded correctly and the user is correctly created within the oidc namespace. Users can now correctly identify a shadow user that corresponds to a federated user making the AssumeRoleWithWebIdentity call.

Bugzilla:2346829

4.5. Multi-site Ceph Object Gateway

Replicating metadata from earlier versions of Red Hat Ceph Storage no longer renders user access keys as “inactive”

Previously, when a secondary zone running Red Hat Ceph Storage 8.0 replicated user metadata from a pre-8.0 metadata master zone, the access keys of those users were erroneously marked as "inactive". Inactive keys cannot be used to authenticate requests, so those users are denied access to the secondary zone.

With this fix, secondary zone storage replication works as expected and access keys can still authenticate requests.

Bugzilla:2327402

Invalid URL-encoded text from the client no longer creates errors

Previously, the system improperly handled scenarios where URL decoding resulted in an empty key.name, caused by invalid URL-encoded text from the client. As a result, an assertion error occurred during the copy operation and sometimes led to a crash later.

With this fix, invalid empty key.name values are now ignored, and copy operations no longer trigger assertions or cause crashes.

Bugzilla:2356922

Network error code is now mapped correctly

Previously, when one or some of the Ceph Object Gateways in a target zone were down, the HTTP client in the Ceph Object Gateway in the source zone did not map the network connection error code correctly internally. As a result, the client kept attempting to connect to a downed Ceph Object Gateway instead of falling back to other active ones.

With this fix, the network error code is now mapped correctly. The HTTP client in the source zone detects the network error and fails over to communicate with the functioning Ceph Object Gateways in the target zone.

Bugzilla:2275856

sync error trim now runs as expected with optional --shard-id input

Previously, the sync error trim command did not mark the --shard-id option as optional.

With this fix, the --shard-id option is recognized as optional and is marked as optional in the radosgw-admin help.
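For example, both invocations are now valid; the shard ID is only needed to trim a single shard:

```shell
# Trim sync errors across all shards (--shard-id is now optional):
radosgw-admin sync error trim

# Trim a single shard only (shard number is illustrative):
radosgw-admin sync error trim --shard-id=5
```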

Bugzilla:2282369

Objects restored from Cloud/Tape now synchronize correctly to remote locations

Previously, objects restored from Cloud/Tape retained their original mtime, making it insufficient for multisite sync checks. As a result, these restored objects were not synchronized to remote locations.

With this fix, a new extended attribute, internal_mtime, is introduced specifically for multisite usage, ensuring that restored objects are synchronized to remote locations when needed.

Bugzilla:2309701

Sync rate now works as expected

Previously, in some cases, an incorrect internal error return caused sync operations to run slower than expected.

With this fix, the error return was fixed and the expected sync rate is sustained.

Bugzilla:2317153

4.6. RADOS

rgw daemons no longer crash due to stack overflows

Previously, large clusters of over 10,000 daemons with a stack-based allocation of variable-length arrays caused a stack overflow. As a result, the Ceph Object Gateway daemons crashed.

With this fix, stack-based allocation of variable-length arrays is no longer used and stack overflows are avoided, with Ceph Object Gateway daemons working as expected.

Bugzilla:2346896

Pool removals now remove pg_upmap_primary mappings from the OSDMap

Previously, deleting pools did not remove pg_upmap_primary mappings from the OSDMap. As a result, pg_upmap_primary mappings were still visible but could not be removed, since the pool and pgid no longer existed.

With this fix, pg_upmap_primary mappings are now automatically removed from the OSDMap each time a pool is deleted.

Bugzilla:2293847

Destroyed OSDs are no longer listed by the ceph node ls command

Previously, destroyed OSDs were listed without any indication of their status, leading to user confusion and causing cephadm to incorrectly report them as stray.

With this fix, the command filters out destroyed OSDs by checking their status before displaying them, ensuring accurate and reliable output.

Bugzilla:2269003

AVX512 support for the ISA-L erasure code plugin is now enabled

Previously, due to an issue in the build scripts, the plugin did not take advantage of the AVX512 instruction set—even on CPUs that supported it—resulting in reduced performance.

With this fix, the build scripts correctly enable AVX512 support, allowing the plugin to utilize the available CPU capabilities for improved performance.

Bugzilla:2310433

OSDs no longer crash while replaying BlueFS with ceph_assert(delta.offset == fnode.allocated)

Previously, a fix implemented to prevent RocksDB's SST files from expanding contained a bug. As a result, the BlueFS log became corrupted, causing an error that prevented OSD bootup, even though the error could have been safely ignored.

With this fix, a flag skips the error, and BlueFS is updated to prevent the error from occurring. Now, the original fix for preventing RocksDB disk space overbloat functions as intended.

Bugzilla:2338097

BlueFS log no longer gets corrupted due to race conditions

Previously, a rare condition between truncate and unlink operations in BlueFS caused the log to reference deleted files. This corrupted the BlueFS log, triggering an assertion failure during OSD startup and resulting in a crash loop.

With this fix, the operations are now correctly sequenced using proper locking, preventing log corruption and eliminating the assertion failure.

Bugzilla:2354192

Chapter 5. Technology Previews

This section provides an overview of Technology Preview features introduced or updated in this release of Red Hat Ceph Storage.

Important

Technology Preview features are not supported with Red Hat production service level agreements (SLAs), might not be functionally complete, and Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information on Red Hat Technology Preview features support scope, see https://access.redhat.com/support/offerings/techpreview/.

5.1. The Cephadm utility

New Ceph Management gateway and the OAuth2 Proxy service for unified access and high availability

With this enhancement, the Ceph Dashboard introduces the Ceph Management gateway (mgmt-gateway) and the OAuth2 Proxy service (oauth2-proxy). With the Ceph Management gateway (mgmt-gateway) and the OAuth2 Proxy (oauth2-proxy) in place, nginx automatically directs the user through the oauth2-proxy to the configured Identity Provider (IdP), when single sign-on (SSO) is configured.
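A minimal deployment sketch using cephadm service specifications; the host name, IdP URL, client ID, and secret are placeholder values, and the oauth2-proxy spec fields shown are assumptions to adapt to your cephadm version:

```shell
# Write service specs for the management gateway and the OAuth2 proxy.
# All IdP details below are placeholders, not real credentials.
cat > mgmt-sso.yaml <<'EOF'
service_type: mgmt-gateway
placement:
  hosts:
    - host1
---
service_type: oauth2-proxy
placement:
  hosts:
    - host1
spec:
  provider_display_name: "My IdP"   # assumed field names; adjust to
  client_id: my-client-id           # match your cephadm version
  client_secret: my-client-secret
  oidc_issuer_url: https://idp.example.com/realms/ceph
EOF

# Apply both specs with the orchestrator:
ceph orch apply -i mgmt-sso.yaml
```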

Bugzilla:2298666

New Cephadm certificate lifecycle management for improved Ceph cluster security

With this enhancement, Cephadm now has certificate lifecycle management in the certmgr subsystem. This feature provides a unified mechanism to provision, rotate, and apply TLS certificates for Ceph services, supporting both user-provided and automatically generated cephadm-signed certificates. As part of this feature, certmgr periodically checks the status of all certificates managed by Cephadm and issues health warnings for any that are nearing expiration, misconfigured, or invalid. This improves Ceph cluster security and simplifies certificate management through automation and proactive alerts.

Bugzilla:2351287

5.2. Ceph Dashboard

New OAuth2 SSO

OAuth2 SSO uses the oauth2-proxy service to work with the Ceph Management gateway (mgmt-gateway), providing unified access and improved user experience.

Bugzilla:2312560

5.3. Ceph Object Gateway

Bucket logging support for Ceph Object Gateway with bug fixes and enhancements

Bucket logging was introduced in Red Hat Ceph Storage 8.0. Bucket logging provides a mechanism for logging all access to a bucket. The log data can be used to monitor bucket activity, detect unauthorized access, get insights into the bucket usage and use the logs as a journal for bucket changes. The log records are stored in objects in a separate bucket and can be analyzed later. Logging configuration is done at the bucket level and can be enabled or disabled at any time. The log bucket can accumulate logs from multiple buckets. The configured prefix may be used to distinguish between logs from different buckets.

For performance reasons, even though the log records are written to persistent storage, the log object appears in the log bucket only after a configurable amount of time or when reaching the maximum object size of 128 MB. Adding a log object to the log bucket is done in such a way that if no more records are written to the object, it might remain outside of the log bucket even after the configured time has passed.

There are two logging types: standard and journal. The default logging type is standard.

When set to standard, the log records are written to the log bucket after the bucket operation is completed. As a result, the logging operation can fail with no indication to the client.

When set to journal, the records are written to the log bucket before the bucket operation is complete. As a result, the operation does not run if the logging action fails and an error is returned to the client.

You can complete the following bucket logging actions: enable, disable, and get.
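These actions map onto the standard S3 bucket-logging calls; a sketch with the AWS CLI (bucket names and the endpoint URL are illustrative):

```shell
ENDPOINT=http://rgw.example.com:8080   # illustrative endpoint

# Enable logging on a bucket, directing records to a log bucket:
aws --endpoint-url $ENDPOINT s3api put-bucket-logging --bucket mybucket \
  --bucket-logging-status '{"LoggingEnabled":{"TargetBucket":"log-bucket","TargetPrefix":"mybucket/"}}'

# Get the current logging configuration:
aws --endpoint-url $ENDPOINT s3api get-bucket-logging --bucket mybucket

# Disable logging by applying an empty configuration:
aws --endpoint-url $ENDPOINT s3api put-bucket-logging --bucket mybucket \
  --bucket-logging-status '{}'
```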

Red Hat Ceph Storage 8.1 enhancements introduce several improvements to bucket logging, including support for source and destination buckets across different tenants, suffix/prefix-based key filtering, and standardized AWS operation names in log records. A new REST-based flush (POST) API has been added, along with the bucket logging info admin command for retrieving logging configurations.

Fixes address concurrency issues causing multiple temporary objects, missing object size in certain cases, and retry attributes in race conditions. Additional safeguards now ensure that source and log buckets are distinct and that log buckets do not have encryption. Cleanup mechanisms have been improved to remove pending objects when source buckets are deleted, logging is disabled or reconfigured, or when target buckets are removed. Logging records now include missing fields related to authentication and transport layer information, ensuring more comprehensive logging capabilities.

Bugzilla:2308169, Bugzilla:2341711

Restore objects transitioned to a remote cloud endpoint back into Ceph Object Gateway using the cloud-restore feature

With this release, the cloud-restore feature is implemented. This feature allows users to restore objects transitioned to a remote cloud endpoint back into Ceph Object Gateway, using either the S3 restore-object API or by rehydrating through read-through options.
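A sketch of the S3 restore-object path with the AWS CLI (the endpoint, bucket, and key are illustrative); the read-through alternative simply issues a GET on the transitioned object:

```shell
# Request a temporary restore of a cloud-transitioned object for 7 days:
aws --endpoint-url http://rgw.example.com:8080 s3api restore-object \
  --bucket mybucket --key archived/object.dat \
  --restore-request '{"Days":7}'

# Check the restore status via HeadObject:
aws --endpoint-url http://rgw.example.com:8080 s3api head-object \
  --bucket mybucket --key archived/object.dat
```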

Bugzilla:2293539

Chapter 6. Asynchronous errata updates

This section describes the bug fixes, known issues, and enhancements of the z-stream releases.

6.1. Red Hat Ceph Storage 8.1z1

Red Hat Ceph Storage release 8.1z1 is now available. The security updates and bug fixes that are included in the update are listed in the RHSA-2025:11749 advisory.

6.2. Red Hat Ceph Storage 8.1z2

Red Hat Ceph Storage release 8.1z2 is now available. The bug fixes that are included in the update are listed in the RHBA-2025:14015 and RHBA-2025:13981 advisories.

6.3. Red Hat Ceph Storage 8.1z3

Red Hat Ceph Storage release 8.1z3 is now available. The bug fixes that are included in the update are listed in the RHBA-2025:17047 and RHBA-2025:17048 advisories.

6.3.1. Enhancements

This section lists all the major updates and enhancements introduced in this release of Red Hat Ceph Storage.

6.3.1.1. Ceph Object Gateway

Enhanced conditional operations

This enhancement introduces support for conditional PUT and DELETE operations, including bulk and multi-delete requests. These conditional operations improve data consistency for some workloads.

Note

The conditional InitMultipartUpload is not implemented in this release.

Bugzilla:2375001, Bugzilla:2383253

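As an illustration with recent AWS CLI versions (the endpoint, bucket, key, and ETag are placeholders), a conditional PUT can guard against overwrites:

```shell
# Create the object only if it does not already exist:
aws --endpoint-url http://rgw.example.com:8080 s3api put-object \
  --bucket mybucket --key data.bin --body ./data.bin --if-none-match '*'

# Overwrite only if the stored ETag still matches (placeholder ETag):
aws --endpoint-url http://rgw.example.com:8080 s3api put-object \
  --bucket mybucket --key data.bin --body ./data.bin \
  --if-match '"5d41402abc4b2a76b9719d911017c592"'
```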
Rate limits now applied for LIST and DELETE requests

LIST and DELETE requests are sub-operations of GET and PUT, respectively, but are typically more resource-intensive.

With this enhancement, it is now possible to configure rate limits for LIST and DELETE requests independently or in conjunction with existing GET and PUT rate limits. This provides more flexible granularity in managing system performance and resource usage.
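A sketch with radosgw-admin; the existing read/write flags are real, but the LIST/DELETE option names below are assumptions for illustration only, so check the radosgw-admin help for the exact flags:

```shell
# Existing per-user read/write operation limits:
radosgw-admin ratelimit set --ratelimit-scope=user --uid=testuser \
  --max-read-ops=1024 --max-write-ops=256

# Independent LIST/DELETE limits; these option names are hypothetical
# placeholders for the new settings introduced by this enhancement:
radosgw-admin ratelimit set --ratelimit-scope=user --uid=testuser \
  --max-list-ops=64 --max-delete-ops=64

# Enable the configured rate limit for the user:
radosgw-admin ratelimit enable --ratelimit-scope=user --uid=testuser
```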

Bugzilla:2389281

6.4. Red Hat Ceph Storage 8.1z4

Red Hat Ceph Storage release 8.1z4 is now available. The bug fixes that are included in the update are listed in the RHSA-2025:21068 and RHSA-2025:21203 advisories.

6.4.1. Known issues

This section documents known issues found in this release of Red Hat Ceph Storage.

6.4.1.1. The Cephadm utility

QAT cannot be used for TLS offload or acceleration mode together with SSL set

Enabling QAT on HAProxy with SSL enabled injects legacy OpenSSL engine directives. The legacy OpenSSL engine path breaks the TLS handshake, emitting a tlsv1 alert internal error. With the TLS handshake broken, TLS termination fails.

As a workaround, disable QAT in HAProxy in order to keep the TLS handshake working. Set the configuration file specifications as follows:

  • haproxy_qat_support: false
  • ssl: true

As a result, QAT is disabled and the HAProxy TLS works as expected.
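For example, in a cephadm ingress service specification (the service names, host, and virtual IP are illustrative), the two settings sit alongside the existing SSL options:

```shell
# Illustrative ingress spec; only haproxy_qat_support and ssl are the
# settings from the workaround, the rest are placeholder values.
cat > ingress-rgw.yaml <<'EOF'
service_type: ingress
service_id: rgw.default
placement:
  hosts:
    - host1
spec:
  backend_service: rgw.default
  virtual_ip: 192.0.2.10/24
  frontend_port: 443
  ssl: true                   # keep TLS termination enabled
  haproxy_qat_support: false  # disable QAT offload to avoid the failure
EOF

ceph orch apply -i ingress-rgw.yaml
```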

Note

Under heavy connection rates, higher CPU usage may be seen compared to QAT-offloaded handshakes.

Bugzilla:2373189

6.5. Red Hat Ceph Storage 8.1z5

Red Hat Ceph Storage release 8.1z5 is now available. This release includes numerous security updates, bug fixes, and a known issue.

6.5.1. Notable bug fixes

This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.

For a full list of bug fixes in this release, see All bug fixes.

6.5.1.1. Ceph File System (CephFS)

The ceph tell command now displays proper error messages for the wrong MDS type

Previously, the ceph tell command did not display a proper error message if the MDS type was incorrect. As a result, the command failed with no error message, and it was difficult to understand what was wrong with the command.

With this fix, the ceph tell command returns an appropriate error message, stating "unknown <type_name>", when an incorrect MDS type is used.

(IBMCEPH-11012)

Updated subvolume removal workflow to prevent inconsistent states

Previously, removing a subvolume in a full-cluster condition could leave the subvolume in an invalid state.

With this fix, the subvolume removal workflow has been updated so that metadata is now updated before the UUID directory is moved to the .trash directory. This change ensures that any ENOSPC error is detected during the metadata update, allowing the operation to fail safely and preventing an inconsistent state.

As a result, the system no longer leaves subvolumes in a partially removed or invalid state, and subsequent subvolume operations complete successfully.

(IBMCEPH-9439)

readdir requests now complete successfully with directory listings working as expected

Previously, on big-endian systems, a bug in the Ceph MDS caused incorrect encoding of directory fragments. As a result, the CephFS kernel driver received an invalid directory-fragment value. This caused the driver to repeatedly send readdir requests without completing them. User-initiated ls commands did not finish.

With this fix, the directory-fragment encoding now uses a common endianness format, and the system automatically detects and handles fragments that were created with the previous incorrect encoding.

(IBMCEPH-12782)

Early return added to prevent NULL dereference in MDS

Previously, the MDS could crash when a NULL pointer was dereferenced.

With this fix, the logic now returns early when the MDRequestRef is NULL instead of dereferencing it. As a result, crashes caused by this condition are prevented, and MDS stability is improved.

(IBMCEPH-12892)

Improved xattr dump handling to prevent MDS crash

Previously, the MDS could crash when handling xattrs in CInode.cc due to empty bufptr values being dumped.

With this fix, the code now checks whether the buffer contains data before dumping it, and explicitly dumps an empty string when the buffer length is zero. This prevents spurious empty buffer entries and ensures safe handling of xattr values. As a result, xattr dumps are cleaner and more accurate, and the MDS no longer crashes in this scenario.

(IBMCEPH-12738)

6.5.1.2. Ceph Object Gateway (RGW)

Object unlink handling updated for zero-shard configuration

Previously, the system did not correctly handle object unlink operations in specific zero-shard configurations.

With this fix, the code has been updated to ensure proper handling when both bucket_index_max_shards and the bucket's num_shards are set to 0. As a result, object unlink operations now succeed in this scenario.

(IBMCEPH-12702)

Tenant user policy and role-based permissions now work as expected after upgrade

Previously, some policy or role-based permissions involving legacy tenant users behaved differently after upgrading to releases that support IAM accounts. As a result, expected access grants would fail.

With this fix, a configuration option has been introduced to allow backward compatibility with previous version behavior.

(IBMCEPH-12352)

6.5.2. All bug fixes

This section provides a complete listing of all bug fixes in this release of Red Hat Ceph Storage.

Issue key | Severity | Summary
IBMCEPH-10423 | Critical | Multisite deployment using rgw module is failing with the timeout error on secondary site
IBMCEPH-12352 | Critical | Observing 403 error for multi-part request while other requests like 'cp' etc. are working fine
IBMCEPH-12686 | Critical | Syncing stopped "daemon_health: UNKNOWN" post shutdown and recovery of managed cluster [8.1z]
IBMCEPH-12761 | Critical | Shallow Clone does not work as expected when an RWX clone is in progress
IBMCEPH-12844 | Critical | pg_autoscaler is calculating correctly but not implementing PG counts and changes, due to high threshold
IBMCEPH-12877 | Critical | A few RBD images report error due to incomplete group snapshots on the secondary cluster after workload deployment [8.1z]
IBMCEPH-11812 | Important | MGR crashes during CephFS system test due to assertion failure in src/common/RefCountedObj.cc:14
IBMCEPH-12433 | Important | Accessing RGW ratelimit for user fails with error: failed to get a ratelimit for user id: 'UID', errno: (2) No such file or directory
IBMCEPH-12585 | Important | cephadm crashes and doesn't recover with ganesha-rados-grace tool failed: Failure: -126
IBMCEPH-12705 | Important | Observing slow_ops on a mon daemon post site down tests in a 3 AZ cluster
IBMCEPH-12732 | Important | OSD crashes with ceph_assert(diff <= bytes_per_au[pos])
IBMCEPH-12738 | Important | MDS crashed executing asok_command: dump tree with assert ceph::__ceph_assert_fail(char const*, char const*, int, char const*)
IBMCEPH-12782 | Important | Application Pod stays in Init state as the CephFS VolumeAttachment doesn't complete
IBMCEPH-12803 | Important | Unable to see old metrics using Loki query after upgrade from 6 to 7
IBMCEPH-12816 | Important | Stale OLH/plain index entries with pending_removal=true in versioned buckets
IBMCEPH-12828 | Important | Unexpected error getting (earmark|encryption tag): error in getxattr: No data available [Errno 61]
IBMCEPH-12867 | Important | rbd-mirror daemon restart fails to resume partially synced demote snapshot synchronization on secondary [8.1z]
IBMCEPH-12892 | Important | ceph-mds crashed - mds-rank-fin
IBMCEPH-12904 | Important | ceph-crash not authenticating with cluster correctly
IBMCEPH-12998 | Important | Group replayer shutdown can hang in the face of active m_in_flight_op_tracker ops
IBMCEPH-13000 | Important | Bug 2411968: Group replayer shutdown can hang in the face of active m_in_flight_op_tracker ops [8.1z]
IBMCEPH-13002 | Important | RPMInspect fails on executable stack
IBMCEPH-13123 | Important | radosgw-admin 'bucket rm --bypass-gc' ignores refcount (can lead to DL)
IBMCEPH-7933 | Important | mon_memory_target is ignored at startup when set without mon_memory_autotune in the config database
IBMCEPH-9439 | Important | Clone In-Progress operations start and error out in a loop
IBMCEPH-11011 | Moderate | Error message is not descriptive for ceph tell command
IBMCEPH-11012 | Moderate | Error message is not descriptive for ceph tell command
IBMCEPH-12442 | Moderate | Prometheus module error causing cluster to go to HEALTH_ERR
IBMCEPH-12579 | Moderate | Segmentation fault in OSD: tick_without_osd_lock()
IBMCEPH-12702 | Moderate | When "bucket_index_max_shards" is set to 0 in the zone group and the bucket has num_shards 0, the "object unlink" fails

6.5.3. Security fixes

This section lists security fixes from this release of Red Hat Ceph Storage.

For details about each CVE, see the CVE Records at www.cve.org.

  • CVE-2019-10790
  • CVE-2021-23358
  • CVE-2022-34749
  • CVE-2024-31884
  • CVE-2024-51744
  • CVE-2024-55565
  • CVE-2025-7783
  • CVE-2025-12816
  • CVE-2025-26791
  • CVE-2025-47907
  • CVE-2025-47913
  • CVE-2025-52555
  • CVE-2025-58183
  • CVE-2025-66031
  • CVE-2025-66418
  • CVE-2025-66471
  • CVE-2025-68429

6.5.4. Known issues

This section documents known issues found in this release of Red Hat Ceph Storage.

6.5.4.1. Ceph Object Gateway multi-site

Bucket index shows stale metadata after lifecycle expiration in versioned buckets

In rare cases, when lifecycle expiration removes objects from versioned buckets, some omap entries in the bucket index might remain even though the objects have already been removed.

If many leftover keys accumulate, the following error is emitted: (27) File too large. This inconsistency can affect tools or processes that depend on accurate bucket index listings.

As a workaround:

  1. Scan the bucket for leftover keys:

    radosgw-admin bucket check olh --bucket=BUCKET_NAME --dump-keys --hide-progress
  2. Remove the leftover omap entries:

    radosgw-admin bucket check olh --bucket=BUCKET_NAME --fix

(IBMCEPH-12980)

Chapter 7. Sources

The updated Red Hat Ceph Storage source code packages are available at the following location:

Legal Notice

Copyright © Red Hat.
Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution-Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.
The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.
All other trademarks are the property of their respective owners.