The rpm command may core dump with SIGBUS during and after system startup.

Environment

  • Red Hat Enterprise Linux 7
  • rpm package
  • systemd

Issue

  • The rpm command may core dump with SIGBUS if /var/lib/rpm/__db.002 or /var/lib/rpm/__db.003 is a zero-length file.

For example:

```
# ls -l /var/lib/rpm
total 92252
-rw-r--r--. 1 root root  3317760 Jul  4 10:32 Basenames
-rw-r--r--. 1 root root    16384 Jun 30 15:37 Conflictname
-rw-r--r--. 1 root root   286720 Jul 14 18:10 __db.001
-rw-r--r--. 1 root root        0 Jul 14 18:10 __db.002 <---
-rw-r--r--. 1 root root  1318912 Jul 14 18:11 __db.003 
or
-rw-r--r--. 1 root root    90112 Jul 14 18:10 __db.002
-rw-r--r--. 1 root root        0 Jul 14 18:11 __db.003 <---
```

In this situation, running rpm -qa (or rpm with any other options) core dumps with SIGBUS.

```
# rpm -qa
Bus error
```
  • This issue was found when the rpm command was run from a script started by a 3rd party software service unit. Among other things, that service unit had After=local-fs.target and DefaultDependencies=no in the [Unit] section of the service file.
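
To check whether a system is in the state shown in the listing above, the zero-length region files can be searched for directly. The following is a minimal sketch, not an official diagnostic: the function name find_empty_rpm_regions is made up for illustration, and it assumes a find implementation that supports -maxdepth and -empty (GNU find, as shipped with Red Hat Enterprise Linux 7, does).

```shell
# List zero-length __db.* region files in an rpm database directory.
# Prints one path per line and prints nothing when no such file exists.
# The function name is illustrative only; it is not part of rpm.
find_empty_rpm_regions() {
    # Default to the standard rpm database location when no argument is given.
    find "${1:-/var/lib/rpm}" -maxdepth 1 -type f -name '__db.*' -empty
}
```

Running find_empty_rpm_regions with no arguments inspects /var/lib/rpm. Any path it prints is a candidate cause of the Bus error described above.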

Resolution

  • To recover from this situation, you should follow the instructions in the following solution article: How to rebuild RPM database on a Red Hat Enterprise Linux system?

  • Alternatively, you can run the following command. This assumes you can guarantee that nobody runs the rpm command while these files are being removed and that the rpm database was not damaged in any other way.

    # rm /var/lib/rpm/__db.00{1..3}
    
  • If the issue recurs, follow the full instructions in the solution article above. Alternatively, you can address the issue in the service unit definition and reboot the system (the files are removed on the next boot).

  • This issue has been resolved in libdb-5.3.21-24.el7 (RHBA-2018:0952) and later. It was tracked in the private bug RHBZ#1471011.

Root Cause

  • The service definition whose script saw the rpm command core dump had DefaultDependencies set to no. The default for DefaultDependencies is yes. When it is set to no, the default dependencies are no longer added automatically, and the only dependencies used are those given explicitly in the [Unit] section of the service definition. One of the default dependencies is an ordering on basic.target, which means that a service with DefaultDependencies left at its default is not started before basic.target is reached.

    Important: If you set DefaultDependencies to no, Red Hat strongly recommends that you be skilled in writing unit definitions and fully aware of the impact this may have on any commands your service calls. You must provide every explicit dependency that your service requires to start successfully. You must also understand what else may be executed concurrently given those dependencies and how that may affect your service (including what races may occur).

In general, DefaultDependencies should not be changed from its default value without a good reason and careful consideration.

In the transition from local-fs.target to sysinit.target, systemd runs a service unit that cleans temporary files from the local filesystem (systemd-tmpfiles-setup.service). Among the activities of that service is the removal of the files /var/lib/rpm/__db* at boot, which is configured here:

```
# cat /usr/lib/tmpfiles.d/rpm.conf
r /var/lib/rpm/__db.*
```

A race can occur between the 3rd party script running the rpm command and systemd-tmpfiles-setup.service, because the /var/lib/rpm/__db* files may be removed while the rpm command is using them. The race can leave a zero-length __db.002 or __db.003 file in /var/lib/rpm, which causes all subsequent rpm commands to core dump. Because it is a race, the core dump was not seen at every boot.

This issue does not occur on RHEL 6 because the files /var/lib/rpm/__db* are removed by the script /etc/rc.d/rc.sysinit before any system startup scripts are run.

 **Important**: More generally, Red Hat recommends that any systemd unit that has `DefaultDependencies` set to `no` and calls a script that creates or uses temporary files (or calls a command that does so) that can be removed by `systemd-tmpfiles-setup.service` should have an explicit dependency of `After=sysinit.target`, to prevent potential races caused by the removal of temporary files during system startup.
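
One way to look for units that fall under this recommendation is a plain text scan of the unit files. The sketch below is a rough heuristic under stated assumptions, not a systemd feature: the function name is made up, only *.service files in a single directory are read, drop-ins and line continuations are ignored, and a unit that is ordered after sysinit.target indirectly (for example through another target) is still reported.

```shell
# Print service unit files in a directory that set DefaultDependencies
# to a false value but do not name sysinit.target on any After= line.
# Heuristic text match only; the function name is illustrative.
list_units_missing_sysinit_ordering() {
    unit_dir=${1:-/etc/systemd/system}
    for f in "$unit_dir"/*.service; do
        [ -f "$f" ] || continue
        # Skip units that keep the default dependencies.
        grep -Eq '^[[:space:]]*DefaultDependencies[[:space:]]*=[[:space:]]*(no|false|off|0)[[:space:]]*$' "$f" || continue
        # Skip units that already order themselves after sysinit.target.
        grep -Eq '^[[:space:]]*After[[:space:]]*=.*sysinit\.target' "$f" && continue
        printf '%s\n' "$f"
    done
}
```

Each file it prints still needs manual review to decide whether the unit, or anything it executes, touches files managed by systemd-tmpfiles-setup.service.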

The issue in the 3rd party service definition was resolved by changing the local-fs.target to sysinit.target in the After= line (in this particular case DefaultDependencies could not be removed to leave it at the default value).
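
In the case above the After= line of the unit file itself was edited. Where a vendor unit file should not be modified, a drop-in can add the same ordering instead. This is an alternative sketched here for illustration, not the change that was made in the original case. The function name, the drop-in file name 10-after-sysinit.conf, and the unit name used in the usage note are all hypothetical.

```shell
# Write a drop-in that adds After=sysinit.target to a unit.
# $1: unit name (for example a 3rd party service unit)
# $2: systemd configuration root, defaults to /etc/systemd/system
# The function name and the drop-in file name are illustrative only.
write_sysinit_ordering_dropin() {
    unit=$1
    root=${2:-/etc/systemd/system}
    dropin_dir="$root/$unit.d"
    mkdir -p "$dropin_dir" || return 1
    # After= settings accumulate, so an existing After=local-fs.target
    # in the unit file remains in effect alongside this line.
    cat > "$dropin_dir/10-after-sysinit.conf" <<'EOF'
[Unit]
After=sysinit.target
EOF
}
```

For a hypothetical unit named example-agent.service, running write_sysinit_ordering_dropin example-agent.service followed by systemctl daemon-reload would apply the ordering from the next start of the unit. The remaining After=local-fs.target is harmless because sysinit.target is reached only after local-fs.target.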

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.