The rpm command may core dump with SIGBUS during and after system startup.
Environment
- Red Hat Enterprise Linux 7
- rpm package
- systemd
Issue
- The rpm command may core dump with a SIGBUS if `/var/lib/rpm/__db.002` or `__db.003` is a zero-length file.
For example:
```
# ls -l /var/lib/rpm
total 92252
-rw-r--r--. 1 root root 3317760 Jul 4 10:32 Basenames
-rw-r--r--. 1 root root 16384 Jun 30 15:37 Conflictname
-rw-r--r--. 1 root root 286720 Jul 14 18:10 __db.001
-rw-r--r--. 1 root root 0 Jul 14 18:10 __db.002 <---
-rw-r--r--. 1 root root 1318912 Jul 14 18:11 __db.003
or
-rw-r--r--. 1 root root 90112 Jul 14 18:10 __db.002
-rw-r--r--. 1 root root 0 Jul 14 18:11 __db.003 <---
```
In this situation, running `rpm -qa` (or rpm with any other options) core dumps with SIGBUS.
```
# rpm -qa
Bus error
```
- This issue was found when the rpm command was run from a script defined in a third-party software service unit. That service unit had (among other things) `After=local-fs.target` and `DefaultDependencies` set to `no` in the `[Unit]` section of the service file.
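Because rpm itself crashes in this state, it can help to check for the zero-length files without invoking rpm. The following is a minimal sketch only; the function name `find_empty_rpm_env_files` is hypothetical and not part of any Red Hat tooling. It prints the path of `__db.002` or `__db.003` if that file exists and is empty, and prints nothing otherwise.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with rpm or RHEL): print any zero-length
# __db.002 / __db.003 file found in the given directory.
# The directory defaults to /var/lib/rpm when no argument is given.
find_empty_rpm_env_files() {
    local dir="${1:-/var/lib/rpm}"
    local f
    for f in "$dir"/__db.002 "$dir"/__db.003; do
        # -f: the path is an existing regular file; ! -s: its size is zero
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `find_empty_rpm_env_files` with no argument checks `/var/lib/rpm`; any output indicates the condition described in this section.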
Resolution
- To recover from this situation, follow the instructions in the solution article: How to rebuild RPM database on a Red Hat Enterprise Linux system?
- Alternatively, you can run the following command (this assumes you can guarantee that nobody runs the rpm command while these files are being removed and that the rpm database was not damaged in any other way).
  `# rm /var/lib/rpm/__db.00{1..3}`
- If the issue recurs, follow the full instructions in the solution article above. Alternatively, you can address the issue in the service unit definition and reboot the system (the files will be removed on the next boot).
- This issue has been resolved in libdb-5.3.21-24.el7 (RHBA-2018:0952) or later. It was tracked in the private RHBZ#1471011.
Root Cause
The service definition that saw the rpm command core dump had `DefaultDependencies` set to `no`. The default for `DefaultDependencies` is `yes`; when you change it to `no`, the default dependencies are no longer added automatically. The only dependencies used are those given explicitly in the `[Unit]` section of the service definition. One of the default dependencies is on `basic.target`, which means that when `DefaultDependencies` is left at its default, a service will not be started before `basic.target` is reached.
**Important**: Red Hat strongly recommends that if you change `DefaultDependencies` to `no`, you should be skilled in writing unit definitions and be fully aware of the impact this may have on any commands that you call. You must provide the explicit dependencies that your service requires to start successfully. You must also understand what else may be executed concurrently based on those dependencies and how that may affect your service (including what races may occur).
In general, `DefaultDependencies` should not be changed from its default value without good reason and careful consideration.
In the transition from local-fs.target to sysinit.target, systemd has a service unit defined that cleans temporary files from the local filesystem (systemd-tmpfiles-setup.service). Among the activities of that service is removing the files /var/lib/rpm/__db* at boot, which is configured here:
```
# cat /usr/lib/tmpfiles.d/rpm.conf
r /var/lib/rpm/__db.*
```
The race can occur between the 3rd party script running the rpm command and systemd-tmpfiles-setup.service because /var/lib/rpm/__db* files may be removed while the rpm command is attempting to use them. The core dump issue was not seen at every boot. The race can lead to a zero length __db.002 or __db.003 file being created in /var/lib/rpm. This causes all subsequent rpm commands to core dump.
This issue does not occur on RHEL 6 because there the files /var/lib/rpm/__db* are removed by the script /etc/rc.d/rc.sysinit before any system startup scripts are run.
**Important**: More generally, Red Hat recommends that any script called by a systemd unit that has `DefaultDependencies` set to `no`, and that creates or uses temporary files (or calls a command that does so) that can be removed by `systemd-tmpfiles-setup.service`, should have an explicit dependency of `After=sysinit.target` to prevent potential races caused by the removal of temporary files during system startup.
The issue in the 3rd party service definition was resolved by changing the local-fs.target to sysinit.target in the After= line (in this particular case DefaultDependencies could not be removed to leave it at the default value).
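The change described above can be illustrated with a minimal `[Unit]` section. This is a sketch only: the `Description` value is a placeholder, and the remainder of the third-party unit, which this article does not show, is omitted.

```ini
[Unit]
# Placeholder description; the real third-party unit is not shown in this article.
Description=Example third-party service
# Default dependencies (including the ordering after basic.target) are disabled.
DefaultDependencies=no
# Previously After=local-fs.target. Ordering after sysinit.target is intended to
# ensure systemd-tmpfiles-setup.service has finished before this service starts.
After=sysinit.target
```

With this ordering, the script that calls rpm should no longer run while the `/var/lib/rpm/__db.*` files are being removed.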