How can I test if my hard disk is going bad?
If /var/log/messages shows several I/O errors, or you simply suspect that a hard disk may be failing, smartctl can help check it. Hard disks can fail without warning, so always keep recent backups of all important data. Keep in mind that even when a current or impending failure is detected, there may not be enough time left to back up the data.
S.M.A.R.T. stands for Self-Monitoring, Analysis and Reporting Technology.
First, enable S.M.A.R.T. support in the BIOS.
Next, install the needed packages to run /usr/sbin/smartctl. In Red Hat Enterprise Linux 4, it is provided by the kernel-utils package. In Red Hat Enterprise Linux 5, it is provided by the smartmontools package.
Check whether your hard disks support S.M.A.R.T.:
smartctl -i /dev/xxx
Replace /dev/xxx with the hard disk of interest when using the commands outlined in this article.
For SATA drives, use:
smartctl -i -d ata /dev/xxx
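The support check above can also be scripted. The following is a minimal sketch, not part of smartctl itself: the function name smart_support_state is hypothetical, and it assumes the "SMART support is:" wording that smartctl -i typically prints for ATA/SATA drives. Other drive types may word this differently, in which case the function reports "unknown".

```shell
# Classify S.M.A.R.T. support from `smartctl -i` output read on stdin.
# Assumption: the drive reports lines beginning with "SMART support is:",
# as smartctl typically prints for ATA/SATA devices.
smart_support_state() {
    awk '
        /^SMART support is: *Enabled/     { enabled = 1 }
        /^SMART support is: *Available/   { available = 1 }
        /^SMART support is: *Unavailable/ { unavailable = 1 }
        END {
            if (enabled)          print "enabled"
            else if (available)   print "available"    # supported, not reported as enabled
            else if (unavailable) print "unsupported"
            else                  print "unknown"      # wording not recognised
        }
    '
}

# Usage (as root), with /dev/xxx replaced by the disk of interest:
#   smartctl -i /dev/xxx | smart_support_state
```

A result of "available" suggests the drive supports S.M.A.R.T. but it still needs to be switched on.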
Enable S.M.A.R.T. support with:
smartctl -s on /dev/xxx
Or for SATA drives:
smartctl -s on -d ata /dev/xxx
Running the following command as root gives a quick PASS/FAIL result, but the more thorough self-test discussed below is generally more conclusive:
smartctl -H /dev/xxx
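When the health check is run from a script, the exit status of smartctl is often easier to act on than its text output. The sketch below decodes that status as a bitmask, following the bit meanings listed in the smartctl manual. The function name describe_smartctl_status and the shortened message wording are illustrative, not part of smartctl.

```shell
# Describe which bits are set in a smartctl exit status.
# Bit meanings are paraphrased from the smartctl manual; the function
# name and message wording are hypothetical.
describe_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
        return 0
    fi
    bit=0
    for msg in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        # Test one bit of the status at a time, lowest bit first.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $msg"
        fi
        bit=$((bit + 1))
    done
}

# Usage (as root):
#   smartctl -H /dev/xxx; describe_smartctl_status $?
```

Several bits can be set at once, so one run may print more than one line.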
To start a long self-test in the background, run the following as root:
smartctl -t long /dev/xxx
To access the results once the test has finished, use the following command:
smartctl -a /dev/xxx
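The output of smartctl -a is long, and the self-test results are only one part of it. As a rough sketch, the helpers below pick out the most recent self-test entry and check whether it completed without error. The function names are hypothetical, and the parsing assumes the self-test log layout typically printed for ATA/SATA drives, where the newest entry is numbered "# 1". The log on its own can also be printed with smartctl -l selftest.

```shell
# Print the most recent self-test log entry from smartctl output on stdin.
# Assumption: entries start with "# 1", "# 2", ... with the newest first,
# as in the typical ATA/SATA self-test log.
latest_selftest() {
    grep -E '^# *1 ' | head -n 1
}

# Succeed only if the most recent entry reports completion without error.
latest_selftest_ok() {
    latest_selftest | grep -q 'Completed without error'
}

# Usage (as root):
#   smartctl -l selftest /dev/xxx | latest_selftest
#   smartctl -a /dev/xxx | latest_selftest_ok && echo "last self-test passed"
```

If no entry is found, for example because no test has been run yet, latest_selftest_ok fails rather than reporting a pass.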
To learn more about the options that can be used with smartctl, view the manual:
man smartctl