SMART
What is SMART?
Locating bad sectors in an LVM2 partition
smartctl commands
smartd
I/O Speed
SATA
SATA and hdparm
SATA and NCQ
Partitioning and Formatting
Choice of file systems
Repartitioning
Spin-down
SCSI Emulation
Moving to a new drive checklist
hdparm was very slow. Then I found a lot of these errors in /var/log/messages:
hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=357843, high=0, low=357843, sector=357839
ide: failed opcode was: unknown
end_request: I/O error, dev hdb, sector 357839
Some on the web had seen these messages before and advised they spelled disaster, that the disk was close to death and/or the power supply was bad. Simply incorrect. The disk's internal drive diagnostics were telling me that the problem was a bad sector the drive could not recover on its own.
Internal drive diagnostics? Recover on its own? How's that work?
Most modern disks have built-in monitoring software, called SMART, that predicts future failure based on current and past operation. SMART monitors both the disk as a whole and individual disk sectors (usually 512 bytes). When a disk sector is nearing failure, the drive automatically moves the data to a spare sector. How does it know that failure is imminent? When you store 512 bytes, more than 512 bytes are actually used by the drive. The extra data are redundant and are used to verify the integrity of your data (like a checksum), and there is enough redundancy that the drive can actually correct some number of read errors. Although your main processor never sees the errors, the drive knows whether corrections were necessary and how many. More corrections mean more problems with a sector, and that sector failure is coming. CDs and DVDs do this, too, except they can't move data out of bad sectors.
My problem arose because a sector suffered more damage than could be corrected before it was next read. Because the drive hadn't visited the sector in a while, suddenly there were too many errors to correct. The data was gone, and the drive couldn't deal with it on its own. The solution was to manually locate the damaged disk sector and write new data into it. Replacing the sector contents gave the drive the opportunity to relocate the sector, and the drive then ensures the bad sector will never be used again.
To locate the bad sector, I used smartmontools' smartctl. It verified the bad sector location, and verified that I corrected the problem. I now use smartd to periodically test each drive and notify me of health changes. smartctl also told me that the disk had been on for 7,839 hours (nearly a year), had been powered on 535 times, temperatures had been well within normal operating limits, and that this disk was relatively young.
To locate the bad sector, use the Bad Block How-To on the smartmontools site. The instructions apply to an ext2/ext3 filesystem formatted directly on a disk partition. My problem was made slightly more complex because my partitions are managed by LVM2.
First off, read the Bad Block How-To and understand how it works without LVM. Then do this:
Follow the How-To's First Step to locate the disk partition containing the bad sector. Record the name of the partition containing the bad sector and the partition's starting sector, S, and the sector size, T (usually 512 bytes). If the partition type is "Linux LVM", don't bother looking in /etc/fstab or calculating the sector offset.
In my case, my disk partition is /dev/hdb1, S = 63 sectors, T = 512 bytes, and my bad sector is L = 357843.
Locate the physical volume associated with the disk partition. Refer to /etc/lvm/backup/volume-group-name and record the extent_size and pe_start. pe_start is the offset from S where the first logical volume starts.
In my case, extent_size = 8192 and pe_start = 384
Locate the logical volume containing the bad sector. This is the difficult part. Logical volumes may be non-contiguous, striped across multiple physical volumes, and/or not "linear". That makes things complicated. Fortunately (for me), the bad sector was located in a contiguous logical volume which was not striped. It helps to convert the logical volume's start_extent and extent_count from LVM extents into sectors:
| sector-offset = extents * extent_size | (1) |
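If your logical volume is contiguous, the bad sector's offset from the beginning of the filesystem, SFS, can be calculated from the bad sector L and the logical volume's starting offset in sectors, SLV (its start_extent converted with equation (1)):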
| SFS = L - SLV - pe_start - S | (2) |
In my case, this works out to SFS = 357843 - 0 - 384 - 63 = 357396.
If your logical volume is non-contiguous, equation (2) is a good start, but not complete. It produces an offset further into the file system than the sector actually is. You must subtract the amount of space between contiguous regions (because these are gaps not used by your filesystem). Use equation (1) to figure out how many sectors to subtract, and then convert these into file system blocks with (2). If your non-contiguous regions do not follow each other (that is, a later portion of the file system precedes an earlier portion on physical disk), your math will be more complicated. I suggest drawing a diagram of the physical disk, and working out where your logical volumes are located.
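For the contiguous case, equations (1) and (2) can be sketched as two small shell functions. The function names are made up for illustration, and the example call assumes start_extent = 0, which is consistent with SLV = 0 in my numbers above. This is only arithmetic; it does not touch the disk.

```shell
#!/bin/bash
# Equation (1): convert a count of LVM extents into sectors.
# extent_size is taken in sectors, as recorded in /etc/lvm/backup.
extents_to_sectors() {
    local extents=$1 extent_size=$2
    echo $((extents * extent_size))
}

# Equation (2): offset of the bad sector from the start of the filesystem.
# Only valid for a contiguous, non-striped logical volume.
fs_sector_offset() {
    local L=$1 start_extent=$2 extent_size=$3 pe_start=$4 S=$5
    local SLV
    SLV=$(extents_to_sectors "$start_extent" "$extent_size")
    echo $((L - SLV - pe_start - S))
}

# My numbers: L=357843, start_extent=0 (assumed), extent_size=8192,
# pe_start=384, S=63
fs_sector_offset 357843 0 8192 384 63   # prints 357396
```

For a non-contiguous volume the gaps still have to be subtracted by hand, as described above.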
Find your file system's block size, B. See the How-To's Second Step. Specify your LVM file system.
In my case, this means running tune2fs -l /dev/back/backup | grep Block, resulting in B = 4096.
Calculate the File System Block containing the bad sector. This step is similar to the How-To's Third Step, but in our case the sector offset calculation has already been done.
| b = int(SFS * T / B) | (3) |
In my case, this works out to b = int(357396 * 512 / 4096) = int(44674.5) = 44674.
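Equation (3) maps directly onto shell integer arithmetic, which truncates just like int(). A minimal sketch, with a made-up function name:

```shell
#!/bin/bash
# Equation (3): file system block containing the bad sector.
#   SFS = sector offset from the start of the filesystem
#   T   = sector size in bytes
#   B   = file system block size in bytes
fs_block() {
    local SFS=$1 T=$2 B=$3
    # $(( )) performs integer division, so the result is truncated.
    echo $((SFS * T / B))
}

fs_block 357396 512 4096   # prints 44674
```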
That's it! Proceed from here with the How-To's Fourth Step.
Caution: Read through the entire How-To before modifying your disk. There's a nice part at the end which shows how to test the 70 sectors on either side of the detected bad sector:
[root] # export bad=357396
[root] # export i=$((bad-70))
[root] # while [ $i -lt $((bad+70)) ]; do
> echo $i
> dd if=/dev/back/backup of=/dev/null bs=512 count=1 skip=$i
> let i+=1
> done
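A variation on that loop, sketched as a function that prints only the sectors dd fails to read, so the bad ones stand out. The name scan_sectors and its arguments are my own invention, not something from the How-To. It only reads, never writes.

```shell
#!/bin/bash
# scan_sectors DEVICE CENTER RADIUS
# Reads each 512-byte sector from CENTER-RADIUS up to (but not including)
# CENTER+RADIUS and prints the number of every sector dd could not read.
scan_sectors() {
    local dev=$1 center=$2 radius=$3
    local i=$((center - radius))
    local end=$((center + radius))
    if [ "$i" -lt 0 ]; then
        i=0
    fi
    while [ "$i" -lt "$end" ]; do
        # dd exits non-zero on a read error; its messages are discarded.
        if ! dd if="$dev" of=/dev/null bs=512 count=1 skip="$i" 2>/dev/null; then
            echo "$i"
        fi
        i=$((i + 1))
    done
}

# Example (as root), using my logical volume and sector offset:
#   scan_sectors /dev/back/backup 357396 70
```

No output means every sector in the range was readable.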
It is also useful to verify that the file you found is actually the one containing the bad sector. Don't modify a sector which you cannot confirm as bad! The How-To suggests using md5sum filename. Simply reading the damaged file with md5sum will result in error messages appended to /var/log/messages.
After completing your repair, run smartctl -t long /dev/hd? to verify that you have no more trouble waiting. It could take several hours, but you can use the disk while it runs.
| smartctl -i /dev/hda | Prints a bunch of drive info, including whether SMART is currently enabled. |
| smartctl -d ata -i /dev/sda | "-d ata" is needed for a SATA drive. |
| smartctl -H /dev/hda | Gives a simple overall health indication. Bad health means imminent failure, take immediate action. |
| smartctl -t short /dev/hda | Start short test. Check back for results later. |
| smartctl -l selftest /dev/hda | Display self-test log. |
| smartctl -l error /dev/hda | Display error log. Five most recent non-trivial errors are shown. These are never cleared. |
| smartctl -A /dev/hda | Return vendor-specific SMART attributes. VALUE is the current normalized attribute value. WORST is the lowest recorded VALUE. THRESH is the service limit -- pay attention when VALUE or WORST approach THRESH. |
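To make "pay attention when VALUE or WORST approach THRESH" concrete, here is a rough filter for the attribute table printed by smartctl -A. It assumes the usual column order (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, ...). The function name and the default margin of 10 are arbitrary choices of mine, not anything smartmontools defines.

```shell
#!/bin/bash
# near_threshold [MARGIN]
# Reads "smartctl -A" output on stdin and prints every attribute whose
# WORST value is within MARGIN of its THRESH. Attributes with a THRESH
# of 0 are skipped, since they have no service limit to approach.
near_threshold() {
    local margin=${1:-10}
    awk -v m="$margin" '
        # Attribute rows start with a numeric ID; everything else is header.
        $1 ~ /^[0-9]+$/ && ($6 + 0) > 0 && ($5 - $6) <= m {
            print $2, "VALUE=" $4, "WORST=" $5, "THRESH=" $6
        }'
}

# Example (as root):
#   smartctl -A /dev/hda | near_threshold 10
```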
smartd is a daemon which periodically checks each drive's SMART status, can run scheduled self-tests, and logs any errors found. See /etc/smartd.conf.
Check disk I/O with hdparm -tT /dev/hda
| Drive | Device | Cached reads | Buffered disk reads |
| Seagate Barracuda 300 GB (IDE) | /dev/hda | 1.8 GB/sec | 65.6 MB/sec |
| Western Digital Caviar 250 GB (IDE) | /dev/hdb | 1.8 GB/sec | 57.9 MB/sec |
| Seagate Barracuda 500 GB (SATA) | /dev/sda | 1.8 GB/sec | 60.6 MB/sec |
Enabling DMA (-d1) is the only option that makes a difference. Other options (e.g. UDMA) are set to optimal values by the kernel driver. Gentoo configuration file: /etc/conf.d/hdparm. Playing DVDs requires DMA enabled.
SATA: I upgraded my older 250 GB IDE drive to a 500 GB SATA drive. The kernel did not recognize it. To get it to work, I did the following:
I encountered a lot of trouble with my system after installing a SATA drive. I traced the problem to hdparm. Use of hdparm for this drive caused an IRQ storm (a rapid, unending stream of interrupts from the device with no interrupt handler in the kernel). The kernel disables the IRQ after 100,000 such interrupts. This caused trouble for my USB printer because it was sharing the same IRQ. Solution: don't use hdparm on SATA drives! I removed my hdparm boot-up directives, and replaced them with the kernel boot parameters "ide0=ata66 ide1=ata66" (because I am using 80 conductor cables) -- this results in equivalent performance.
Note: Use sdparm for SCSI/SATA drives. However, hdparm -tT is OK for SATA.
NCQ = Native Command Queueing. This feature allows the kernel to send multiple outstanding requests to the disk drive rather than waiting for each one to complete before sending the next. This allows the kernel in some cases to return control to the application sooner, resulting in better performance. If your hardware supports it, this is a good thing to enable.
My SATA disk claims to support NCQ, but my motherboard does not. A disk controller supporting full AHCI is required to utilize NCQ. My motherboard has an Intel ICH5 controller, and this does not support full AHCI.
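One way to check whether NCQ is actually in use is to look at the queue depth the kernel reports for the drive. This sketch assumes your kernel exposes /sys/block/DISK/device/queue_depth (libata-era kernels do, but check your own system); a depth of 1 means commands are not being queued. The function name and the optional sysfs-root argument are my own additions.

```shell
#!/bin/bash
# ncq_status DISK [SYSFS_ROOT]
# Reports whether the kernel is queuing commands to DISK (e.g. "sda"),
# based on the queue depth it advertises in sysfs.
ncq_status() {
    local disk=$1 sysfs=${2:-/sys}
    local f="$sysfs/block/$disk/device/queue_depth"
    local depth
    if [ ! -r "$f" ]; then
        echo "unknown"
        return 1
    fi
    depth=$(cat "$f")
    if [ "$depth" -gt 1 ]; then
        echo "NCQ active (queue depth $depth)"
    else
        echo "NCQ inactive (queue depth $depth)"
    fi
}

# Example:
#   ncq_status sda
```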
Everyone wants a high performance file system. Linux's default file system, ext3, is not new technology, and several others (reiserfs, JFS2, XFS) are faster. How much faster? Various benchmarks are available, and they all indicate "somewhat faster", but not extraordinarily so. Here's a recent benchmark.
Which one to choose? I use ext3 for these reasons:
With my new, larger SATA drive installed, both of my IDE drives are now used for nightly backups. I don't need them spinning otherwise: they create extra noise and heat.
New drives on the market are larger, faster, and cheaper than what you've got now. How to move your entire system to a new drive?
| Copyright © 2003-2007 Craig Lawson | ||