Our server at Alzatex has been acting slow for a while now, and I had been unable to find a cause until today. With what I found, I’m beginning to think that we just have an old, slow hard drive in our RAID6. What do you think?
$ sudo hdparm -tT --direct /dev/sd[abcd]
/dev/sda:
Timing O_DIRECT cached reads: 420 MB in 2.00 seconds = 209.87 MB/sec
Timing O_DIRECT disk reads: 318 MB in 3.01 seconds = 105.60 MB/sec
/dev/sdb:
Timing O_DIRECT cached reads: 492 MB in 2.00 seconds = 245.53 MB/sec
Timing O_DIRECT disk reads: 268 MB in 3.10 seconds = 86.40 MB/sec
/dev/sdc:
Timing O_DIRECT cached reads: 408 MB in 2.01 seconds = 203.34 MB/sec
Timing O_DIRECT disk reads: 146 MB in 3.12 seconds = 46.76 MB/sec
/dev/sdd:
Timing O_DIRECT cached reads: 478 MB in 2.01 seconds = 238.25 MB/sec
Timing O_DIRECT disk reads: 272 MB in 3.01 seconds = 90.50 MB/sec
sdc’s cached read time looks OK, but its raw disk read is less than half that of the next slowest drive. Next up, I looked at the S.M.A.R.T. attributes of each drive. I’ve trimmed the output to only the most interesting attributes.
$ sudo smartctl -A /dev/sda
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 240 236 021 Pre-fail Always - 1000
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 23
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13603
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
$ sudo smartctl -A /dev/sdb
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 239 236 021 Pre-fail Always - 1016
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 30
9 Power_On_Hours 0x0032 067 067 000 Old_age Always - 24607
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
$ sudo smartctl -A /dev/sdc
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 200 200 051 Pre-fail Always - 2
3 Spin_Up_Time 0x0003 225 224 021 Pre-fail Always - 5741
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 48
9 Power_On_Hours 0x0032 042 042 000 Old_age Always - 42601
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 2
$ sudo smartctl -A /dev/sdd
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 239 238 021 Pre-fail Always - 1025
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 37
9 Power_On_Hours 0x0032 064 064 000 Old_age Always - 26475
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
First, a quick review of S.M.A.R.T. attributes for the uninitiated. RAW_VALUE is the actual measurement for the attribute, such as degrees Celsius or hours powered on, whereas VALUE is that measurement normalized to a scale of 1 to 253. The normalized value allows easy analysis without much regard to the meaning of the attribute: 1 represents the worst case and 253 the best. RAW_VALUE, on the other hand, often increases as things get worse, as with a count of bad sectors. WORST is the lowest VALUE ever recorded. Attributes come in two flavors, pre-fail and old-age. When a pre-fail attribute’s VALUE crosses the manufacturer-defined threshold (THRESH), failure is considered imminent. Old-age attributes just indicate wear and tear and usually don’t have a meaningful threshold.
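As a quick sketch of that pre-fail rule, here’s a one-liner that flags any pre-fail attribute whose normalized VALUE has dropped to or below its THRESH, assuming smartctl -A’s usual column layout (VALUE is field 4, THRESH field 6, TYPE field 7). The sample rows below are hypothetical; in practice you’d pipe in `sudo smartctl -A /dev/sdc` instead of the here-document:

```shell
# Flag pre-fail attributes whose normalized VALUE has reached THRESH.
# Sample rows are hypothetical; normally pipe in: sudo smartctl -A /dev/sdc
awk '$7 == "Pre-fail" && ($4 + 0) <= ($6 + 0) { print "FAILING:", $2 }' <<'EOF'
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       2
  3 Spin_Up_Time            0x0003   020   020   021    Pre-fail  Always       -       9999
EOF
# prints: FAILING: Spin_Up_Time
```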
After reviewing the S.M.A.R.T. attributes, there are definitely issues with sdc. It is the only drive showing a non-zero Raw_Read_Error_Rate as well as a non-zero Offline_Uncorrectable. Its Power_On_Hours, which really just indicates age, is nearly double that of the next-oldest drive. That’s not necessarily a problem by itself, but what really worries me is that the Spin_Up_Time is much higher than on the other drives. The Start_Stop_Count is also the highest on sdc.
Through normal wear and tear, sectors on a hard drive can go bad and become unusable. Modern hard drives normally ship with a number of hidden, unused sectors reserved for this situation. When the drive’s controller detects a failure to write a sector, it remaps that sector’s address to one of the unused reserved sectors instead. When a sector goes bad but there are no free reserved sectors left for remapping, you get an uncorrectable sector. In other words, when Offline_Uncorrectable goes above zero, the drive has accumulated more bad sectors than the manufacturer originally reserved spares for, and those sectors can no longer be used.
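To pull out just the sector-remapping counters (attribute 5, Reallocated_Sector_Ct, plus the pending and uncorrectable counts in 197 and 198), something like the following works. The sample rows are made up for illustration; normally you’d pipe in the real `sudo smartctl -A /dev/sdc` output:

```shell
# Print the raw counts for the sector-remapping attributes (IDs 5, 197, 198).
# Sample rows are hypothetical; normally pipe in: sudo smartctl -A /dev/sdc
awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2, "=", $NF }' <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       2
EOF
```

A rising Reallocated_Sector_Ct is the early warning; per the explanation above, a non-zero 198 means the spares have already run out.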
Currently, none of the attributes have crossed a threshold indicating potential failure, and, interestingly, attributes 1 and 198 don’t appear to be reflected in the VALUE field. I also run regular, nightly S.M.A.R.T. tests on all drives, and no issues have been reported from the tests. There is also nothing in the system logs about any recent errors or warnings; the only symptom is that the server is, overall, a little sluggish. Here’s a snippet of the self-test log as retrieved from the hard drive itself:
$ sudo smartctl -l selftest /dev/sdc
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
...
# 8 Short offline Completed without error 00% 42414 -
# 9 Extended offline Completed without error 00% 42395 -
...
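For reference, the same tests smartd schedules can also be kicked off by hand with standard smartctl invocations (the short test takes a few minutes, the long test several hours):

```shell
sudo smartctl -t short /dev/sdc     # quick electrical/mechanical self-test
sudo smartctl -t long  /dev/sdc     # extended self-test (full surface scan)
sudo smartctl -l selftest /dev/sdc  # review the results once it finishes
```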
If there were any failing attributes or failed tests, I would normally get an email about it, and the event would be logged by syslog. I have the smartd daemon installed and configured to run regular tests with something similar to this:
$ cat /etc/smartd.conf
/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03) -m root
/dev/sdb -a -o on -S on -s (S/../.././02|L/../../6/03) -m root
/dev/sdc -a -o on -S on -s (S/../.././02|L/../../6/03) -m root
/dev/sdd -a -o on -S on -s (S/../.././02|L/../../6/03) -m root
This says to enable S.M.A.R.T. monitoring on hard drives /dev/sd[a-d] (-a monitors all attributes, -o on enables automatic offline data collection, and -S on enables attribute autosave), run a short test every night at 2 AM, and run an extended (long) test every Saturday at 3 AM (smartd’s schedule regex counts days of the week from 1 for Monday to 7 for Sunday, so 6 is Saturday). If there are any warnings or errors, smartd reports them via syslog and sends an email to root. I normally have most daemons send critical mail like this to root and make root an alias that redirects the email to all of the primary system administrators.
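For completeness, the root alias mentioned above typically lives in /etc/aliases (the addresses here are placeholders, not our real ones):

```
# /etc/aliases (excerpt) -- addresses are placeholders
root: admin1@example.com, admin2@example.com
```

After editing the file, run newaliases to rebuild the alias database so the redirect takes effect.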
Next up, I plan to run a full-fledged benchmark on the drives and replace any that don’t quite meet expectations.
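As a starting point for that benchmark, a read-only sequential pass with fio is one option I’m considering. The job parameters below are just a sketch, not a tuned benchmark; --readonly keeps fio from ever writing to the array member:

```shell
# Read-only sequential throughput sketch; safe on a live RAID member
# because of --readonly. Parameters are a starting point, not a tuned job.
sudo fio --name=seqread --filename=/dev/sdc --readonly --rw=read \
         --bs=1M --direct=1 --ioengine=libaio --runtime=30 --time_based
```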