[ale] harddrive errors
John Wells
jb at sourceillustrated.com
Tue Jun 7 11:00:45 EDT 2005
Greg said:
> is room for error since not all drives use SMART in the same way. However
I think this is the key. I came away with the same conclusion after
reading the following article:
http://www.linuxjournal.com/article.php?sid=6983
I posted my dilemma to the smartmontools mailing list; the two responses I
received are inline below. It seems you can't always rely on the raw
output alone. I'm returning the first drive anyway because of the one ATA
error I received, but I may keep the second and just keep a close eye on it.
Thanks,
John
-----------------------------------------------------
On Tue, Jun 07, 2005 at 01:19:50AM -0400, John Wells wrote:
> pulled it. One thing I noted is that it had quite large numbers in the
> Raw_Read_Error_Rate and Seek_Error_Rate fields, and I had been told this
> was indicative of a major problem.
Some drives show huge raw values for Raw_Read_Error_Rate and
Hardware_ECC_Recovered; this is normal. The raw value only becomes
meaningful once it's normalised, which the drive does internally using a
formula known only to the drive. This is my understanding anyway. I'm not
sure about Seek_Error_Rate.
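
To illustrate the raw vs. normalised point, here is a minimal Python sketch. It assumes the usual SMART convention that an attribute counts as failed when its normalised VALUE is at or below THRESH; the function name has_failed is made up for this example, and the numbers are copied from the report quoted further down in this thread.

```python
def has_failed(value, thresh):
    """True when the normalised VALUE is at or below THRESH.

    Assumption: this is the usual SMART convention; the raw value
    plays no part in the comparison.
    """
    return value <= thresh


# (VALUE, THRESH) pairs taken from the quoted smartctl report.
rows = {
    "Raw_Read_Error_Rate": (73, 6),   # raw value 84924431
    "Seek_Error_Rate": (100, 30),     # raw value 476650
}

for name, (value, thresh) in rows.items():
    print(name, "FAILED" if has_failed(value, thresh) else "ok")
```

Both attributes come out as ok despite their large raw values, which matches the explanation above.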
Your most important values are Reallocated_Sector_Ct,
Current_Pending_Sector, Offline_Uncorrectable. They should read 0. Some
number for Reallocated_Sector_Ct is acceptable, let's say 1 per month of
age, though there'll always be argument about this. One supplier told me
drives will be replaced if they show any reallocated sector during the
warranty period.
UDMA_CRC_Error_Count indicates problems with the cabling and/or IDE /
drive electronics.
Power_On_Hours, Power_Cycle_Count are only informative, so is
Temperature_Celsius though if it shows 80°C you know you have a big
problem.
Volker
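
A rough Python sketch of the check Volker describes, for anyone who wants to script it. It assumes rows shaped like the `smartctl -A` attribute table (ten whitespace-separated columns, raw value last, plain integer raw values); the function name nonzero_critical and the sample text are illustrative only and are not part of smartmontools.

```python
# Attributes Volker says should read 0.
CRITICAL = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def nonzero_critical(report):
    """Return {attribute_name: raw_value} for critical attributes with a non-zero raw value."""
    flagged = {}
    for line in report.splitlines():
        # Drop email quote markers and leading spaces, then split into columns.
        fields = line.lstrip("> ").split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[-1]
        if name in CRITICAL and raw.isdigit() and int(raw) != 0:
            flagged[name] = int(raw)
    return flagged


# Rows copied from the report quoted in this thread.
sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

print(nonzero_critical(sample))  # {}
```

An empty result means all three counters read 0. Whether a small non-zero Reallocated_Sector_Ct is acceptable is a judgment call, as noted above, so the sketch only reports the values and does not decide for you.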
-----------------------------------------------------
On Tue, Jun 07, 2005 at 01:19:50AM -0400, John Wells wrote:
[snip]
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   073   072   006    Pre-fail  Always       -       84924431
Both my hard drive and my sister's (both Seagates) increment this
with every sector read.
>   3 Spin_Up_Time            0x0003   098   098   000    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       0
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       476650
Our drives increment this with every seek.
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
5 power cycles in 4 hours?
> 194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31
31 degrees Celsius
> 195 Hardware_ECC_Recovered  0x001a   073   072   000    Old_age   Always       -       84924431
My sister's drive (a ~3 year old Seagate 40GB 7200RPM drive) increments
this with every sector read.
Looks about right for a Windows install - fdisk, format and install
would probably read or verify about 350,000,000 sectors on a 120GB drive
(note that the values wrap around at 268,435,456).
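
The wrap-around arithmetic can be checked with a few lines of Python. This is only a sketch: it takes the wrap point of 268,435,456 (2**28) from the note above as given, and the number of wraps is an assumption, since the counter itself does not record how many times it has rolled over.

```python
WRAP = 2 ** 28           # 268,435,456, the wrap point mentioned above
RAW_VALUE = 84_924_431   # RAW_VALUE reported for attributes 1 and 195


def possible_totals(raw, max_wraps):
    """List the true counts consistent with `raw` after 0..max_wraps wrap-arounds."""
    return [raw + k * WRAP for k in range(max_wraps + 1)]


for wraps, total in enumerate(possible_totals(RAW_VALUE, 2)):
    print(wraps, f"{total:,}")
# 0 84,924,431
# 1 353,359,887
# 2 621,795,343
```

With exactly one wrap the total is 353,359,887 sectors, which lines up with the estimate of about 350,000,000.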
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
Ben
-----------------------------------------------------