[ale] harddrive errors

Greg runman at speedfactory.net
Tue Jun 7 09:27:37 EDT 2005


I have been running 3 Maxtor 200 GB drives in a RAID 5 array (using mdadm
and software RAID) on Libranet (a Debian-derived OS) with loads of errors
for about 2-3 years.  I just ignored them.  Upon boot up they all produce a
huge "array" (get it  ??) of errors that scrolls by for several screens ... and
then they boot up and work fine.  I have yet to experience any detectable
data loss.  All drives passed Maxtor's diagnostic suite just fine.  After some
smartctl / smartmontools research last night, it seems that there is room
for error, since not all drives use SMART in the same way.  However, I
don't recall the exact drive models that are the problem exceptions.

So.  Your first drive might have been OK - it's just how your system works.
But as it's your system, it's ultimately your call.
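
For what it's worth, the smartctl man page says an attribute counts as
failed only when its normalized VALUE is less than or equal to THRESH; how
RAW_VALUE is to be read is not standardized and is up to the vendor.  Below
is a minimal Python sketch of that check, run against a few rows copied from
the smartctl output quoted further down (re-joined onto single lines).  The
function names are made up for illustration, and skipping rows whose
threshold is 0 is my own assumption that a zero threshold means the drive
defines no failure level for that attribute.

```python
# A few attribute rows from the quoted `smartctl -a /dev/hda` output,
# re-joined so that each attribute sits on one line.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000f   073   072   006    Pre-fail  Always       -       84924419
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       452783
195 Hardware_ECC_Recovered  0x001a   073   072   000    Old_age   Always       -       84924419
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_attributes(text):
    """Turn smartctl attribute rows into dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has 10 columns and starts with the numeric attribute ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),   # normalized value
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "type": fields[6],
            "raw": fields[9],          # vendor-specific, kept as text
        })
    return rows


def failed_attributes(rows):
    """Names of attributes whose normalized value is at or below threshold.

    Assumption: a threshold of 0 means no failure level is defined, so
    those rows are skipped.
    """
    return [r["name"] for r in rows
            if r["thresh"] > 0 and r["value"] <= r["thresh"]]


rows = parse_attributes(SAMPLE)
for r in rows:
    print(r["name"], "value", r["value"], "thresh", r["thresh"], "raw", r["raw"])
print("failed:", failed_attributes(rows))
```

The point of the sketch is only that the large raw numbers never enter the
comparison; whether they mean anything on a given drive is a separate
question.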

Good Luck,

Greg 

-----Original Message-----
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of James P. Kinney III
Sent: Tuesday, June 07, 2005 8:36 AM
To: Atlanta Linux Enthusiasts
Subject: Re: [ale] harddrive errors

Check the drive on a different IDE controller and if possible a different
motherboard. It may be an issue with the chipset.

Hardware issues. Argh.

On Tue, 2005-06-07 at 01:10 -0400, John Wells wrote:
> OK....well....I journeyed to Circuit City today and purchased another 
> identical drive (Seagate 120GB Barracuda ST3120026A-RK) while erasing 
> the other before return.
> 
> I've only had it up for 4 hours, and already smartctl is reporting 
> values under RAW_VALUE for Raw_Read_Error_Rate and Seek_Error_Rate.  
> I'm starting to question whether these numbers are even an accurate
> measure or not.
> See smartctl output below.  Thoughts?
> ----
> sudo smartctl -a /dev/hda
> smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> === START OF INFORMATION SECTION ===
> Device Model:     ST3120026A
> Serial Number:    5JT5KQVJ
> Firmware Version: 8.01
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   6
> ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
> Local Time is:    Tue Jun  7 01:00:39 2005 EDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x82) Offline data collection activity
>                                         was completed without error.
>                                         Auto Offline Data Collection: Enabled.
> Self-test execution status:      (   0) The previous self-test routine completed
>                                         without error or no self-test has ever
>                                         been run.
> Total time to complete Offline
> data collection:                 ( 430) seconds.
> Offline data collection
> capabilities:                    (0x5b) SMART execute Offline immediate.
>                                         Auto Offline data collection on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         Offline surface scan supported.
>                                         Self-test supported.
>                                         No Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine
> recommended polling time:        (   1) minutes.
> Extended self-test routine
> recommended polling time:        (  85) minutes.
> 
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   073   072   006    Pre-fail  Always       -       84924419
>   3 Spin_Up_Time            0x0003   098   098   000    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       0
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       452783
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
> 194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31
> 195 Hardware_ECC_Recovered  0x001a   073   072   000    Old_age   Always       -       84924419
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> 
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> 
-- 
James P. Kinney III          \Changing the mobile computing world/
CEO & Director of Engineering \          one Linux user         /
Local Net Solutions,LLC        \           at a time.          /
770-493-8244                    \.___________________________./
http://www.localnetsolutions.com

GPG ID: 829C6CA7 James P. Kinney III (M.S. Physics)
<jkinney at localnetsolutions.com>
Fingerprint = 3C9E 6366 54FC A3FE BA4D 0659 6190 ADC3 829C 6CA7


