[ale] OT: HW failure

Bob Toxen bob at verysecurelinux.com
Wed Dec 24 19:38:40 EST 2003


On Wed, Dec 24, 2003 at 11:28:24AM -0500, tfreeman at intel.digichem.net wrote:
> On Tue, 23 Dec 2003, Geoffrey wrote:
...
> Ok, I'm not sure whether I'm jealous of you, or I should be looking to 
> exchange a slightly used karma for a new one.

> I had four drives die within a year (one after another), in one machine. I 
> didn't worry about guarantees as I was dependent on that machine. Life got 
> much better (and less expensive!) when I changed the controller along with 
> the drive.

In doing failure analysis, look for the constant, i.e., the thing that
"moves with the problem".  In this case, it is your system and physical
environment.
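
As a rough illustration of "looking for the constant", here is a small
Python sketch.  The incident records and component names are made up for
the example and are not taken from anyone's actual hardware.  It simply
intersects the sets of components present at each failure; whatever
survives the intersection is what "moved with the problem".

```python
def common_factors(incidents):
    """Return the components that were present in every failure incident."""
    if not incidents:
        return set()
    common = set(incidents[0])
    for parts in incidents[1:]:
        # Keep only the components also present in this incident.
        common &= set(parts)
    return common

# Hypothetical records: four drive failures in one machine, with a
# different drive installed each time.
failures = [
    {"drive-A", "controller-1", "case-1", "psu-1"},
    {"drive-B", "controller-1", "case-1", "psu-1"},
    {"drive-C", "controller-1", "case-1", "psu-1"},
    {"drive-D", "controller-1", "case-1", "psu-1"},
]

suspects = common_factors(failures)
print(sorted(suspects))  # the drives drop out; the rest of the box remains
```

The drives differ from incident to incident, so they fall out of the
result, leaving the parts of the system that never changed as the
suspects.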

Ensure that:
  1. The disk is receiving adequate cooling.  Measure the temperature, if
     possible.  Check that air flow is not blocked by cables and such,
     that fans are operating normally, that the ambient temperature
     surrounding the box is 75F or less, and that there is sufficient
     clearance around vents and between the box and any source of heat,
     including other systems.

  2. There is not an abnormally high amount of particulate matter,
     e.g., dust.
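
If your thermometer reports Celsius, the 75F ambient limit is easy to
check with a few lines of Python.  This is only a sketch: the function
names and sample readings are hypothetical, and the limit applies to the
air around the box, not to the drive's own internal temperature.

```python
AMBIENT_LIMIT_F = 75.0  # ambient limit from the checklist above

def c_to_f(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0

def ambient_ok(celsius, limit_f=AMBIENT_LIMIT_F):
    """True if the ambient reading is at or below the limit."""
    return c_to_f(celsius) <= limit_f

# Hypothetical readings taken near the machine, in Celsius.
for reading_c in (21.0, 23.5, 27.0):
    print(reading_c, round(c_to_f(reading_c), 1), ambient_ok(reading_c))
```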

> I'm not an engineer, but I suspect that squirrelly hardware on the other 
> end of the ribbon cable can encourage a hard drive to self-destruct. I 
> have to wonder if such a mechanism doesn't explain some of the horror 
> stories concerning multiple serial drive failures? 

It is.  Enjoy.

Bob


