[ale] Which large capacity drives are you having the best luck with?

Greg Freemyer greg.freemyer at gmail.com
Thu Jan 6 10:30:40 EST 2011


On Thu, Jan 6, 2011 at 6:23 AM, Pat Regan <thehead at patshead.com> wrote:
> On Wed, 5 Jan 2011 23:31:30 -0500
> Greg Freemyer <greg.freemyer at gmail.com> wrote:
>
>> Pat,
>>
>> A disk drive physical sector is the equivalent of a network packet.
>>
>> It has a header, payload, footer.
>>
>> And it has exactly one CRC / ECC value as I understand it.
>>
>> The new "Advanced Format" drives from WD have 4KB physical sectors.
>> The benefit being less waste to overhead.  ie. Only one header/footer
>> per 4KB instead of every 512 bytes.  The ECC is bigger, but not 8x
>> bigger.
>
> I would be at least slightly surprised if the firmware performed a
> single ECC operation on the entire 512 or 4096 byte sector.
>
> I'd be more likely to expect the ECC to be computed against smaller
> pieces of the sector.
>
> I'm not very well versed in ECC algorithms so I could be very wrong and
> we don't get good documentation on the data structures that are
> actually being stored on a hard drive, though.

Read the WD paper on Advanced Format drives.  IIRC, they were pretty
explicit that there is a single ECC for the entire 4KB physical
sector.  In fact, that was a driving force for the change.
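
A rough sketch of the overhead argument, in Python.  The byte counts
below are assumed for illustration only and are not taken from the WD
paper; the point is just that one header/footer and one larger ECC per
4KB wastes less space than eight sets of them.

```python
def format_efficiency(payload_bytes, overhead_bytes, ecc_bytes):
    """Fraction of the on-disk sector that is user data."""
    return payload_bytes / (payload_bytes + overhead_bytes + ecc_bytes)

# ASSUMED byte counts, for illustration only (not from the WD paper):
# per-sector header/footer overhead of 15 bytes in both formats,
# 50 bytes of ECC per 512-byte sector, 100 bytes of ECC per 4KB sector.
legacy = format_efficiency(512, 15, 50)
advanced = format_efficiency(4096, 15, 100)

print(f"512-byte sectors: {legacy:.1%} payload")
print(f"4KB sectors:      {advanced:.1%} payload")
```

With these assumed numbers the ECC doubles rather than growing 8x, so
the 4KB layout comes out ahead.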

>> Do you have a rough GB / min rate for the long test?
>>
>> With modern drives, I can dd if=/dev/zero of=/dev/sda bs=4k  at about
>> 6GB / min.  I can't say I've done a lot of the long tests, but that
>> seems about the same performance.
>>
>> My 250GB drive says 92 minutes to run the long test.  That's less than
>> 3GB/min, so plenty of time to do a full surface scan.
>>
>
> I'm pretty certain that the 1 TB drives I ran a long test on recently
> took a similar amount of time to run, about 1.5 hours or less.  I
> wouldn't be surprised if the type and quality of the scans vary by
> manufacturer and model.

1 TB in about 1.5 hours is roughly 11 GB/min.  Surprisingly fast, but
not outlandishly fast.  I'll stick with my belief that a long self
test is a surface scan.
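
The arithmetic behind those rates, as a small Python sketch.  It
assumes 1 TB means 1000 GB and that the 1 TB test took the full 90
minutes; both are simplifications.

```python
def gb_per_min(capacity_gb, minutes):
    """Average rate needed to read the whole drive once in the given time."""
    return capacity_gb / minutes

# 1 TB drive (taken as 1000 GB), long test of about 1.5 hours
rate_1tb = gb_per_min(1000, 90)

# 250 GB drive, long test estimated at 92 minutes
rate_250gb = gb_per_min(250, 92)

print(f"1 TB drive:   {rate_1tb:.1f} GB/min")
print(f"250 GB drive: {rate_250gb:.1f} GB/min")
```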

>> For files I control, I tend to keep two copies of important files on a
>> minimum of 2 media, often 3 or more.  But I've had 2 drives fail
>> within hours of each other!  (both young drives during the height of
>> Seagate's problems.  Fortunately, replacing the controller card on one
>> of the drives brought it back to life.)
>
> I always recommend pairing up drives from different manufacturers.

I used to not, but now I do.

>> I actually like the idea of putting 20 or 30 or more operational hours
>> on a drive before putting real data on it.  We write a single pass of
>> zeros to new drives before putting them in service, but a couple years
>> ago with Seagate, that was not enough of a burn in.  Maybe 3 or 4
>> passes would have done it.
>
> I really wish hard drives were more predictable.  I have a large
> percentage of drives die in the first few days of service, in the first
> few weeks/months of service, and also after a year.  I just have no
> trust in any hard drives :).

The more you work with hdds, the less you trust them!

Same for cables, controllers, ram, etc.

I find that taking 100 GB or so of large files with known MD5s,
copying them to another drive, and confirming the MD5s is a pretty
good hardware test.  If you have a bad component in the mix, the hash
verify will fail.  Then it's time to track down the bad component
(often not the drive, in my experience).  By chance, working with a
collection of 100GB or so of 2GB files is routine for our lab, so we
do this regularly as part of our workflow.
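
A minimal Python sketch of that copy-and-verify test.  The function
names and the `known_md5` mapping (file name to expected hex digest)
are made up for illustration; this is not the lab's actual tooling.

```python
import hashlib
import shutil
from pathlib import Path

def md5_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never sit in memory whole."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copy_and_verify(src_dir, dst_dir, known_md5):
    """Copy each listed file to dst_dir; return names whose copy mismatches."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    failures = []
    for name, expected in known_md5.items():
        target = dst / name
        shutil.copyfile(Path(src_dir) / name, target)
        # Re-read the copy from the destination and compare hashes.
        if md5_of(target) != expected:
            failures.append(name)
    return failures
```

An empty list back from `copy_and_verify` means every copy matched.
One caveat: on a small data set the read-back may be served from the
OS page cache instead of the destination drive, so a data set larger
than RAM is a better test of the hardware.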

Admittedly, some RAM failures can slip by the above, so we add in a
memtest86+ run on a regular basis.

> I was so happy when I switched out my laptop to a proper SSD.  I
> figured I wouldn't have to restore from backups as often...  I've RMAed
> this drive twice already, probably controller failures each time.
>
> Pat

Greg


