[ale] ATL Colocation and file server suggestions
Ken Ratliff
forsaken at targaryen.us
Tue Jan 20 12:55:35 EST 2009
On Jan 20, 2009, at 8:58 AM, Jeff Lightner wrote:
> Seeing two mirrors drop out in RAID10 shouldn't be an issue so long as
> you have two spares because it should be able to recreate both mirrors
> as it is both mirrored and parity striped. (Losing two in RAID1 would
> of course be an issue [assuming you didn't have a 3rd mirror]). What
> hardware array was being used that dropping two mirrors caused a
> problem for RAID10?
Maybe I didn't say that clearly enough - I've seen a RAID 10 lose both
drives of a single mirror (in other words, the stripe set was broken,
total data loss, chaos, screaming, and finally a restore from backup!)
multiple times. And yes, I felt like God hated me, along with a few
other supernatural entities.
> I'll have to say that even if having two mirrors drop in RAID10 were a
> problem it must be a sign that God hates the person that had it if the
> two drives that failed (assuming it was only two) happened to be the
> ones that were mirrors of each other. We've been using RAID10 for the
> last 4 years since our last major RAID5 fiasco and have not seen such
> a failure.
Hehe, I've noticed that RAID level preference seems to be a lot like OS
preference. You are shaped by your experiences and react based on
them. I don't trust RAID 10 (though I'll use it when I can't use
RAID5, i.e. for a database server) and I've never had that much of a
problem with RAID5. The closest call was a 6TB array in which one drive
died and another was showing DRIVE-ERROR, but it made it through the
first rebuild to replace the dead drive, and then, like a champ, through
the second to replace the one showing the error.
If I had my way, I'd yank the drives out of the servers entirely, just
build a nice sexy SAN and export disks to the servers via iSCSI. But
that's.... expensive.
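(The SAN itself is the expensive part; the client side is pretty painless
these days. With open-iscsi it's roughly along these lines - the portal IP
and the IQN below are obviously just placeholders:

    # ask the SAN what targets it exports
    iscsiadm -m discovery -t sendtargets -p 192.168.1.50
    # log in to one of the discovered targets
    iscsiadm -m node -T iqn.2009-01.com.example:disk0 -p 192.168.1.50 --login

After the login, the exported LUN just shows up as another /dev/sd* device.)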
> -----Original Message-----
> From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of
> Pat Regan
> Sent: Tuesday, January 20, 2009 2:56 AM
> To: ale at ale.org
> Subject: Re: [ale] ATL Colocation and file server suggestions
>
> Ken Ratliff wrote:
>>> However, software RAID 1, 10 is excellent and performance compatible
>>> with a hardware card.
>>
>> I still prefer to do RAID 10 on hardware. I've found software RAID to
>> be pretty finicky, drives dropping out of the array for no good
>> reason, and you don't notice it until the slight performance hit for
>> the rebuild makes you go 'hm.'
>
> If drives are randomly dropping out of Linux software RAID, something
> is wrong. I can only recall having two machines that had random drives
> dropping out of MD devices. Both showed errors when running memtest86.
> One was a bad CPU, and I believe the other was bad RAM (IIRC).
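(For what it's worth, the "didn't notice a drive dropped" part is easy to
fix - mdadm can watch the arrays and mail you when something degrades.
Something like this, with the address and delay adjusted to taste:

    # watch all arrays, daemonize, and send mail when a drive fails or an array degrades
    mdadm --monitor --scan --daemonise --mail=root@localhost --delay=1800

Or just get in the habit of glancing at /proc/mdstat.)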
>
>> I actually don't like RAID 10 at all. I'd rather toss the 4 drives
>> into a RAID5 and get more space. Sure, a RAID 10 will allow you to
>> survive 2 dead drives, as long as it's the right 2 drives. I've seen
>> both drives of one mirror fail in a RAID 10 a few times, and that has
>> pretty much the same result as 2 dead drives in a RAID 5.
>
> Redundancy isn't the first reason to choose RAID 10 over RAID 5. If it
> were, everyone would just choose RAID 6, since that would let you lose
> any two drives.
>
> RAID 5 has a terrible write performance problem. Doing a random
> uncached write to a RAID 5 involves a read and a write to one stripe
> on every drive.
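(If anyone wants to see that penalty on their own hardware, a quick fio run
of small random writes against a scratch file on the array makes it pretty
stark - the path and parameters below are just an example, not a tuned
benchmark:

    # small random writes, bypassing the page cache, against a scratch file on the RAID 5 volume
    fio --name=randwrite --filename=/mnt/raid5/fio.test --rw=randwrite \
        --bs=4k --size=1g --direct=1

Run the same thing on a RAID 10 or RAID 1 volume and compare the IOPS.)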
>
>> Software RAID1 I have no problem with though. It's quick, easy, the
>> performance hit is negligible unless you have something that's really
>> pounding the disk i/o, and as someone else mentioned, being able to
>> split the mirror and use them as fully functional drives does
>> occasionally have its uses.
>
> Hardware RAID 1 shouldn't have a write performance penalty. Software
> RAID 1 (or 10) requires double the bus bandwidth for writes. I can't
> speak for all implementations, but Linux MD RAID 1 spreads reads out
> over all drives in the RAID set.
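(That's easy enough to eyeball: kick off a couple of parallel reads from
the md device and watch the member disks - a single sequential stream tends
to stick to one disk, so parallel readers show the spread much better:

    # in one terminal, a couple of concurrent readers against the mirror
    dd if=/dev/md0 of=/dev/null bs=1M skip=0 count=2048 &
    dd if=/dev/md0 of=/dev/null bs=1M skip=4096 count=2048 &
    # in another, watch per-disk read traffic
    iostat -x 1

Both members should be doing read I/O rather than one doing all the work.)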
>
>> Yeah, we found out the hard way that software RAID5 is a very, very
>> bad idea, especially if you're running it on a high-activity web
>> server. After enough times of having a drive in software RAID5 die
>> before you're done rebuilding from the previous drive failure, you
>> kind of learn that maybe this isn't such a good idea (or you tell the
>> night crew to turn apache off so that the array can rebuild in peace,
>> but that's not something properly spoken of in public!). The
>> performance hit for software RAID5 just isn't worth it.
>
> Your slow rebuilds likely had nothing to do with the performance of
> software RAID 5. I would imagine you needed to tweak
> '/proc/sys/dev/raid/speed_limit_min' up from the default of 1MB/sec.
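(Concretely, something like this - the 50000 is just a for-instance, and
the units are KB/s:

    # check the current rebuild floor, then raise it so rebuilds don't crawl under load
    cat /proc/sys/dev/raid/speed_limit_min
    echo 50000 > /proc/sys/dev/raid/speed_limit_min

There's a matching speed_limit_max in the same directory if you want to cap
it as well.)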
>
> There is very little reason for a hardware controller to beat Linux MD
> at RAID 5, especially on modern hardware. It only requires one more
> drive's worth of bus bandwidth than a hardware controller would need.
> Processors have always been able to compute parity faster than current
> hardware cards. dmesg on my laptop tells me that I can compute RAID 6
> parity at 2870 megabytes per second.
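(The kernel prints those numbers when the RAID 6 code initializes, so on
any box with the module loaded something like

    dmesg | grep -i raid6

will show the per-algorithm throughput it measured and which one it picked.)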
>
> I am not saying software RAID is for everyone. It has other advantages
> besides cost, but if you have a large budget those advantages aren't as
> useful :).
>
>> Now with that being said, no form of RAID is truly safe. I had a
>> server today drop both drives in one of its RAID1s. They were older
>> 36 gig SCSIs, so it was about time anyway, but losing both of them
>> meant I got to spend time flattening the box and reinstalling it. This
>> is also why I try to avoid using drives from the same manufacturer and
>> batch when building arrays. If you don't, you'd better pray to God
>> that the rebuild completes before the next one dies. It's said that
>> RAID is no substitute for a proper backup, and that's true. (And my
>> life being somewhat of an essay in irony, the box that dropped both
>> drives in the mirror today was being used as a backup server.)
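(Checking what's actually in an array is a one-liner per disk with
smartctl - model, serial number and firmware revision are all in the
identity block, which makes it easy to spot two drives from the same batch:

    # print the drive's identity info (model, serial, firmware)
    smartctl -i /dev/sda

Obviously repeat for each member disk.)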
>
> This paragraph reminded me of three(!) things. RAID is not a backup.
> You know this, but lots of people don't.
>
> Second, have you ever had the annoyance of replacing a failed drive
> with another of the same make/model, only to find the replacement is
> actually smaller than the failed drive? Whenever I use software RAID I
> sacrifice a few percent off the end of the drive just to keep this from
> happening.
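(The low-tech way to do that is just to end the RAID partition a little
short of the end of the disk - the 98% below is arbitrary, and /dev/sdb is
a stand-in:

    # label the disk and leave the last couple of percent unused
    parted /dev/sdb mklabel gpt
    parted /dev/sdb mkpart primary 1MiB 98%

Then the array gets built on sdb1 rather than the raw sdb, and a slightly
smaller replacement disk still fits.)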
>
> Which reminds me of another aspect of software RAID that has been
> helpful for me in the past. Software RAID is managed at the partition
> level. On smaller boxes I've often set up a relatively small RAID 1 or
> 10 at the front of the drives and RAID 5 on the rest.
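(In mdadm terms, assuming each of four disks carries a small first
partition and a large second one - device names and counts here are just
placeholders - that looks roughly like:

    # small RAID 10 across the front partitions for the OS
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1
    # RAID 5 across the big second partitions for bulk data
    mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sd[abcd]2

Same four disks, two arrays at different RAID levels.)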
>
> My home media server is set up like this, and so is my personal Xen
> host that I have in a colo. I'm very budget conscious when I spend my
> own dollars. The Xen server has an uptime of 300 days right now with
> no RAID failures :).
>
>> (Also, I'm not preaching at you, Jim, I'm sure you know all this
>> crap, I'm just making conversation!)
>
> I like conversation, and I could just as easily be on your side of this
> one. :)
>
>>> RAID 1 recovery is substantially quicker and drives
>>> are low cost enough to not need the N-1 space of RAID 5.
>>
>> All depends on your storage needs. We have customers with 4 TB
>> arrays, 6 TB arrays, and one with an 8.1 TB array (which presents some
>> interesting challenges when you need to fsck the volume... why we
>> used reiser for the filesystem on that array, I have no idea). Those
>> are a little hard to do in RAID 1 :)
>
> I'm up near 4 TB at home. That isn't even a big number anymore! :)
>
> I had 4 TB back in 2001. That was mighty expensive, though, and it sure
> wasn't a single volume. I only mention this so you don't just think I'm
> some punk with a RAID 5 on his TV blowing smoke :)
>
> Pat