[ale] ATL Colocation and file server suggestions
Jeff Lightner
jlightner at water.com
Tue Jan 20 13:26:23 EST 2009
With RAID5 you're still at risk of losing 2 drives, and moreover it is
ANY 2 drives. With RAID10 you at least have to lose 2 specific drives
at the same time. You save nothing but space with a RAID5 configuration,
and your risk increases.
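To put rough numbers on that point, here is a quick sketch of the
combinatorics (the helper names are mine, and it assumes mirrors are
adjacent drive pairs): in RAID5 every possible two-drive failure is
fatal, while in RAID10 only a failure of both halves of one mirror is.

```python
from itertools import combinations

def fatal_pairs_raid5(n):
    # RAID5 survives only one drive failure, so every
    # two-drive combination out of n drives is fatal.
    return len(list(combinations(range(n), 2)))

def fatal_pairs_raid10(n):
    # RAID10 on n drives = n//2 mirror pairs; a two-drive failure
    # is fatal only when both failed drives are the same mirror.
    mirrors = [(2 * i, 2 * i + 1) for i in range(n // 2)]
    return len([c for c in combinations(range(n), 2) if c in mirrors])

for n in (4, 8):
    print(n, fatal_pairs_raid5(n), fatal_pairs_raid10(n))
# With 4 drives: 6 fatal pairs for RAID5 vs 2 for RAID10;
# with 8 drives: 28 vs 4.
```

So as the array grows, the fraction of two-drive failures that kill a
RAID10 shrinks, while for RAID5 it stays at 100%.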
Of course, for real paranoia one could use RAID15. Even if your array
doesn't do it natively, you might be able to achieve it by using
hardware RAID5 to present two LUNs to the system, then using software
RAID1 to mirror the two LUNs. (Or maybe that's RAID 51 - it's too
early in the week for me to think about it...)
http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel15-c.html
Of course, that page says RAID51 is very inefficient.
________________________________
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of Ken
Ratliff
Sent: Tuesday, January 20, 2009 12:56 PM
To: ale at ale.org
Subject: Re: [ale] ATL Colocation and file server suggestions
On Jan 20, 2009, at 8:58 AM, Jeff Lightner wrote:
Seeing two mirrors drop out in RAID10 shouldn't be an issue so long as
you have two spares, because it should be able to recreate both
mirrors as it is both mirrored and striped. (Losing two in RAID1 would
of course be an issue [assuming you didn't have a 3rd mirror].) What
hardware array was being used that dropping two mirrors caused a
problem for RAID10?
Maybe I didn't say that clearly enough - I've seen a RAID 10 lose both
drives of a single mirror (in other words, the stripe set was broken,
total data loss, chaos, screaming, and finally a restore from backup!)
multiple times. And yes, I felt like God hated me, along with a few
other supernatural entities.
I'll have to say that even if having two mirrors drop in RAID10 were a
problem it must be a sign that God hates the person that had it if the
two drives that failed (assuming it was only two) happened to be the
ones that were mirrors of each other. We've been using RAID10 for the
last 4 years since our last major RAID5 fiasco and have not seen such a
failure.
Hehe, I've noticed that RAID level preference seems to be a lot like OS
preference. You are shaped by your experiences and react based on them.
I don't trust RAID 10 (though I'll use it when I can't use RAID5, i.e.
for a database server) and I've never had that much of a problem with
RAID5.
The closest thing was a 6TB array in which a drive died and another one
was showing DRIVE-ERROR, but it made it through the first rebuild to
replace the dead drive, and then through the second to replace the one
showing error like a champ.
If I had my way, I'd yank the drives out of the servers entirely, just
build a nice sexy SAN and export disks to the servers via iSCSI. But
that's.... expensive.
-----Original Message-----
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf
Of Pat
Regan
Sent: Tuesday, January 20, 2009 2:56 AM
To: ale at ale.org
Subject: Re: [ale] ATL Colocation and file server suggestions
Ken Ratliff wrote:
However, software RAID 1, 10 is excellent and performance comparable
with a hardware card.

I still prefer to do RAID 10 on hardware. I've found software RAID to
be pretty finicky: drives dropping out of the array for no good
reason, and you don't notice it until the slight performance hit for
the rebuild makes you go 'hm.'
If drives are randomly dropping out of Linux software RAID, something
is wrong. I can only recall having two machines that had random drives
dropping out of MD devices. Both showed errors when running memtest86.
One was a bad CPU; I believe the other was bad RAM (IIRC).
I actually don't like RAID 10 at all. I'd rather toss the 4 drives
into a RAID5 and get more space. Sure, a RAID 10 will allow you to
survive 2 dead drives, as long as it's the right 2 drives. I've seen
both drives of one mirror fail in a RAID 10 a few times, and that has
pretty much the same result as 2 dead drives in a RAID 5.
Redundancy isn't the first reason to choose RAID 10 over RAID 5. If
it were, everyone would just choose RAID 6, since that would let you
lose any two drives.
RAID 5 has a terrible write performance problem. Doing a random
uncached write to a RAID 5 involves reading the old data and old
parity and then writing both back - the classic read-modify-write
penalty - and a partial-stripe write can end up touching every drive.
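The parity math behind that penalty is plain XOR. Here's a toy sketch
of the small-write path on one stripe of a four-drive RAID5 (not MD's
actual code; the block contents are made-up examples):

```python
def xor_blocks(a, b):
    # XOR two equal-length blocks byte by byte.
    return bytes(x ^ y for x, y in zip(a, b))

# One stripe on a 4-drive RAID5: three data blocks plus one parity block.
data = [b'AAAA', b'BBBB', b'CCCC']
parity = xor_blocks(xor_blocks(data[0], data[1]), data[2])

# Small-write path: to update a single block the array must READ the
# old data and old parity, then WRITE the new data and new parity --
# four I/Os for one logical write.
new_block = b'DDDD'
new_parity = xor_blocks(xor_blocks(parity, data[0]), new_block)
data[0] = new_block

# The updated parity can still reconstruct the lost block after a
# drive failure:
rebuilt = xor_blocks(xor_blocks(new_parity, data[1]), data[2])
assert rebuilt == new_block
```

A full-stripe write avoids the reads entirely, which is why sequential
writes on RAID5 fare much better than small random ones.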
Software RAID1 I have no problem with, though. It's quick, easy, the
performance hit is negligible unless you have something that's really
pounding the disk I/O, and as someone else mentioned, being able to
split the mirror and use them as fully functional drives does
occasionally have its uses.
Hardware RAID 1 shouldn't have a write performance penalty. Software
RAID 1 (or 10) requires double the bus bandwidth for writes. I can't
speak for all implementations, but Linux MD RAID 1 spreads reads out
over all drives in the RAID set.
Yeah, we found out the hard way that software RAID5 is a very, very
bad idea, especially if you're running it on a high-activity web
server. After enough times of having a drive in software RAID5 die
before you're done rebuilding from the previous drive failure, you
kind of learn that maybe this isn't such a good idea (or you tell the
night crew to turn apache off so that the array can rebuild in peace,
but that's not something properly spoken of in public!). The
performance hit for software RAID5 just isn't worth it.
Your slow rebuilds likely had nothing to do with the performance of
software RAID 5. I would imagine you needed to tweak
'/proc/sys/dev/raid/speed_limit_min' up from the default of 1MB/sec.
There is very little reason for a hardware controller to beat Linux
MD at RAID 5, especially on modern hardware. It only requires one more
drive's worth of bus bandwidth than a hardware controller would
require. Processors have always been able to compute parity faster
than current hardware cards. dmesg on my laptop tells me that I can
compute RAID 6 parity at 2870 megabytes per second.
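As a rough illustration of how cheap parity is for a general-purpose
CPU, here is a toy benchmark (pure Python on big integers, nothing
like the kernel's hand-optimized SSE routines; the sizes and names are
mine) that XORs 64 MB of data into a parity block:

```python
import time

def parity_throughput(block_size=1 << 20, blocks=64):
    # XOR `blocks` one-megabyte data blocks into a single parity
    # block, timing the loop. Python's big-int XOR runs in C, so even
    # this toy version moves data at a respectable rate.
    chunks = [bytes([i & 0xFF]) * block_size for i in range(blocks)]
    start = time.perf_counter()
    acc = 0
    for c in chunks:
        acc ^= int.from_bytes(c, 'little')
    elapsed = time.perf_counter() - start
    mb = block_size * blocks / 1e6
    return acc.to_bytes(block_size, 'little'), mb / elapsed

parity, rate = parity_throughput()
print(f'XORed 64 MB into parity at ~{rate:.0f} MB/s')
```

The exact rate depends entirely on the machine; the point is only that
parity is a single XOR pass, well within what any modern CPU can do
faster than the disks can feed it.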
I am not saying software RAID is for everyone. It has other
advantages besides cost, but if you have a large budget those
advantages aren't as useful :).
Now with that being said, no form of RAID is truly safe. I had a
server today drop both drives in one of its RAID1's. They were older
36 gig SCSI's, so it was about time anyway, but losing both of them
meant I got to spend time flattening the box and reinstalling it. This
is also why I try to avoid using drives from the same manufacturer and
batch when building arrays. If you don't, you'd better pray to god
that the rebuild completes before the next one dies. It's said that
RAID is no substitute for a proper backup, and that's true. (And my
life being somewhat of an essay in irony, the box that dropped both
drives in the mirror today was being used as a backup server.)
This paragraph reminded me of three(!) things. RAID is not a backup.
You know this, but lots of people don't.

Second, have you ever had the annoyance of replacing a failed drive
with another of the same make/model and the replacement drive is in
fact smaller than the failed drive? Whenever I use software RAID I
sacrifice a few percent off the end of the drive just to keep this
from happening.
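That trick amounts to simple arithmetic at partitioning time. A sketch
(the helper name, slack percentage, and drive size are hypothetical
examples, not anything the list post specifies):

```python
def usable_sectors(total_sectors, slack_percent=2):
    # Leave a little headroom at the end of the disk so a nominally
    # equal-sized replacement drive that turns out to be a few
    # thousand sectors smaller can still hold the RAID partition.
    keep = total_sectors * (100 - slack_percent) // 100
    # Round down to a 1 MiB (2048 x 512-byte sector) boundary for
    # clean partition alignment.
    return keep - keep % 2048

# A nominal "500 GB" drive: 976773168 512-byte sectors.
print(usable_sectors(976773168))  # -> 957237248
```

The partition (and hence the MD component device) is then created at
that reduced size instead of spanning the whole disk.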
Which reminds me of another aspect of software RAID that has been
helpful for me in the past: software RAID is managed at the partition
level. On smaller boxes I've often set up a relatively small RAID 1 or
10 at the front of the drives and RAID 5 on the rest.
My home media server is set up like this, and so is my personal Xen
host that I have in a colo. I'm very budget conscious when I spend my
own dollars. The Xen server has an uptime of 300 days right now with
no RAID failures :).
(Also, I'm not preaching at you, Jim, I'm sure you know all this
crap, I'm just making conversation!)

I like conversation, and I could just as easily be on your side of
this one. :)
RAID 1 recovery is substantially quicker, and drives are low-cost
enough to not need the N-1 space of RAID 5.

All depends on your storage needs. We have customers with 4 TB arrays,
6 TB arrays, and one with an 8.1 TB array (which presents some
interesting challenges when you need to fsck the volume.... why we
used reiser for the filesystem on that array, I have no idea). Those
are a little hard to do in RAID 1 :)
I'm up near 4 TB at home. That isn't even a big number anymore! :)

I had 4 TB back in 2001. That was mighty expensive, though, and it
sure wasn't a single volume. I only mention this so you don't just
think I'm some punk with a RAID 5 on his TV blowing smoke :)
Pat
----------------------------------
CONFIDENTIALITY NOTICE: This e-mail may contain privileged or
confidential information and is for the sole use of the intended
recipient(s). If you are not the intended recipient, any disclosure,
copying, distribution, or use of the contents of this information is
prohibited and may be unlawful. If you have received this electronic
transmission in error, please reply immediately to the sender that you
have received the message in error, and delete it. Thank you.
----------------------------------
_______________________________________________
Ale mailing list
Ale at ale.org
http://mail.ale.org/mailman/listinfo/ale