Subject: Re: twe status queries?
To: None <tech-kern@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 12/02/2005 16:53:56
>>> I don't know if I would run a Raid5 with 12 drives (Raid5 tolerate
>>> only one drive failure).
>> That's why we have a cold spare.
> That's not a good idea.  A common drive failure mode is to not spin
> up when power is cycled --

Good point.  Thanks for mentioning it.  We may decide to ignore the
risk (I'll have to hash it out with my boss), but it's a failure mode I
don't think we considered, and I think we should remedy that.

> RAIDframe has code for P+Q parity.  I don't know if anyone has ever
> tested it.  If it works, you can use it to build large parity RAID
> sets safely -- but the performance of two RAID 5 sets, striped, may
> be better.

Well, *more* safely - it still won't withstand three simultaneous
failures.

This is just ECC, done laterally across the bits written, and with the
advantage that you have dropouts rather than errors: bits that are
missing rather than wrong.  (That's why simple parity is error
*correcting* in this case, rather than merely error detecting.)
There's no reason why one couldn't use more sophisticated
error-correcting codes, though of course you'd need a lot of disks to
do much with them, and then you have to either restrict yourself to
codes where the protected data appears in the clear in some bits
(systematic codes) or put up with the whole-stripe overhead for small
reads.
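
To make the dropout-versus-error distinction concrete, here is a minimal
sketch in Python.  It is not RAIDframe code; the function names and the
tiny four-byte "disks" are made up for illustration, and there is no
parity rotation.  It shows single XOR parity rebuilding a block whose
position is known to be lost, and the same parity only detecting, not
locating, a block that is silently wrong.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks plus one parity block appended at the end."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, missing):
    """Recover the block at index `missing`, which is known to be lost."""
    survivors = [b for i, b in enumerate(stripe) if i != missing]
    return xor_blocks(survivors)

def parity_ok(stripe):
    """True if the XOR of all blocks (data and parity) is all zeros."""
    return not any(xor_blocks(stripe))

data = [b"disk", b"zero", b"one!"]
stripe = make_stripe(data)

# Dropout: we know block 1 is gone, so the survivors determine its contents.
recovered = rebuild(stripe, 1)

# Error: block 0 is silently wrong; parity notices, but cannot say which
# block is the bad one, so it cannot correct anything.
damaged = list(stripe)
damaged[0] = b"dusk"
detected = not parity_ok(damaged)
```

The rebuild works for any single known-missing block, including the
parity block itself, because XORing all the survivors cancels
everything except the missing one.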

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B