NetBSD-Users archive


Re: Help with low raid5 performance



On Mon, 10 Jan 2011 12:03:37 +0000 (GMT)
Stephen Borrill <netbsd%precedence.co.uk@localhost> wrote:

> On Mon, 15 Nov 2010, Greg Oster wrote:
> [snip]
> > Sorry that I'm chiming in a bit late at this point, after Ian
> > Clark already pointed out what is most likely the culprit...
> >
> > In your initial config:
> >
> >> START layout
> >> 64 1 1 5
> >
> > this says that the stripe width is 64 blocks...  With 2 data blocks
> > and 1 parity block in each stripe, that gives you a total of 128
> > blocks of data in a stripe.
> >
> > When you did this:
> >>  # gpt show raid0
> > ...
> >>          64  1953546015      1  GPT part - NetBSD UFS/UFS2
> >
> > You basically aligned the partition on the half-stripe which, I
> > believe, ends up in having a whole bunch of the filesystem aligned
> > on half-stripes.  E.g. every 64K write you do ends up straddling
> > two stripes, causing the read-modify-write small-write penalty for
> > every 64K write.
> >
> > If you re-align that partition to be 128 blocks from the start of
> > the RAID set, it should perform significantly better.
> 
> OK, I get this, but I'm having problems complying with the stripesize 
> heuristic in:
> 
> http://mail-index.netbsd.org/current-users/2008/08/29/msg004215.html
> 
> Namely, (($stripewidth / 2) * ($disks - 1)) <= MAXPHYS
> 
> I'm using a 4-disk RAID 5 with the partition starting at 45416448
> (which is very well aligned, on a 65536 boundary!).
> 
> What's the best stripe size to use here? 64k/3*2=42.7k, so use the
> next lowest 'nice' number, i.e. 32k?

The problem is that with a 4-disk RAID 5 set you have (effectively) 3
data disks.  You can't get a nice power-of-two number to divide evenly
by 3, so it doesn't matter too much what you pick -- performance is
going to be greatly hindered by the small-write problem (for a MAXPHYS
write, the best you can do is one full-stripe write and one
partial-stripe write).
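
To make the arithmetic concrete, here is a rough sketch that counts how
many full and partial stripes a single contiguous write touches.  It
assumes 512-byte sectors and a 64 KiB MAXPHYS; the function name
split_write is made up for illustration and is not part of RAIDframe.

```python
SECTOR_BYTES = 512          # assumption: 512-byte sectors
MAXPHYS_BYTES = 64 * 1024   # assumption: MAXPHYS is 64 KiB


def split_write(sect_per_su, data_disks, start_sector=0,
                write_bytes=MAXPHYS_BYTES):
    """Return (full, partial): stripes fully and partly covered by a write."""
    stripe = sect_per_su * data_disks          # data sectors per stripe
    end = start_sector + write_bytes // SECTOR_BYTES
    first = start_sector // stripe
    last = (end - 1) // stripe
    full = partial = 0
    for s in range(first, last + 1):
        lo = max(start_sector, s * stripe)     # part of the write that
        hi = min(end, (s + 1) * stripe)        # lands in stripe s
        if hi - lo == stripe:
            full += 1
        else:
            partial += 1                       # read-modify-write needed
    return full, partial


# 4-disk RAID 5 (3 data disks), write starting on a stripe boundary
for su in (8, 16, 32, 64):
    print(su, split_write(su, 3))
# 8 (5, 1)
# 16 (2, 1)
# 32 (1, 1)
# 64 (0, 1)
```

Every candidate leaves a partial-stripe remainder, because 128 sectors
is never a multiple of 3 * sectPerSU.  For comparison, the 3-disk case
quoted above gives split_write(64, 2) == (1, 0) when aligned, and
split_write(64, 2, start_sector=64) == (0, 2) when offset by half a
stripe.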

I'd try both:

32 1 1 5

and

16 1 1 5

and see which one gives you better performance with the filesystem you
put on it...  You might even try '8 1 1 5' or '64 1 1 5' (yes, larger
than MAXPHYS, but I think it'll work) just to see how those perform
too.
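
Before benchmarking, it may be worth confirming that the partition
start falls on a full-stripe boundary for each candidate.  A small
sketch, assuming the 45416448 offset quoted above is counted in sectors
from the start of the RAID set (the message does not say which unit or
origin it uses); is_stripe_aligned is a hypothetical helper, not an
existing tool.

```python
def is_stripe_aligned(start_sector, sect_per_su, data_disks):
    """True if start_sector is a multiple of the data sectors per stripe."""
    return start_sector % (sect_per_su * data_disks) == 0


PART_START = 45416448   # offset from the quoted message; unit is assumed

# 4-disk RAID 5 means 3 data disks
for su in (8, 16, 32, 64):
    print(su, is_stripe_aligned(PART_START, su, 3))
# 8 True
# 16 True
# 32 True
# 64 True
```

Under that assumption the offset works for all four layouts, since
45416448 = 693 * 65536 and 693 is divisible by 3.  A plain power-of-two
boundary would not guarantee this with 3 data disks.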

Later...

Greg Oster

