Subject: Re: ahc and raidframe questions
To: Greg Lehey <grog@lemis.com>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-users
Date: 06/23/1999 21:13:07
Greg Lehey writes:
> On Wednesday, 23 June 1999 at 16:50:21 -0600, Greg Oster wrote:
> >
> > I just built a RAID 5 set over 1 IBM disks and 1 HAWK (2 controllers,
> > all fast-narrow drives, CPU is a P133) and got the following from Bonnie:
> >
> >               -------Sequential Output-------- ---Sequential Input-- --Random--
> >               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> > Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
> >           500   935 31.6  1206  8.6   910  5.8  2806 93.9  3960 13.8  36.1  5.3
> >
> > which isn't exactly spectacular, but these aren't the world's speediest disks

I re-ran Bonnie with a kernel without the panic "fix", and got:
   -------Sequential Output-------- ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  500  1822 63.8  1774 10.4  1226  8.0  2803 93.7  3963 13.7  36.5  5.3

So it appears that the "fix" (essentially forcing synchronous writes) *is*
killing performance more than I thought... (but much less than having a box 
that panics!! :-/ )  

Sigh... Guess I'd better fix the problem properly ASAP (it's high on my list
of things to fix, but not high enough, apparently..  :-( )
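
For anyone curious what "forcing synchronous writes" actually costs, here is a
rough userland analogy in C: it times a batch of small writes with and without
O_SYNC.  The file name, buffer size and write count are made up for
illustration, and this is not the actual RAIDframe change; it just shows why
making every write wait for the disk before returning hurts throughput so much:

/*
 * Userland sketch: time N buffered writes vs. N O_SYNC writes.
 * Not RAIDframe code; just an illustration of what forcing every
 * write to be synchronous does to throughput.
 */
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static double
timed_writes(const char *path, int flags, int nwrites)
{
        char buf[8192];
        struct timeval t0, t1;
        int fd, i;

        memset(buf, 0xa5, sizeof(buf));
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | flags, 0644);
        if (fd < 0) {
                perror("open");
                exit(1);
        }
        gettimeofday(&t0, NULL);
        for (i = 0; i < nwrites; i++) {
                if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
                        perror("write");
                        exit(1);
                }
        }
        gettimeofday(&t1, NULL);
        close(fd);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
}

int
main(void)
{
        printf("buffered: %.2f s\n", timed_writes("/tmp/synctest", 0, 2000));
        printf("O_SYNC:   %.2f s\n", timed_writes("/tmp/synctest", O_SYNC, 2000));
        return 0;
}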

> 
> Bonnie's not the best tool for measuring storage system performance,
> since it goes via buffer cache.  You might like to try rawio
> (ftp://ftp.lemis.com/pub/rawio.tar.gz), which goes to the raw device.
> I'd be interested in seeing the results.

On the same raid set (with the panic "fix" off), using:
 rawio -a /dev/rraid0e 

           Random read  Sequential read    Random write Sequential write
ID          K/sec  /sec    K/sec  /sec     K/sec  /sec     K/sec  /sec
anon       1932.3   123   6518.6   398     539.9    33    1127.2    69  
  
 rawio -a -v 1 /dev/rraid0e
Test    ID           K/sec          /sec %User    %Sys  %Total
RR     anon         1932.9          123    0.2     6.8     7.0  7752
SR     anon         6559.6          400    0.1    22.0    22.2  16384
RW     anon          544.5           33    0.1     4.0     4.1  600
SW     anon         1129.3           69    0.0     5.3     5.4  16384

... a slightly different view of the world.. and with the panic "fix" on, 
it'll be worse.  (64K stripe width (sectors per stripe unit), RAID 5)
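
(Back-of-the-envelope on the geometry, if that works out to a stripe unit of
64 sectors of 512 bytes, i.e. 32K per stripe unit, and taking a 5-disk RAID 5
set purely as an example; the disk count below is an assumption for
illustration, not a statement about what's actually in this box:)

/*
 * Rough RAID 5 stripe arithmetic.  Sector size, sectors per stripe
 * unit and disk count are example values only.
 */
#include <stdio.h>

int
main(void)
{
        int sectors_per_su = 64;      /* sectors per stripe unit */
        int sector_size = 512;        /* bytes per sector */
        int ndisks = 5;               /* example: 4 data + 1 parity per stripe */
        int su_bytes = sectors_per_su * sector_size;

        /* anything smaller than a full stripe takes the read-modify-write path */
        printf("stripe unit: %d KB\n", su_bytes / 1024);
        printf("data per full stripe: %d KB (plus %d KB of parity)\n",
            (ndisks - 1) * su_bytes / 1024, su_bytes / 1024);
        return 0;
}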

> > If you change a single bit on a stripe, it's got to do a minimum of 2 reads
> > and 2 writes.  If your stripe width is 32 blocks (16K), it has to read 32K
> > and write 32K, just to change 1 bit.  (never mind the filesystem overhead).
> > Yes, it can be CPU intensive, especially on slower CPUs.
> 
> Is this correct?  Under these circumstances, Vinum would perform
> single-sector reads and writes, regardless of the stripe size.

No, you're quite right... RAIDframe is much brighter here (at the
implementation level) than I've given it credit for being (it goes to a
fair bit of trouble to figure out which sectors it really needs to deal with).
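
For reference, the small-write update both drivers end up doing looks
something like this.  It's only a sketch in C, with memory buffers standing in
for the disk I/O (it is not RAIDframe or Vinum code): read the old data and
the old parity, XOR the old data out of the parity and the new data in, then
write both back.  That's the "2 reads and 2 writes", but only for the sectors
actually being changed:

/*
 * RAID 5 small-write parity update in miniature.  On real hardware the
 * buffer shuffling below is disk reads and writes; here the buffers
 * just stand in.  new_parity = old_parity XOR old_data XOR new_data.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

void
raid5_small_write(uint8_t *data_sector, uint8_t *parity_sector,
    const uint8_t *new_data)
{
        uint8_t old_data[SECTOR_SIZE];
        size_t i;

        /* I/Os 1 and 2: read old data and old parity (already in the buffers here) */
        memcpy(old_data, data_sector, SECTOR_SIZE);

        /* fold the old data out of the parity and the new data in */
        for (i = 0; i < SECTOR_SIZE; i++)
                parity_sector[i] ^= old_data[i] ^ new_data[i];

        /* I/Os 3 and 4: the updated data and parity buffers get written back */
        memcpy(data_sector, new_data, SECTOR_SIZE);
}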

> My experience with Vinum has been that the parity calculations are not
> an issue.  The "read 2, write 2" I/O definitely is.

Well, on the Sun 3 I did testing on, the parity calculations seemed to matter
more than they do on the P133 ;-)

> This isn't directly applicable to RAIDframe, but you might like to
> take a look at http://www.lemis.com/vinum.html and subpages, notably
> http://www.lemis.com/vinum/Performance-issues.html.

No, it's not RAIDframe, but it *is* worth reading if you're interested in RAID 
stuff...

Later...

Greg Oster