Subject: Re: ahc and raidframe questions
To: Greg Oster <oster@cs.usask.ca>
From: Greg Lehey <grog@lemis.com>
List: netbsd-users
Date: 06/24/1999 10:26:54
On Wednesday, 23 June 1999 at 16:50:21 -0600, Greg Oster wrote:
> Chris Jones writes:
>>>>>>> "Greg" == Greg Oster <oster@cs.usask.ca> writes:
>>
>>>> * Doing anything with the RAID array seems pretty slow.  It took
>>>> over two hours to initialize the parity information for a RAID5
>>>> array across three 9G disks.  It's currently doing a 17GB newfs,
>>>> and it looks like that will take about an hour by the time it's
>>>> done.
>>
>> Greg> Given that it's got to read 18 GB of data and write 9GB of data,
>> Greg> it's going to take a little while to re-write parity, even with
>> Greg> fast disks...  I'm not sure if 2 hours is unreasonable or not,
>> Greg> as I've never worked with a RAID set that big with RAIDframe,
>> Greg> nor have I used UW controllers/drives (yet). (I've thought about
>> Greg> how nice it would be, but that's it :) )
>>
>> Yeah, it's a good point.  But once the filesystem is there, it writes
>> files at about 1.7 MB/s.  This seems really slow to me.  Of course, I
>> might just have over-inflated expectations, but UW SCSI is supposed to
>> be fast.
>
> Ick... 1.7MB/sec does seem quite slow...
>
> I just built a RAID 5 set over 1 IBM disks and 1 HAWK (2 controllers,
> all fast-narrow drives, CPU is a P133) and got the following from Bonnie:
>
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>           500   935 31.6  1206  8.6   910  5.8  2806 93.9  3960 13.8  36.1  5.3
>
> which isn't exactly spectacular, but these aren't the world's speediest disks.
> I'd have expected the faster disks to do better, even on a P120.
> (I've got an idea as to why this is slow, and it's related to a "fix" put into
> the driver to ensure that it doesn't try to eat up all the kernel memory (and
> thus panic the system).  I'll send you additional info in a separate email.)

Bonnie's not the best tool for measuring storage system performance,
since it goes via the buffer cache.  You might like to try rawio
(ftp://ftp.lemis.com/pub/rawio.tar.gz), which goes to the raw device.
I'd be interested in seeing the results.
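
As a rough sanity check on the parity initialization figures quoted
above (18 GB read, 9 GB written, a bit over two hours), the aggregate
transfer rate can be worked out as below.  This is only a sketch: it
assumes decimal gigabytes and exactly two hours, so the result is an
upper bound, and the function name is made up for illustration.

```python
GB = 10**9  # assumption: decimal gigabytes
MB = 10**6  # assumption: decimal megabytes


def aggregate_rate_mb_s(read_gb, written_gb, hours):
    """Average transfer rate in MB/s over all disks combined."""
    total_bytes = (read_gb + written_gb) * GB
    return total_bytes / (hours * 3600) / MB


# Figures from the quoted text: read 18 GB, write 9 GB, about 2 hours.
rate = aggregate_rate_mb_s(18, 9, 2)
print(f"{rate:.2f} MB/s aggregate across the array")
```

That comes to only a few MB/s spread over three disks.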

>> Greg> [snip]
>>>> This is an i386, 1.4 system.
>>
>> Greg> What CPU?
>>
>> It's a Pentium 120MHz.  RAIDFrame isn't *that* CPU intensive, is it?
>> I'll check in a few minutes here, I guess.  I can run top while doing
>> stuff to it.
>
> If you change a single bit on a stripe, it's got to do a minimum of 2 reads
> and 2 writes.  If your stripe width is 32 blocks (16K), it has to read 32K and
> write 32K, just to change 1 bit.  (never mind the filesystem overhead).
> Yes, it can be CPU intensive, especially on slower CPUs.

Is this correct?  Under these circumstances, Vinum would perform
single-sector reads and writes, regardless of the stripe size.

My experience with Vinum has been that the parity calculations are not
an issue.  The "read 2, write 2" I/O definitely is.
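
To make the "read 2, write 2" pattern concrete, here is a minimal
sketch of a RAID-5 small write.  It is not RAIDframe or Vinum code;
the function names and the 512-byte sector size are assumptions for
illustration.  The new parity is the old parity XOR the old data XOR
the new data, so only the data block and the parity block are
touched, whatever size the transfer unit is.

```python
SECTOR = 512  # assumption: 512-byte sectors


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


def new_parity(old_data, old_parity, new_data):
    """Parity after a small write: old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


def rmw_traffic(unit_bytes):
    """Bytes (read, written) for one read-modify-write of a given unit.

    Two reads (old data, old parity) and two writes (new data, new parity).
    """
    return 2 * unit_bytes, 2 * unit_bytes


# One stripe of a three-disk set: two data blocks plus parity.
d0 = bytes([0x0F]) * SECTOR
d1 = bytes([0xF0]) * SECTOR
parity = xor_bytes(d0, d1)

# Overwrite d0 and update parity without reading d1.
n0 = bytes([0x55]) * SECTOR
parity = new_parity(d0, parity, n0)

print(rmw_traffic(32 * SECTOR))  # whole 32-block stripe unit
print(rmw_traffic(SECTOR))       # single sector
```

The XOR itself is cheap in both cases; what changes with the unit
size is how many bytes the two reads and two writes have to move.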

>> Greg> It could also be that the stripe width (or sectors per stripe
>> Greg> unit) are not optimal for those drives/your machine.
>>
>> Yeah.  I was trying to avoid experimenting with it, but I guess that's
>> what I'm going to do now.  I'll use small partitions, I think.  :)
>
> Yup, small partitions for the experimentation are the way to go...

This isn't directly applicable to RAIDframe, but you might like to
take a look at http://www.lemis.com/vinum.html and subpages, notably
http://www.lemis.com/vinum/Performance-issues.html.  Note that the
performance measurements were done on very old disks; it's the
relationship to a single disk, not the absolute performance, that
counts.  And yes, it appears that these relationships scale nicely to
modern disks.

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key