Subject: Re: ahc and raidframe questions
To: Curt Sampson <cjs@cynic.net>
From: Greg Lehey <grog@lemis.com>
List: netbsd-users
Date: 06/27/1999 13:03:05
On Saturday, 26 June 1999 at 21:56:18 -0400, Curt Sampson wrote:
> On Sun, 27 Jun 1999, Greg Lehey wrote:
>
>> A lot of this depends on what you're trying to do.  RAID-3 is good for
>> things like video which require high transfer rates with sequential
>> access, but it won't help much for massively concurrent applications
>> such as web and ftp servers, since positioning takes too long.
>>
>> I still don't have an opinion about the difference in performance
>> between "software" RAID and "hardware" RAID.  They're both software
>> RAID, of course: the real difference is just where the software gets
>> executed.
>
> Perhaps I didn't make myself clear. The whole point of an external
> hardware RAID box is that it's got specialised hardware and software
> for dealing with a RAID array that is hard to duplicate in software
> RAID on a host. Parity calculations are not a big deal, I know, but
> here are some other advantages (assuming the hardware RAID is done
> right, of course):
>
>     * Use of a large, battery-backed cache to group your writes
>       lets you use RAID-3 instead of RAID-5 for faster write
>       throughput while helping to avoid the small write penalty
>       inherent with RAID-3 (and making sure that your large-write
>       applications perform better than they ever could with RAID-5).

As I mentioned already, this depends on your application.  My
understanding is that ftp and web servers are currently more important
than streaming video.  RAID-3 doesn't do well on ftp and web servers.

>     * Use of spindle sync means that you reduce your average time
>       to access data after a seek from something close to a revolution
>       to a half-revolution.

It seems my explanation in my last message didn't come across too
well.  OK, let's compare the two in a web server application (reading),
with a 5-disk array.  You can expect transfers in the order of 16 kB.
Your latency is in the order of 8 ms, and the transfer time at, say,
40 MB/s (4 data drives, each 10 MB/s) is 400 µs.  Total 8.4 ms--*per
drive*.  Since the read keeps all 4 data drives busy, that's 33.6 ms
of overall disk use.  By contrast, with RAID-5 on the same 5 drives,
you'll typically read from only one drive, unless you fall into the
trap of making your stripes too small.  In this case, your latency is
still 8 ms, but the transfer takes four times as long--1.6 ms--for a
total of 9.6 ms.  That's the total disk use: roughly 70% less disk
time than RAID-3.

Writes are more complicated.  On RAID-3, you need one read and one
write on every drive, including the parity drive.  Now your disk use
for the first (read) transfer is 5 * 8.4 ms, or 42 ms.  Assuming no
other transfer comes between the read and the write, the latency for
the second transfer is approximately one revolution.  At 10,800 rpm
that's about 5.5 ms, so the second transfers take about 5.9 ms per
drive, or 29.5 ms, making a total of just over 70 ms.  By contrast,
RAID-5 needs only the data and parity drives: 2 * 9.6 ms for the
reads and 2 * 7.1 ms for the writes, for a total of about 33 ms of
disk time--less than half the RAID-3 figure.

This assumes, of course, that you have lots of unrelated transfers
waiting, and that each transfer will cause positional latency.  This
is valid for web and ftp servers, but not for single-application
streaming video, where RAID-3 is better than RAID-5.
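
If you want to play with these numbers, here's a back-of-the-envelope
calculator.  It's only a sketch of the model above: the 8 ms latency,
10 MB/s per-drive rate, 10,800 rpm and 16 kB transfer size are the
assumptions from the example, not measurements, and it ignores
queueing, caching and bus contention entirely.

    #include <stdio.h>

    /*
     * Disk-time model from the examples above.  All figures are
     * assumptions, not measurements: a 5-drive array (4 data + 1
     * parity), 16 kB transfers, 8 ms positioning latency, 10 MB/s
     * per drive, 10,800 rpm.
     */
    int
    main(void)
    {
        const double latency = 8.0;                 /* ms, seek + rotation */
        const double xfer_kb = 16.0;                /* kB per request */
        const double drive_mbs = 10.0;              /* MB/s per drive */
        const double ndrives = 5.0;                 /* 4 data + 1 parity */
        const double ndata = ndrives - 1.0;
        const double rev = 60000.0 / 10800.0;       /* ms per revolution */

        /* Transfer time in ms on one drive. */
        double xfer3 = xfer_kb / ndata / drive_mbs; /* striped over 4 drives */
        double xfer5 = xfer_kb / drive_mbs;         /* all on one drive */

        /* Reads: RAID-3 keeps all data drives busy, RAID-5 only one. */
        double r3_read = ndata * (latency + xfer3);
        double r5_read = latency + xfer5;

        /*
         * Writes: read-modify-write, where the second (write) access
         * waits approximately one revolution after the read.
         */
        double r3_write = ndrives * (latency + xfer3)   /* read phase */
                        + ndrives * (rev + xfer3);      /* write phase */
        double r5_write = 2.0 * (latency + xfer5)       /* read data + parity */
                        + 2.0 * (rev + xfer5);          /* write data + parity */

        printf("RAID-3 read:  %5.1f ms disk time\n", r3_read);
        printf("RAID-5 read:  %5.1f ms disk time\n", r5_read);
        printf("RAID-3 write: %5.1f ms disk time\n", r3_write);
        printf("RAID-5 write: %5.1f ms disk time\n", r5_write);
        return 0;
    }

Vary the transfer size, stripe organization or drive count and you can
watch where the balance between the two organizations shifts.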

>     * You have two controllers in the box, and two controllers on
>       the host, so that if a controller fails on either the box or
>       the host, or a cable fails, things keep running.

If you have two controllers on the host, you can do that in software
RAID too.  In fact, with hardware RAID you have more components, so
the chance of one of them failing is higher.  I don't see this as an
argument for a RAID box.

>> I'd be interested in collecting them.  Again, I'd
>> plug my rawio program (ftp://ftp.lemis.com/pub/rawio.tar.gz), which
>> bypasses buffer cache and thus gives more accurate results for the
>> underlying storage equipment.
>
> You should add this to pkgsrc. I currently use bonnie because I
> prefer to test file-system I/O rather than raw I/O speeds, but now
> that we're about to get a unified buffer cache, and I have a 512
> MB sparc, I don't think I'm going to have the patience to do bonnie
> with a big enough file to know throughput.  :-)

Right.  I've already sort of promised to do that.  In the meantime,
the tarball should build out of the box.

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key