Subject: Re: Raidframe experiments and scsi woes
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 11/25/1998 17:25:09
Manuel Bouyer writes:
> 
> Hi,
> I've played a bit with raidframe today. Here are a few results:
> I used 3 Ultra-wide FUJITSU 8Gb disks on a PII-400MHz. 2 of the disks were on
> an ahc2940, the last one on a ncr875.

Drool :-)  Wish I had disks like that for RAID testing :-) 

> Here are the results I got from bonnie:
> sd5 is the result for a single drive on the ahc, ccd0 and ccd1 for a ccd
> between the 2 drives of the ahc with an interleave of 32 and 171,
> respectively; and raid5_0 and raid5_1 for a raid5 array with the 3 drives,
> with an interleave of 32 and 171 respectively. raid4_0 and raid4_1 are
> the same tests for a raid4 array.
> 
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
> sd5       100  8320 36.0  8074  7.4  1302  1.8  8332 37.8  8319  6.7  89.0  1.1
> ccd0      100  2335 10.4  2335  2.3  1553  1.8 10805 50.6 11441  7.1 117.5  1.3
> ccd1      100  5936 25.9  5691  5.2  3210  3.8 13823 63.6 14734  8.6 114.9  1.2
> raid5_0   100  2218 11.7  2137  3.8  1264  2.4  5961 29.7  5152  4.5 114.0  2.1
> raid5_1   100  2003  9.7  1573  2.2   773  1.0  7417 35.4  8325  5.9 106.5  2.0
> raid4_0   100  1912 10.1  1981  3.6  1308  2.5 10859 53.8 11625  9.9 115.0  2.1
> raid4_1   100  1612  7.8  1287  1.8  1543  2.1 10498 49.9 15265 10.7 112.8  2.1

Very interesting... 

> raid5 seems to suffer from bad performance, but I already noticed this
> with "external" raid5 arrays. 

I'm surprised RAID5 is doing so poorly on reads... In your RAID config
file, what does your "fifo" line look like?  Have you tried playing with that?
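For reference, the line I mean is the one in the "START queue" section at the
end of the config file; it usually looks something like this (the 100 here is
just an example value for the number of outstanding requests per disk):

    START queue
    fifo 100

Bumping that number up or down is cheap to experiment with, and it can change
how well reads get overlapped across the components.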

> With raid4 it seems possible to achieve
> performance close to ccd for reading, 

As long as the set is not running in "degraded" mode, it doesn't read the 
parity blocks, and thus can be quite quick.....

> but writing is worse than
> raid5 ...

That's because all of the parity blocks go to the n'th disk, making that 
disk a bottleneck... 
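To illustrate with a hypothetical 3-disk layout: in RAID4 every stripe's
parity lands on the same disk, e.g.

    disk0   disk1   disk2
    D0      D1      P(0,1)
    D2      D3      P(2,3)
    D4      D5      P(4,5)

so every write to the set costs a write on disk2, and the array can't write
any faster than that one spindle.  RAID5 rotates the parity across all the
disks, which is why its writes don't hit that particular wall.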

[SCSI bus hangs...]

I'm hoping that at some point RAIDframe will have some way of better 
communicating with the underlying components.  Of course if the SCSI
bus hangs, then there's not much RAIDframe can do :-( 

> These behaviors prevent any kind of hot-swapping, and raidframe would be much
> more usable if these were fixed. The ahc case seems to be the simpler one
> to fix. Maybe a look at the FreeBSD driver would help?
> Unfortunately I only have temporary access to this test box, so I
> will not be able to address this in the near future.
> 
> Now a few comments about the degraded mode of raidframe:
> After powering off one of the drives, the SCSI commands timed out and
> the console got flooded with messages like:
> Nov 25 13:06:26 pr7 /netbsd: DEAD DISK BOGUSLY DETECTED!!
> Nov 25 13:06:26 pr7 /netbsd: [0] node (Rop) returned fail, rolling backward
> Nov 25 13:06:26 pr7 /netbsd: [0] node (Rrd) returned fail, rolling backward
> Nov 25 13:06:26 pr7 /netbsd: [0] DAG failure: w addr 0xe2b0 (58032) nblk 0x80 (128) buf 0xf6268000
> 
> I guess these come from raidframe.

Yup... RAIDframe gets pretty verbose when it can't find the data it wants.. :-(

> Marking the component as failed doesn't
> seem to help. The 2 other disks have a lot of activity, but the bonnie process
> I had is stuck on "getblk".
> After a reboot, raidframe came up with a component marked "failed" (the
> disk was still off). I powered on the disk and issued a rescan of the bus.
> Then I found no way to ask raidframe to reconstruct the data of this disk.
> I had to mark it as spare in my config file, unconfig/reconfig raid0
> and issue a raidctl -F. I think it would be nice to be able to ask raidframe
> to rebuild a disk directly, for configurations without spares.

I agree that the procedure for doing this is not very easy..  Direct rebuilds 
are on my "todo" list...
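For the archives, the workaround you describe boils down to something like
the following (the device names and config file path here are just examples,
substitute your own):

    # add the revived disk to the config file as a hot spare:
    #   START spare
    #   /dev/sd2e
    raidctl -u raid0                    # unconfigure the set
    raidctl -c /etc/raid0.conf raid0    # reconfigure, now with the spare
    raidctl -F /dev/sd2e raid0          # fail the dead component; reconstruction
                                        # then runs onto the spare

which is clearly more hoops than it should be for "put this disk back".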

> When configuring raid0 with a non-existent spare, the config fails, but
> the components are still marked busy. After fixing the config file,
> any raidctl -c would fail because of this. I had to reboot.

You should have been able to "raidctl -u raid0", and then re-configure... 
If something is still marked busy after a config failure, then I've still got 
a bug in there somewhere.. (sounds like I do, and I think I know where...)

> Also, I think a 'raidctl -r' should immediately fail on a device with failed
> components.

Yup, it probably should... (it can't reconstruct the parity anyway, as it's
missing the data blocks from the dead component)

> For some reason, the box gets hung for a few seconds when doing
> this.
> 
> However, even with these SCSI issues, raidframe looks really usable.

Cool! :-)

> Good work !

Thanks...  And thanks for the report...

Later...

Greg Oster

oster@cs.usask.ca
Department of Computer Science
University of Saskatchewan, Saskatoon, Saskatchewan, CANADA