Subject: Re: raidframe problems (revisited)
To: Matthias Scheler <tron@zhadum.org.uk>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-users
Date: 06/01/2007 11:02:07
Matthias Scheler writes:
> On Mon, May 28, 2007 at 04:50:03PM -0600, Greg Oster wrote:
> > With the array in degraded mode, can you mount /dev/wd1a (or 
> > equivalent) as a filesystem, and run a series of stress-tests on 
> > that, at the same time that you stress the RAID set?  Something like:
> > 
> >   foreach i (`jot 1000`)
> >   cp src.tar.gz src.tar.gz.$i && rm -f src.tar.gz.$i & 
> >   sleep 10
> >   dd if=/dev/zero of=bigfile.$i bs=10m count=100 && rm -f bigfile.$i &
> >   sleep 10
> >   dd if=src.tar.gz.$i of=/dev/null bs=10m &
> >   end
> 
> I've modified the above like this:
> 
> #!/bin/sh
> for i in `jot 1000`
> do
> 	cp src.tar.gz src.tar.gz.$i & 
> 	sleep 10
> 	(dd if=/dev/zero of=bigfile.$i bs=10m count=100 && rm -f bigfile.$i) &
> 	sleep 10
> 	(dd if=src.tar.gz.$i of=/dev/null bs=10m && rm -f src.tar.gz.$i) &
> 	wait
> done
> 
> I turned the unused RAID spare disk into a filesystem and ran the stress
> test on the degraded RAID and the spare disk for over an hour. The machine
> survived that stress test without any problems.
> 
> BTW: could this problem be related to the size of the disk? The RAID 1
>      in question uses two 250GB IDE disks.

No, not likely...  My desktop here has dual 250GB IDE disks and it's 
been running them for ages now.  I built a new box the other day with 
dual 320GB SATAs, and it's working just fine too...

Could you send me your dmesg, kernel config file, RAID config file, 
and disklabels?  (privately, if you wish)
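
Something along these lines would gather most of that into one file.  
It's only a sketch: the device names (wd0, wd1, raid0), the 
/etc/raid0.conf path and the report file name are assumptions on my 
part, so adjust them to match your setup.  The kernel config file 
isn't included, since I don't know where you keep it.

```shell
#!/bin/sh
# Sketch: gather RAID diagnostics into a single report file.
# Assumed names (adjust as needed): wd0/wd1 components, raid0 set,
# /etc/raid0.conf as the RAID config file.
REPORT=${REPORT:-raid-report.txt}
: > "$REPORT"

# Run a command and append its output under a header.  A failure is
# noted in the report instead of aborting, so one missing device or
# tool does not lose the rest of the information.
collect() {
	echo "=== $* ===" >> "$REPORT"
	"$@" >> "$REPORT" 2>&1 || echo "(failed: $*)" >> "$REPORT"
}

collect dmesg
collect disklabel wd0
collect disklabel wd1
collect disklabel raid0
collect raidctl -s raid0
collect cat /etc/raid0.conf
```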

You say the machine freezes -- it does that when you attempt to do a 
'raidctl -F' or 'raidctl -R', yes?  Can you get into ddb at that 
point?  Can you ping the box at that point?  My next guess is that 
it's a kernel memory issue of some sort... 
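
If it is memory, the numbers from just before the freeze would be 
interesting.  A rough sketch of one way to capture them: log 
'vmstat -m' at short intervals, starting before you kick off the 
reconstruction.  The function name, the defaults and the log file 
name are all made up for illustration, and the log should live on a 
filesystem that is not on the RAID set.

```shell
#!/bin/sh
# Sketch: sample kernel memory statistics so that the last sample
# taken before a freeze is already on disk.
# Arguments (hypothetical defaults):
#   $1 = number of samples, $2 = seconds between samples, $3 = log file
log_kmem() {
	count=${1:-720}
	interval=${2:-5}
	log=${3:-kmem.log}
	n=0
	while [ "$n" -lt "$count" ]; do
		echo "--- sample $n at $(date) ---" >> "$log"
		vmstat -m >> "$log" 2>&1 || echo "(vmstat -m failed)" >> "$log"
		sync	# push the log out before taking the next sample
		sleep "$interval"
		n=$((n + 1))
	done
}
```

Run it in the background with something like 
'log_kmem 720 5 /var/tmp/kmem.log &' and then start the 'raidctl -R' 
or 'raidctl -F'.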

Later...

Greg Oster