Subject: Re: panic while building a raid-1 set one component at a time
To: Jeff Rizzo <riz@boogers.sf.ca.us>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 10/06/2003 11:43:55
Jeff Rizzo writes:
> After noodling on this for a while, it occurs to me that the machine I'm
> building this RAID set on before deploying it has only 16MB of memory...
> Is that little enough to cause this particular issue?  

Yes, likely.  

> I know that
> raidframe is somewhat memory intensive... If so, is there anything I can
> do kernelwise to strip down the rest of the memory needs so I can get this
> set built?  

If you're building a new kernel anyway, try bumping up the NKMEMPAGES 
values (see options(4)).  (With only 16MB, the default amount of 
"dedicated kernel memory" is fairly small...)

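Something along these lines in the kernel config file is what I have 
in mind.  Treat it as a sketch: the value 1024 is only a placeholder 
(with 4KB pages on i386 that would be 4MB of kernel malloc arena), not 
a tested recommendation, so check options(4) for NKMEMPAGES, 
NKMEMPAGES_MIN and NKMEMPAGES_MAX before settling on a number.

```
# Size of the kernel malloc arena (kmem_map), in PAGE_SIZE pages.
# 1024 is a hypothetical value chosen for illustration only.
options 	NKMEMPAGES=1024
```
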
> It's not going to live here permanently, but I'd
> sure like to get it built before moving it to its final destination...
[snip]
> On Sun, Oct 05, 2003 at 12:12:07PM -0700, Jeff Rizzo wrote:
> > I've done this before, but not for about a year, so I'm not sure
> > if I'm doing something wrong here, or what.  I'm working with a GENERIC
> > kernel circa September 28 on i386 (from the releng.netbsd.org snapshot
> > that day)
> > 
> > I've got two identical disks, and constructed half a raid-1 on one (I
> > needed the other to bootstrap from sysinst) as it says to do in the
> > raidctl man page; it seems to be working fine in degraded mode.
> > 
> > The two disks are wd1 and wd2;  wd2 is the working component of the raid
> > set; I'm trying to add wd1.  I copied the disklabel from wd2 onto wd1,
> > did a 'raidctl -a /dev/wd1a raid0', and then when I try to do the
> > 'raidctl -F component0 raid0', it panics:
> > 
> > # raidctl -a /dev/wd1a raid0
> > Warning: truncating spare disk /dev/wd1a to 488396928 blocks
> > # Oct  5 10:26:49  /netbsd: Warning: truncating spare disk /dev/wd1a to 488396928 blocks
> > raidctl -F component0 raid0
> > RECON: initiating reconstruction on row 0 col 0 -> spare at row 0 col 2
> > raid0: Quiescence reached..
> > panic: malloc: out of space in kmem_map
> > Stopped in pid 399.1 (raid_recon) at    netbsd:cpu_Debugger+0x4:        leave
> > db> 
> > 
> > Now, I'm wondering about the "Warning: truncating spare disk" message;
> > I can't see anything different about the labels of wd1 and wd2, and I
> > didn't get that message when I built wd2.
> > 
> > One interesting point:  I can't seem to change the info on wd2c in the
> > disklabel;  it always returns to
> > 
> >  c:        15         0     unused      0     0        # (Cyl.      0 -      0*)
> > 
> > No matter how I edit it with "disklabel", though the edits always seem to
> > take.
> > 
> > Anyway, here's the entire sequence.  I hope there's some clue in here
> > somewhere...
[snip]

Ya.. classic case of "out of kernel memory".  Probably doesn't help 
that the RAID sets are so big... (to do the reconstruction, RAIDframe
allocates tables and such to keep track of what's been done, and 
what's remaining...  with 16MB, only a small chunk of that will be 
available for the kernel, and RAIDframe can be a resource hog...)
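
To put a rough number on "so big": the raidctl warning quoted above 
reports 488396928 blocks for the component.  Assuming those are 
512-byte sectors (my assumption, the message doesn't say), a bit of 
shell arithmetic gives the size.  The variable names are made up for 
illustration, and this needs a shell with 64-bit arithmetic:

```shell
# Block count taken from the "truncating spare disk" warning.
blocks=488396928
# Assumed sector size in bytes; not stated in the warning itself.
sector_bytes=512

# Total component size in bytes, then in whole GiB (integer division).
bytes=$((blocks * sector_bytes))
gib=$((bytes / 1024 / 1024 / 1024))

echo "$bytes bytes, roughly $gib GiB per component"
```

Under that assumption each component works out to roughly 232 GiB, 
which is a lot of disk to track from a machine with 16MB of RAM.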

Later...

Greg Oster