Subject: Re: RAIDFrame and RAID-5
To: NetBSD-current Discussion List <current-users@NetBSD.org>
From: Frederick Bruckman <fredb@immanent.net>
List: current-users
Date: 10/27/2003 15:31:21
On Mon, 27 Oct 2003, Greg A. Woods wrote:
> [ On Wednesday, September 10, 2003 at 07:50:43 (-0600), Greg Oster wrote: ]
> > Subject: Re: RAIDFrame and RAID-5
> >
> > "Thomas Hertz" writes:
> > >
> > > I haven't been able to get a kernel core dump, since the system just
> > > freezes. I have noted that just moments before the system freezes, it's
> > > not possible to start new processes. The already running processes will
> > > continue to run normally for a minute or so more. Most of the time the
> > > console prints out a few "cannot allocate Tx mbuf" for the various
> > > network interfaces just before the final freeze.
> I just this morning encountered a system freeze that appears to have
> been caused by RAIDframe.
>
> Oddly enough it recovered all by itself.
> Note this is on my development system running a 1.6.1_STABLE kernel from
> about a month ago. The system has 320MB of RAM. I have one RAID-1 set
> and two RAID-5 sets:
>
> total memory = 319 MB
> avail memory = 275 MB
> using 1000 buffers containing 32720 KB of memory
That should give the maximum of 64MB of kvm pages (16K pages of 4K
each), which sounds like plenty, but I suppose it could have become
too fragmented. Hey, you're not swapping to RAID-5, are you? That
configuration is known to cause problems.
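
For anyone checking the arithmetic behind the 64MB figure, here is a
minimal sketch. The helper name kmem_mib is made up for illustration,
and the 4096-byte page size is simply the "4K each" assumed above.

```python
PAGE_SIZE = 4096  # bytes per page; the "4K each" assumed in the message


def kmem_mib(pages, page_size=PAGE_SIZE):
    """Return the size in MiB of `pages` kernel memory pages (hypothetical helper)."""
    return pages * page_size / (1024 * 1024)


# 16K pages of 4K each
print(kmem_mib(16 * 1024))  # 64.0
```

The same helper can be used to see how much memory any other page
count corresponds to.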
> > You might want to add KMEMSTAT (or whatever it is) to the kernel
> > config, and then do a bunch of "vmstat -m" while causing the machine
> > to crash. That might indicate whether you're actually out of kernel
> > memory or not....
>
> I can only show you what it looks like now, i.e. a couple of hours after
> it came back to life:
> Memory resource pool statistics
> Name      Size Requests  Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
> phpool      40     6772     0     6128    27     6    21    24     0   inf    0
> pcgpool     76     3195     0     3190     2     1     1     2     0   inf    0
> pmappl      68   880701     0   880597    47    45     2    10     0   inf    0
> pdppl     4096     3455     0     3351  1648  1540   108   583     0   inf    4
> vmsppl     188   880701     0   880597    86    80     6    28     0   inf    0
> vmmpepl     64 24244269     0 24243118   481   460    21    79     0   inf    0
> vmmpekpl    64  1277696     0  1276760    22     6    16    16     0   inf    0
> uaoeltpl    84      164     0      129     1     0     1     1     0   inf    0
> aobjpl      52        1     0        0     1     0     1     1     0   inf    0
> amappl      40 11247213     0 11246520    61    52     9    31     0   inf    0
> mbpl       256    19224 10295    19087   203   186    17    44     1   inf    0
                          ^^^^^
                           |||
Greg ran out of MBUFS, all right.
> mclpl     2048     7827     0     7745  1872  1827    45    93     4 16384    4
> sockpl     168   438465     0   438268    83    72    11    29     0   inf    0
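
Spotting a non-zero Fail column by eye is easy to get wrong in a long
listing, so here is a rough sketch of a filter for it. It assumes the
twelve-column layout shown in the header quoted above; the function
names and the SAMPLE text are illustrative only, not part of vmstat.

```python
# Column names as they appear in the quoted "vmstat -m" pool header.
HEADER = ("Name Size Requests Fail Releases Pgreq Pgrel "
          "Npage Hiwat Minpg Maxpg Idle").split()


def parse_pool_rows(text):
    """Parse pool statistics rows, tolerating leading '>' quote marks."""
    rows = []
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        # Skip titles, the header row, and anything with the wrong shape.
        if len(fields) != len(HEADER) or fields[0] == "Name":
            continue
        row = dict(zip(HEADER, fields))
        try:
            for key in ("Size", "Requests", "Fail", "Releases"):
                row[key] = int(row[key])
        except ValueError:
            continue
        rows.append(row)
    return rows


def pools_with_failures(text):
    """Return {pool name: failed request count} for pools with Fail > 0."""
    return {r["Name"]: r["Fail"] for r in parse_pool_rows(text) if r["Fail"] > 0}


# Three rows copied from the listing above.
SAMPLE = """\
> mbpl 256 19224 10295 19087 203 186 17 44 1 inf 0
> mclpl 2048 7827 0 7745 1872 1827 45 93 4 16384 4
> sockpl 168 438465 0 438268 83 72 11 29 0 inf 0
"""

print(pools_with_failures(SAMPLE))  # {'mbpl': 10295}
```

Run against the rows quoted here, only mbpl is reported, which matches
the failed mbuf allocations pointed out above.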
What I would try is to increase NKMEMPAGES until the problem is no
longer reproducible. I just had to increase NKMEMPAGES to 5000 or
6000 on my 486 to get the ISA ethernet card to configure (from the
calculated default of 4096 for 64MB RAM), even though by the time I
log in to view it, it's hardly using more than 1 MB. So I know that
you need a lot of headroom. This pig has never had more than a month
of uptime before getting weird file system errors that go away on
reboot, so I'm anxious to see if the new configuration does any better.
Frederick