Subject: Re: Web load causes reboot
To: Frank van der Linden <frank@wins.uva.nl>
From: Dave Burgess <burgess@cynjut.neonramp.com>
List: current-users
Date: 08/03/1999 17:35:47
>
> On Tue, Aug 03, 1999 at 12:59:56PM -0700, Len Burns wrote:
> > Indeed it is. I can reproduce the same behavior here using a 1.4.1
> > kernel on i386.
>
> Could either of you give some more data, like the configurations
> of your system, the type of network that you're using, the client
> that the Apache bench program is run on (i.e. how fast is it), and
> the size of the file that is being transferred with each request?
> And perhaps the types of ethernet card that you are using.
I'm running into the same problem on mail2.
My configuration is a web server running 1.4 with about 200 virtual
hosts. The kernel is GENERIC plus "options gateway", minus DDB, EISA,
and PCMCIA, on a Pentium 60 with 48 MB of memory and a WD8013 network
card.
The most common trigger seems to be my NOC monitoring software. The
4 pings succeed, but once I start to probe the services I'm watching,
the machine randomly crashes (maybe once a week). Since I'm not running
with DDB, it just boots back up like it's supposed to, and it has never
generated a crash dump.
I also received the MCLPOOL warning once (while I was running GENERIC)
but all it did that time was lock the computer solid. As long as I
don't run out of NMBCLUSTERS, I don't see that error and the system
doesn't wedge.
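For anyone following along, the cluster limit is set by a single line
in the kernel config file. A minimal sketch, assuming a config derived
from GENERIC; the value shown is purely illustrative and is not the
setting in use on mail2:

```
# Hypothetical excerpt from a custom i386 kernel config based on GENERIC.
# 1024 is an assumed example value, not a tested recommendation.
options 	GATEWAY			# packet forwarding, as noted above
options 	NMBCLUSTERS=1024	# maximum number of mbuf clusters
```

After rebuilding and booting the new kernel, "netstat -m" reports mbuf
usage, which should show whether the pool is getting close to the limit.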
>
> I'd love to reproduce and fix it, because it's a serious bug that
> has been plaguing people using 1.4 or later a lot.
The fact that it seems to most commonly happen to me right after the ping
might be a clue, or it might be a red herring...
Note that I have 9 other servers, all running the same basic kernel,
and none of them has this problem. Two are name servers and one is an
NFS host. All of them are managed with the nocmonitor software, so that
alone isn't it. Two of them are running Apache, so that probably isn't
it either.
mail2 101> ruptime -a
admin         up 45+02:25,     1 user,   load 0.24, 0.27, 0.24
fax2mail      up 46+22:48,     0 users,  load 0.10, 0.09, 0.08
mail          up  1+02:36,     6 users,  load 0.41, 0.40, 0.40
mail2         up     6:12,     2 users,  load 1.25, 1.15, 1.16
ns1           up 16+18:50,     1 user,   load 0.12, 0.09, 0.08
quakeII       up 45+02:56,     1 user,   load 2.11, 2.09, 2.08
radius1       up 33+12:49,     0 users,  load 0.33, 0.44, 0.34
radius2       up 62+03:23,     1 user,   load 0.06, 0.19, 0.21
webserv01     up 33+12:41,     0 users,  load 0.06, 0.07, 0.07
webserv02     up  8+09:49,     1 user,   load 0.11, 0.08, 0.08
If you have any guesses, I'll be glad to discuss them in private E-Mail.
--
Dave Burgess Network Engineer - Nebraska On-Ramp, Inc.
*bsd FAQ Maintainer / SysAdmin for the NetBSD system in my spare bedroom
"Just because something is stupid doesn't mean there isn't someone that
doesn't want to do it...."