Subject: Crash/hang - getsockname busted?
To: None <port-macppc@netbsd.org>
From: Donald Lee <donlee_ppc@icompute.com>
List: port-macppc
Date: 10/03/2001 17:43:40
Does this look familiar to anyone?  Is it "fixed" in 1.5.2?

The machine is a Power Computing Power Center 132 with a 300 MHz G3 card in it.
It's running NetBSD 1.5 kernel with a few changes, including
MAXUSERS set to 64, some CONFIG options to enable backside cache,
a couple of changes for Cyclades 8 port serial driver, and a fix to
extintr.c for "lost" soft interrupts.

It runs a production web/file server - apache and netatalk, sshd, ftpd,
etc.  Nothing exotic.  The web server gets about 500K hits a month.

It runs fine for months, but twice now since January, it has "bogged down"
to the point where it appears that nothing can fork.  I can't log in,
and any existing sessions stop responding.  Ping still works, and
telnet to the box gets a connection, but no "login:" prompt.

On the serial console, I get the "login:" prompt, but after I enter a name,
the "passwd:" prompt never appears. The web server appears to continue to
serve pages for a while, but eventually stops responding.  (This time
around I didn't wait that long.)

The only evidence of trouble is in the Apache error log.  There is nothing
interesting in the system log.

(this is the entire log for today)
>[Wed Oct  3 00:00:12 2001] [notice] Apache/1.3.14 (Unix) configured -- resuming normal operations
>[Wed Oct  3 16:30:03 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:39:30 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:46 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:47 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:48 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:49 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:49 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:49 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:49 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:49 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:50 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:50 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:50 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:50 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:44:51 2001] [error] (22)Invalid argument: getsockname
>[Wed Oct  3 16:51:20 2001] [warn] pid file /usr/local/apache/logs/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
>[Wed Oct  3 16:51:20 2001] [notice] Apache/1.3.14 (Unix) configured -- resuming normal operations
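
In case anyone wants to check their own error log for the same pattern,
here is a rough Python sketch that tallies the getsockname errors per
minute.  The function name and the regular expression are my own
assumptions, written against the line format shown above; this is not
anything that ships with Apache.

```python
import re
from collections import Counter

# Matches error-log lines shaped like the ones quoted above, e.g.
# [Wed Oct  3 16:38:20 2001] [error] (22)Invalid argument: getsockname
# An optional leading ">" (mail quoting) is tolerated.
LINE_RE = re.compile(
    r"^>?\[\w{3} \w{3} +\d+ (\d{2}:\d{2}):\d{2} \d{4}\] \[error\] "
    r"\(\d+\).*: getsockname\s*$"
)


def count_getsockname_errors(lines):
    """Return a Counter mapping 'HH:MM' to the number of getsockname errors.

    Hypothetical helper: `lines` is any iterable of log lines, such as an
    open file object.  Lines that are not getsockname errors are skipped.
    """
    per_minute = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            per_minute[match.group(1)] += 1
    return per_minute
```

Passing it the open error log and printing the sorted items gives one
count per minute.  For the log above, that shows the errors arriving in
bursts (1 at 16:30, 7 at 16:38, 1 at 16:39, 13 at 16:44) rather than
trickling in evenly.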

Ideas, suggestions, flames - all appreciated.

Thanks,

-dgl-