Subject: Figuring out reason for crashing
To: None <port-macppc@netbsd.org>
From: John Klos <john@sixgirls.org>
List: port-macppc
Date: 08/10/2003 23:54:55
Hello,

For most of this year I have been trying to figure out why a colocated
PowerMac running a NetBSD release keeps crashing. When it crashes, it
behaves strangely: it still routes IPv4 (but not IPv6) and still answers
pings, but nothing appears on the serial console, and it neither drops
into the debugger on its own nor can be made to.

This kept happening, but only once every week to a month, so I could not
figure out what was going on. And since I could only have it power cycled
remotely, dmesg on the subsequent boot did not show anything.

So I brought it home and started a bulk package build on it, and after
four or five days it finally crashed. As before, I was able to ping it
and there was nothing on the serial console, but this time I was able to
reset it via an ADB keyboard, and I got this from dmesg:

(about 120 lines of file: table is full)
...
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
set{u,g}id pid 18185 (netstat) was invoked by uid 0 ppid 18184 (shlibsign) with fd 0 closed
set{u,g}id pid 18186 (netstat) was invoked by uid 0 ppid 18184 (shlibsign) with fd 0 closed
set{u,g}id pid 18187 (netstat) was invoked by uid 0 ppid 18184 (shlibsign) with fd 0 closed
e is full - increase kern.maxfiles or MAXFILES
file: table is full -4)
panic: trap
Begin traceback...
0xdd96fd70: at trap+8c0
0xdd96rease kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase NetBSD 1.6.1_STABLE (ANDROMEDA-$Revision:
1.626 $) #9: Wed Aug  6 04:17:33 EDT 2003
    john@andromeda.ziaspace.com:/usr/src/sys/arch/macppc/compile/ANDROMEDA
total memory = 1280 MB
avail memory = 1163 MB
using 2048 buffers containing 65636 KB of memory
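
To get a quick picture of what survived in the message buffer, something
like the following could be used. It is only a rough sketch: the function
name summarize_dmesg is made up for illustration, the sample is just a
few of the lines quoted above, and fragments cut off where the buffer
wrapped are deliberately not counted.

```python
import re
from collections import Counter

# Start of a complete "table is full" message as it appears in the dmesg above.
FULL_PREFIX = "file: table is full"

# Matches the set{u,g}id warnings; the groups are pid, command, uid, ppid, parent.
SETID_RE = re.compile(
    r"set\{u,g\}id pid (\d+) \(([^)]+)\) was invoked by uid (\d+) "
    r"ppid (\d+) \(([^)]+)\)"
)


def summarize_dmesg(text):
    """Return (number of complete 'table is full' lines, Counter of parent commands)."""
    full = sum(1 for line in text.splitlines() if line.startswith(FULL_PREFIX))
    parents = Counter(m.group(5) for m in SETID_RE.finditer(text))
    return full, parents


# A few lines copied from the dmesg output above, including one wrapped fragment.
SAMPLE = """\
file: table is full - increase kern.maxfiles or MAXFILES
file: table is full - increase kern.maxfiles or MAXFILES
set{u,g}id pid 18185 (netstat) was invoked by uid 0 ppid 18184 (shlibsign) with fd 0 closed
set{u,g}id pid 18186 (netstat) was invoked by uid 0 ppid 18184 (shlibsign) with fd 0 closed
set{u,g}id pid 18187 (netstat) was invoked by uid 0 ppid 18184 (shlibsign) with fd 0 closed
e is full - increase kern.maxfiles or MAXFILES
"""

if __name__ == "__main__":
    full, parents = summarize_dmesg(SAMPLE)
    print("complete 'table is full' lines:", full)
    for cmd, n in parents.most_common():
        print("set{u,g}id warnings with parent", cmd + ":", n)
```

Saving the real dmesg output to a file and passing its contents to
summarize_dmesg would give the same kind of tally for the whole buffer.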


Does ANYONE have a clue what might be causing this, and why the system
"crashes" the way it does (as opposed to totally dying and/or dropping to
the debugger)? Of course, I'll be checking to see why I had so many open
files, but none of my other bulk build machines ever have this problem,
and this machine crashes from just being moderately busy (not necessarily
from doing bulk package builds).
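
For the open files check, one possible approach is to tally fstat(1)
output by command while the machine is busy. The sketch below makes some
assumptions: the helper name open_files_by_command is made up, the second
column is taken to be the command name, and the sample rows are invented
for illustration (they are not from this machine).

```python
from collections import Counter


def open_files_by_command(fstat_output):
    """Count open-file entries per command in fstat(1)-style text."""
    counts = Counter()
    for line in fstat_output.splitlines():
        fields = line.split()
        # Skip blank or short lines and the header row, which starts with "USER".
        if len(fields) < 4 or fields[0] == "USER":
            continue
        # Assumption: the second whitespace-separated column is the command name.
        counts[fields[1]] += 1
    return counts


# Hypothetical sample rows, invented for illustration only.
SAMPLE = """\
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     make       18001   wd /          2    drwxr-xr-x     512 r
root     make       18001    0 /          1234 crw-------   ttyE0 rw
root     sh         18002   wd /          2    drwxr-xr-x     512 r
"""

if __name__ == "__main__":
    for cmd, n in open_files_by_command(SAMPLE).most_common():
        print(cmd, n)
```

Running this periodically against real fstat output and comparing the
totals with kern.maxfiles might show which command's count grows before
the table fills up.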

A related note: I cannot compile a kernel with:
options        DEBUG           # expensive debugging checks/support

When I try, I get:

ro-length -Wpointer-arith -Wmissing-prototypes -Wstrict-prototypes
-Wno-uninitialized  -Dmacppc -I.  -I../../../../arch -I../../../..
-nostdinc -DNMBCLUSTERS="0x4000" -DNVNODE="0x11170" -DNEWPMAP -DDIAGNOSTIC
-DDEBUG -DMAXUSERS=256 -D_KERNEL -D_KERNEL_OPT   -c
/usr/src/sys/arch/macppc/compile/ANDROMEDA/../../../../nfs/nfs_serv.c
/usr/src/sys/arch/macppc/compile/ANDROMEDA/../../../../nfs/nfs_serv.c: In
function `nfsrv_symlink':
/usr/src/sys/arch/macppc/compile/ANDROMEDA/../../../../nfs/nfs_serv.c:2145:
Unable to find a register to spill.
(insn 2866 2858 2872 (set (mem:SI (reg/v:SI 3 r3) 0)
        (reg:SI 9 r9)) 423 {movsi+1} (insn_list:REG_DEP_OUTPUT 2852
(insn_list:REG_DEP_OUTPUT 2858 (insn_list:REG_DEP_ANTI 2844
(insn_list:REG_DEP_ANTI 2848 (insn_list:REG_DEP_ANTI 2854 (insn_list 2842
(insn_list 2865 (nil))))))))
    (expr_list:REG_DEAD (reg:SI 9 r9)
        (nil)))
*** Error code 1

Stop.
make: stopped in /usr/src/sys/arch/macppc/compile/ANDROMEDA


Help!

Thanks,
John Klos
Sixgirls Computing Labs