Subject: Re: Sun3/50 netboot progress!
To: Brad Spencer <brad@anduin.eldar.org>
From: Christopher Masto <exidor@yiff.stu.rpi.edu>
List: port-sun3
Date: 01/31/1996 20:20:50
> Hey, I am glad someone else has seen odd behavior with a 3/50 and prom
> version 2.3 [I was *really* beginning to suspect a defective machine].
> I run this Sun diskless and use a NetBSD/i386 box as a fileserver.  I
> know that the Ethernet card on the fileserver is *way* slow, and
> thought that it was perhaps related to swapping over NFS, but I don't
> know for sure.  Basically, I can always make a compile of 'tcsh' drop
> core, usually before it actually gets to compiling anything
> [i.e. which it is generating the various header files, basically a
> couple of 'egrep's and a 'sed' in a single pipeline].  After it starts
> dropping core, *any* binary which uses shared libraries will drop
> core.  Static binaries seem to work better.
> 
> Sometimes it will just start to drop core on boot [important things
> like getty and the like].

*urp* I have this exact problem.  Originally it happened several times
a day, when my /home and /usr were on a Linux NFS server (but swap
and home were local).  After moving all the filesystems to a local
SCSI drive, I thought it was gone, having an uptime of 8 days.  Then
I discovered that it gets into this state very easily if I attempt a big
compile.  Building -current caused it several times, as did building
gcc 2.7.2.  Unfortunately it still happens in -current.  It would seem
likely that it's a VM/swap bug, since it happens under heavy load, or
else it's some kind of ld.so bug, since statically linked binaries still
work.
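
One way to tell when the machine has entered this state is to run a
shared-library binary and a static one side by side and see which of
them gets killed by a signal.  What follows is only a minimal sketch in
Python, not anything taken from NetBSD: the name classify is made up
for illustration, and which binaries on a given system are static or
dynamic is something you have to check yourself.

```python
import signal
import subprocess
import sys


def classify(argv):
    """Run argv once and describe how the child process ended."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError as err:
        # The binary could not be started at all (missing, not executable).
        return "could not run: %s" % err.strerror
    if proc.returncode < 0:
        # On POSIX a negative return code means "killed by that signal",
        # which is what a core-dumping binary looks like from the parent.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = "signal %d" % -proc.returncode
        return "killed by " + name
    return "exit %d" % proc.returncode


if __name__ == "__main__":
    # Paths come from the command line; pass one dynamic and one static
    # binary to compare them.
    for path in sys.argv[1:]:
        print(path, "->", classify([path]))
```

Run it with the paths of one dynamically linked and one statically
linked program as arguments.  If the first reports "killed by ..."
while the second still exits normally, that matches the behavior
described above.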

One interesting thing is that when /usr was on NFS, I was able to undo
the problem by remounting /usr (had to leave a root login around for that).
I could do that pre-1.0 because it didn't check to see if the filesystem
was already mounted.  When I couldn't do that anymore, I found that dropping
into ddb and typing "call nfs_init" (then "c") would make the problem go
away, except that after half a dozen uses of that, I'd get a lockup.

I think those could be important clues; hopefully they will trigger an idea
for someone who is familiar with the right parts of the kernel and loader.

My machine:
Sun 3/280
8MB RAM
50MB swap