Subject: NetBSD/sparc diskless problems
From: der Mouse <mouse@Lightning.McRCIM.McGill.EDU>
List: current-users
Date: 08/10/1994 15:01:27

I've got a SPARC here running NetBSD.  The hope is to run NetBSD on our
diskless SPARCs, so I've been working with the diskless support.  And
of course I have a problem.  (Details of code currency and machine
environment at the end of this note.)

Essentially, it hangs at a random time.  Sometimes this is as soon as a
couple of lines of output after the fsck finishes (there's disk on it,
and I have it set to mount some of it even when it boots diskless);
once it got far enough to print a login prompt on the console before
going catatonic.  Usually it's somewhere in between - but not at a
repeatable place.

I've tried investigating some, but learning enough about the diskless
support to direct the investigations intelligently would take quite a
lot of time, hence this note.  I have ddb in the kernel, but it does
not appear to know about the kernel's symbol table, which makes stack
traces less than wonderful.  (On looking at the ddb code, it appears to
know how to deal with symbol tables, but for whatever reason it isn't
recognizing or printing any symbol names.)  The hang is fairly soft;
L1-A breaks to ddb (or the ROM monitor, before I started building ddb
into the kernel).

I used another Sun on that same Ethernet segment to watch the network
traffic, and nothing sticks out as significant.  The boot server (a
4.1.2 SunOS machine) keeps asking the NetBSD machine for its mount RPC
port, which the NetBSD machine keeps returning failures for, but that
starts well before the point of the hang.

Would anyone care to point me at a useful place to start investigating?
My inclination at this point is to either look at ddb's symbol table
code in more detail (and perhaps add a way for user-land to spoon-feed
it the kernel symbol table - an LKM containing nothing but a symbol
table maybe), or else resort to writing down the stack trace in hex and
working out symbol offsets myself, and hope the stack trace points me
in a useful direction.
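
The by-hand fallback (turning hex addresses from a stack trace into
symbol+offset) could be mechanized roughly as below.  This is only a
sketch: it assumes three-column "address type name" lines of the kind
`nm -n` prints for a kernel image, and the helper names parse_nm and
resolve are made up for illustration, as are the addresses and symbol
names in the usage note.

```python
import bisect


def parse_nm(lines):
    """Parse nm-style lines ("f8004000 T _start") into a sorted
    list of (address, name) pairs."""
    syms = []
    for line in lines:
        parts = line.split()
        # Undefined symbols have no address column; skip them.
        if len(parts) != 3:
            continue
        addr_text, _type, name = parts
        try:
            syms.append((int(addr_text, 16), name))
        except ValueError:
            # First column was not a hex address; ignore the line.
            continue
    syms.sort()
    return syms


def resolve(syms, addr):
    """Return "name+0xoff" for the nearest symbol at or below addr,
    or the bare hex address if addr is below every symbol."""
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return hex(addr)
    base, name = syms[i]
    return "%s+0x%x" % (name, addr - base)
```

With a made-up table such as ["f8004000 T _start", "f8004100 T _main"],
resolve(parse_nm(table), 0xf8004124) gives "_main+0x24".  Feeding it the
real symbol list and the addresses copied from the ddb trace would be
the obvious use, though whether the result points anywhere useful is
another matter.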

The machine is a SPARCstation IPC.  The kernel is built from sources
up-to-date as of a day or two ago; user-land is from the SPARC binary
tars of August 1st.  The diskless root area was created by cloning the
diskful root+usr tree and tweaking things like /etc/fstab.  Machine
configuration as indicated by boot messages (from a diskful boot;
diskless boot is identical except for the boot command used):

note: lost 33 pages in translation
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.

NetBSD 1.0_BETA (CALLISTO) #0: Mon Aug  8 13:30:18 EDT 1994
real mem = 11993088
avail mem = 10194944
using 146 buffers containing 598016 bytes of memory
mainbus0 (root)
cpu0 at mainbus0: SUNW,Sun 4/40 (MB86900/1A or L64801 @ 25 MHz, WTL3170/2 FPU)
cpu0: 65536 byte write-through, 16 bytes/line, sw flush cache enabled
memreg0 at mainbus0 ioaddr 0xf4000000
clock0 at mainbus0 ioaddr 0xf2000000: mk48t02 (eeprom)
timer0 at mainbus0 ioaddr 0xf3000000
zs0 at mainbus0 ioaddr 0xf1000000 pri 12, softpri 6
zs1 at mainbus0 ioaddr 0xf0000000 pri 12, softpri 6
audio0 at mainbus0 ioaddr 0xf7201000 pri 13, softpri 4
auxreg0 at mainbus0 ioaddr 0xf7400003
sbus0 at mainbus0 ioaddr 0xf8000000: clock = 25 MHz
dma0 at sbus0 slot 0 offset 0x400000: rev 1
esp0 at sbus0 slot 0 offset 0x800000 pri 3: ESP100A, clock = 25 MHz, ID = 7
tg0 at esp0 target 3
sd0 at tg0 unit 0: SEAGATE ST41600N 0030, 2676846 512 byte blocks
sd0: <SEAGATE ST41600N cyl 1966 alt 2 hd 17 sec 80>
le0 at sbus0 slot 0 offset 0xc00000 pri 5: hardware address 08:00:20:10:44:eb
bwtwo0 at sbus0 slot 3 offset 0x0: SUNW,501-1561, 1152 x 900 (console)
fd at mainbus0 ioaddr 0xf7200000 not configured
Found boot device sd0

More details gladly supplied on request.

					der Mouse