Subject: INSTOTHER crashes (low-end 386 box)
To: None <port-i386@NetBSD.ORG>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-i386
Date: 03/15/1997 20:58:39

Among the machines I have at home is a small 386 box.  Recently, I
tried to install NetBSD 1.2 on it.

Everything went smoothly except for the Ethernet card.  It is an ed,
but it has only three port/iomem/irq settings, selected by a jumper on
the card.  One of them is 280/d0000/3, the second is 300/cc000/5, and
the third is marked simply "SOFT".  The kernel, at least the one on the
kcoth floppy, has ed0 at 280/d0000/9 and ed2 at 300/cc000/10.  The
first two jumper settings provoke boot-time messages complaining about
the irq difference; the SOFT setting seems to act identically to one of
the other two, I forget which.

I wasn't about to use a zillion floppies to install, so I extracted the
kernel from the kcoth floppy's filesystem on another machine, searched
for the byte sequence 80 02 00 00 00 00 00 00 00 00 0d 00 00 00 00 00
09 00 00 00 ff ff ff ff (i.e., the locator data for ed0), and noted the
offset in the kernel at which it occurred (there was only one
occurrence).  Then I used dd to pick off the pieces before and after
the 09 byte, typed in a 03 byte (by typing ^V^C to cat), and
reassembled the kernel by catting the pieces together.  This made the
kernel recognize the network card on boot.  With this, I could ftp
over the binary sets and install.  Here is a copy of /kern/msgbuf I
saved (leading NULs removed):

NetBSD 1.2 (INSTOTHER) #0: Sun Sep 15 16:49:38 PDT 1996
    perry@jekyll.piermont.com:/usr/src/sys/arch/i386/compile/INSTOTHER
CPU: i386DX (386-class CPU)
real mem  = 3801088
avail mem = 1978368
using 72 buffers containing 294912 bytes of memory
mainbus0 (root)
isa0 at mainbus0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns8250 or ns16450, no fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns8250 or ns16450, no fifo
lpt0 at isa0 port 0x378-0x37f irq 7
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 drive 0: 85MB, 1024 cyl, 10 head, 17 sec, 512 bytes/sec <ST1102AT>
wd0: using 1-sector 16-bit pio transfers, chs addressing
ed0 at isa0 port 0x280-0x29f iomem 0xd0000-0xd1fff irq 3
ed0: address 00:00:c0:00:a3:30, type WD8003EP (8-bit) aui
pc0 at isa0 port 0x60-0x6f irq 1: mono
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 4040 netmask 4048 ttymask 40da
changing root device to wd0a
<5>lpt0: out of paper

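The kernel patch described before the boot log (find the ed0 locator
bytes, confirm they occur exactly once, swap the 09 irq byte for 03)
can be sketched in a few lines.  This is a minimal illustration, not
the dd/cat procedure actually used: the function names and the file
names on the command line are hypothetical, and the field labels in the
comments (port, iomem, irq, drq) are an assumption inferred from the
quoted values, not taken from the kernel source.

```python
import struct
import sys

# The 24 bytes quoted in the post, read as six little-endian 32-bit words.
ED0_LOCATOR = bytes.fromhex(
    "80020000" "00000000" "00000d00" "00000000" "09000000" "ffffffff"
)
IRQ_OFFSET = 16  # position of the 09 byte inside the pattern


def decode_locator(raw):
    """Unpack 24 locator bytes as six signed little-endian ints.

    Assumed meaning: (port, size, iomem, iomem size, irq, drq).
    """
    return struct.unpack("<6i", raw)


def patch_irq(kernel, pattern=ED0_LOCATOR, new_irq=3):
    """Return a copy of `kernel` with the irq byte of `pattern` replaced."""
    first = kernel.find(pattern)
    if first < 0:
        raise ValueError("locator pattern not found")
    if kernel.find(pattern, first + 1) >= 0:
        raise ValueError("pattern occurs more than once; refusing to patch")
    at = first + IRQ_OFFSET
    return kernel[:at] + bytes([new_irq]) + kernel[at + 1:]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python patch.py netbsd.orig netbsd.patched
    with open(sys.argv[1], "rb") as src:
        image = src.read()
    with open(sys.argv[2], "wb") as dst:
        dst.write(patch_irq(image))
```

Refusing to patch when the pattern appears more than once mirrors the
"there was only one occurrence" check above; without it, the wrong
table entry could be modified silently.
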
Then, I wanted to build a more appropriate kernel than INSTOTHER.
(INSTOTHER is stripped, which means a number of tools won't work; also,
I have a large number of private patches which I want to use.)  And
because of severe disk space shortage (look at the wd0 line, note the
size), I had to do this over NFS.  So I set up NFS on another of my
home machines (the Sun-3/260 NetBSD/sun3 machine that's the main home
workhorse machine) and NFS-mounted some disk from there on /usr/src on
the 386 box.

Then I started building a kernel.  It was taking hours per file, so I
used "make -n" to generate a script and edited the script to change -O2
to -O, which seemed to help.  The main problem is that every several
hours it panics.  (This was happening while I was still using make and
-O2, but I switched to using a shell script pretty early, since I
suspected make of being part of the problem.  Besides, it takes make a
while to decide what it needs to rebuild, probably largely because it
thrashes badly.)  Finally it happened once while I was sitting next to
it, and I got a ten-finger copy of the errors (there's another
machine's console physically close):

fatal page fault in supervisor mode
trap type 6 code f93b0000 eip f8194327 cs f86d0008 eflags 10216 cr2 f82cb000 cpl e0004040
panic: trap
syncing disks... fatal page fault in supervisor mode
trap type 6 code f8cd0000 eip f811fda7 cs f8110008 eflags 10282 cr2 18 cpl 0
panic: trap

dumping to dev 1, offset 8808
dump 4 3 2 1 succeeded
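
As an aside on the build workaround mentioned before the panic output,
the "make -n, then change -O2 to -O" step can be sketched as follows.
This is only an illustration under assumptions: the function names are
hypothetical, and the original edit was presumably done by hand in an
editor rather than with a program.

```python
import re
import subprocess

# Match -O2 only as a whole flag, so strings such as "-O20" or
# "-DOPT=-O2" are left alone.
_O2_FLAG = re.compile(r"(?<!\S)-O2(?!\S)")


def dry_run_commands(directory="."):
    """Return the commands `make -n` would run in `directory`, as text."""
    result = subprocess.run(
        ["make", "-n"], cwd=directory, check=True,
        capture_output=True, text=True,
    )
    return result.stdout


def lower_optimization(script_text):
    """Rewrite every standalone -O2 flag in the script to -O."""
    return _O2_FLAG.sub("-O", script_text)
```

Feeding the output of dry_run_commands() through lower_optimization()
and saving it gives a shell script that no longer needs make to decide
what to rebuild on each run.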

Of course, savecore is useless because the kernel is stripped :-(, so I
can't do much debugging until I have a real kernel.  And because of the
Ethernet braindamage, it has to be a custom kernel.  So I can't debug
the crashes during kernel build until the kernel build finishes. :-(
It doesn't always panic at the same time; it has lasted as much as 12
hours and as little as 2.

Also, sometimes it just hangs, instead of crashing.  And as if that
weren't enough, the compiler coredumps; the log contains complaints
about cc1 dying on fatal signal, and there are cc1.core files lying
around.  It's not a compiler bug, though; upon rebooting and retrying
the compile of the failed file, it compiles just fine.

Now, I know the machine is short of RAM.  And I know that both com1 and
ed0 are on the same irq.  But AFAIK neither of those should cause this
sort of misbehavior.  Am I wrong?  Should I have set the jumper
differently and patched the locator for ed2 instead?  (Note that
nothing is connected to either of the com ports; the only connections
to the box are keyboard, video, power, and the Ethernet AUI cable.)
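
The irq overlap can be read mechanically out of the autoconfiguration
messages.  The sketch below is illustrative only: the helper names are
hypothetical, the sample text is copied from the boot log above, and
treating each set bit of biomask/netmask/ttymask as an irq number is an
assumption about how those masks are encoded.  It only confirms the
overlap visible in the log; it says nothing about whether the overlap
causes the panics.

```python
import re
from collections import defaultdict

# Lines copied from the /kern/msgbuf output earlier in this message.
AUTOCONF = """\
com0 at isa0 port 0x3f8-0x3ff irq 4: ns8250 or ns16450, no fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns8250 or ns16450, no fifo
lpt0 at isa0 port 0x378-0x37f irq 7
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
ed0 at isa0 port 0x280-0x29f iomem 0xd0000-0xd1fff irq 3
pc0 at isa0 port 0x60-0x6f irq 1: mono
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
"""

# "<device> at <parent> ... irq <n>" on a single line.
_LINE = re.compile(r"^(\w+) at \w+\b.*?\birq (\d+)", re.MULTILINE)


def devices_by_irq(text):
    """Map each irq number to the devices that claim it, in log order."""
    table = defaultdict(list)
    for name, irq in _LINE.findall(text):
        table[int(irq)].append(name)
    return dict(table)


def shared_irqs(text):
    """Return only the irqs claimed by more than one device."""
    return {irq: devs for irq, devs in devices_by_irq(text).items()
            if len(devs) > 1}


def mask_bits(mask):
    """List the bit positions set in an interrupt mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(shared_irqs(AUTOCONF))   # {3: ['com1', 'ed0']}
print(mask_bits(0x4048))       # netmask from the log: [3, 6, 14]
```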

In any case, anyone have any idea what's going wrong, and/or what I
might be able to do about it?  ("Get a real machine", while perhaps a
reasonable response, will not be taken very well. :-)

					der Mouse

			       mouse@rodents.montreal.qc.ca
		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B