Subject: Extremely Unstable NetBSD 1.0/X11R6
To: None <amiga@NetBSD.ORG>
From: Jason Brittain <spicoli@armory.com>
List: amiga
Date: 04/10/1995 02:31:31

First, I'd like to say a very loud THANKS to everyone involved in the
development and debugging of NetBSD-Amiga and NetBSD in general!  The OS 
is nothing short of incredible IMHO.  My problem is that NetBSD is
*extremely* unstable on my machine, and I would like to know if anyone has
any ideas how I could fix it or at least make it crash less often.  It
crashes about once a day even when I don't touch it myself (users still
call in and use it), and several times a day when I do use it.  I've been
reading this mailing list for several months, and finally decided my
situation was bad enough to post about here.

I run NetBSD on the following hardware:

              Amiga 3000 68030 25Mhz 68882 25Mhz
              RAM: 2M CHIP, 8M FAST
              Micropolis 2.1 Gigabyte SCSI hard drive
              MultiFaceCard ]I[ dual serial/single parallel card
              A pair of USR Sportster v.34 28.8k baud modems

I run NetBSD 1.0 GENERIC with a kernel that I compiled.  I added in support
for my MultiFaceCard ]I[.  I got the driver out of -current, and with the
help of Michael Hitch and Michael Van Elst it finally works almost perfectly.
(THANKS guys!)  MultiFace serial port 0 is configured as a dialin line for
local users to call into and log in, MultiFace serial port 1 is set up as
my SLIP line to the net, and my built-in Amiga serial is set up to run PPP
to my roommate's Linux box (486DX33) via a 50 foot null-modem cable (yes, it
works).  Local users call on MFC port 0 and can telnet out on MFC port 1.
If anyone would like a copy of this NetBSD 1.0 GENERIC + MultiFaceCard kernel,
just send me mail!

Here is what it says when I boot NetBSD from ADOS with "loadbsd -a netbsd":

NetBSD 1.0 (FT2) #: Wed Mar 29 03:39:26 MST 1995
    root@FT2:/usr/src/sys/arch/amiga/compile/FT2
Amiga 3000 (m68030 CPU/MMU m68882 FPU)
real  mem = 8388608 (1024 pages)
avail mem = 6733824 (822 pages)
using 64 buffers containing 524288 bytes of memory
memory segment 0 at 07800000 size 00800000
memory segment 1 at 00000000 size 00200000
mainbus0 (root)
clock0 at mainbus0: system hz 100 hardware hz 715909
ser0 at mainbus0: input fifo 512 output fifo 32
par0 at mainbus0
kbd0 at mainbus0
grfcc0 at mainbus0
grf0 at grfcc0: width 640 height 400 colors 4
ite0 at grf0: rows 50 cols 79 repeat at (30/100)s next at (10/100)s has keyboard
fdc0 at mainbus0: dmabuf pa 0x1e3030
fd0 at fdc0: 3.5dd 80 cyl, 2 head, 11 sec [9 sec], 512 bytes/sec
ztwobus0 at mainbus0
mfc0 at ztwobus0 rom 0xe90000 man/pro 2092/18
mfcs0 at mfc0: input fifo 1024 output fifo 128
mfcs1 at mfc0: input fifo 1024 output fifo 128
ahsc0 at mainbus0
scsibus1 at ahsc0
ahsc0 targ 0 lun 0: <MICROP  1924-21MZ1077811HZ2P> SCSI1 direct fixed
sd0 at scsibus1: 2001MB, 2280 cyl, 21 head, 85 sec, 512 bytes/sec
zthreebus0 at mainbus0
2 mice configured
10 views configured
WARNING: bad date in battery clock
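
The memory figures in the boot messages above can be cross-checked with a
little arithmetic.  This is only a minimal sketch in Python: the variable
names are made up for illustration (they are not kernel symbols), and it
assumes the page count printed in parentheses covers exactly the byte count
printed before it.

```python
# Figures copied from the boot messages above.
real_bytes, real_pages = 8388608, 1024
avail_bytes, avail_pages = 6733824, 822
buf_count, buf_bytes = 64, 524288

# "memory segment 0 ... size 00800000" is printed in hex.
segment0_bytes = int("00800000", 16)

# Page size implied by each line (bytes per page).
page_size_real = real_bytes // real_pages
page_size_avail = avail_bytes // avail_pages

# Size of each buffer in the "using 64 buffers" line.
per_buffer = buf_bytes // buf_count

# Memory counted as "real" but not reported as available.
reserved_bytes = real_bytes - avail_bytes
reserved_pages = real_pages - avail_pages

print("page size:", page_size_real, page_size_avail)
print("bytes per buffer:", per_buffer)
print("not available:", reserved_bytes, "bytes,", reserved_pages, "pages")
print("real mem equals segment 0:", segment0_bytes == real_bytes)
```

Under those assumptions both memory lines imply the same page size, and the
"real mem" figure equals the size of memory segment 0 (the 8M of FAST RAM),
so the 2M CHIP segment does not appear to be counted in it.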

It usually boots fine, although I often need to set the clock with the
"date" command.  After it boots and I log in on the console, I usually want
to run X11R6, so I type "startx".  (I use Xdaniver1.01 now, but it did
exactly the same things before, when I used XamigaMono and the distributed
1.0 GENERIC kernel.)  About 10-15 seconds after it boots and I log in there
is some hard drive activity (the swapper?  I don't know what it is).  If I
enter "startx" without waiting for that activity first, the system locks up
and crashes.  Almost every time it crashes, this is what happens:

It is accessing the hard drive, and it locks up completely with the hard
drive light stuck on.  Sometimes one of my xterms still seems to respond to
input, but I can't execute commands since that requires access to the hard
drive.  If I try, it freezes completely.  Sometimes the modem lights also
show activity on my serial ports, but that ends about 10 seconds later when
the serial ports finally lock up too.

Since my site runs on a SLIP line, and I'm at work 10 hours a day (so I
can't stay home to make sure the SLIP stays online), I wrote several scripts
and three binaries so that my system can tell when the SLIP goes down
(killed at the provider's end, for instance).  When that happens, it
notifies the user on the dialin line, loads up Seyon, and puts my SLIP
connection back up automatically.  Seyon is the ONLY program I've seen so
far that can access and dial out on my MultiFaceCard ports; cu gives me
some i/o error.  Since Seyon is the only one I can use, I also have to be
running X all the time for my autoSLIP stuff to work correctly, and X could
be why it crashes so often.  Or maybe it's just the heavy load that X puts
on my system?
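
The autoSLIP behaviour described above (detect that the link is down, tell
the dialin user, redial) can be sketched roughly as follows.  This is NOT
the actual set of scripts and binaries, which are not shown here; it is a
minimal Python illustration, and every name in it (check_link_once,
watch_link, is_up, notify, redial) is hypothetical.  The three callables
are assumed to be supplied by the caller.

```python
import time
from typing import Callable, List


def check_link_once(is_up: Callable[[], bool],
                    notify: Callable[[str], None],
                    redial: Callable[[], bool]) -> str:
    """Run one watchdog pass and report what happened."""
    if is_up():
        return "up"
    # Link is down: tell the user first, then try to bring it back.
    notify("SLIP link is down, redialing")
    if redial():
        notify("SLIP link is back up")
        return "restored"
    return "still-down"


def watch_link(is_up: Callable[[], bool],
               notify: Callable[[str], None],
               redial: Callable[[], bool],
               passes: int,
               delay_s: float = 60.0,
               sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Repeat the check a fixed number of times, pausing between passes."""
    results = []
    for i in range(passes):
        results.append(check_link_once(is_up, notify, redial))
        if i < passes - 1:
            sleep(delay_s)
    return results
```

Here is_up would wrap whatever test tells you the link is alive, notify
would write to the dialin user's terminal, and redial would drive the
dialer.  Passing the pieces in as functions keeps the loop itself free of
any dependency on X or on a particular dialer.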

Today I was running X with two xterms open (and one user logged in and
ftping).  Both xterms locked up, but the window manager menu still worked,
so I exited X.  When I got back to text mode I got this for the first time:

vm_fault(e6000, 1c9a000, 3, 0) -> 1
  type 8, code [mmu,,ssw]: 401070d
trap type 8, code = 40107d, v = 1c9a000
pid = 59, pc = 000A8192, ps = 2300, sfc = 0001, dfc = 0001
Registers:
             0        1        2        3        4        5        6        7
dreg: 07F70002 07f70000 1008C000 07F70000 00002004 00000002 07C32000 00000001
areg: 000FD200 000BB7C8 000F2128 00545384 01C9A000 000BB7C8 FFFFFE1C 0DFFFB98

Kernel stack (FFFFFCE0):
<several lines of hex dump here..>
<way too much to write down on paper and re type in!>

panic: MMU fault
stopped at     0x81434:     unlk  a6
db>

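One simple cross-check on a dump like the one above is to see whether any
register holds the faulting address (the "v =" value on the trap line).
The following is a minimal Python sketch; the function names are made up
for illustration, and it assumes the eight columns of each register line
are registers 0 through 7 in order, as the header row suggests.  It only
matches values and says nothing about why the address was bad.

```python
import re

# Register lines copied from the trap dump above.
DUMP = """\
dreg: 07F70002 07f70000 1008C000 07F70000 00002004 00000002 07C32000 00000001
areg: 000FD200 000BB7C8 000F2128 00545384 01C9A000 000BB7C8 FFFFFE1C 0DFFFB98
"""

FAULT_ADDR = 0x1C9A000  # the "v =" value from the trap line


def parse_registers(dump):
    """Map names like 'd0' or 'a4' to the integer values in the dump."""
    regs = {}
    for line in dump.splitlines():
        m = re.match(r"\s*([da])reg:\s*(.*)", line)
        if not m:
            continue
        kind, values = m.group(1), m.group(2).split()
        for i, text in enumerate(values):
            regs[f"{kind}{i}"] = int(text, 16)
    return regs


def registers_holding(regs, value):
    """Return the sorted names of registers whose value equals `value`."""
    return sorted(name for name, v in regs.items() if v == value)


regs = parse_registers(DUMP)
print(registers_holding(regs, FAULT_ADDR))  # prints ['a4']
```

In this dump only a4 matches the fault address, which is also the second
argument on the vm_fault line.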

An MMU fault.  I was told by others that my problem might be my SCSI DMA,
and that I should try turning it off using binpatch.  I did, but the system
behaved exactly the same: drive access didn't appear to be any slower, and
it didn't crash any less often.

So, any ideas?  Anything I'm doing horribly wrong?  Anything I can do other
than switching to -current and getting even more problems?  (I would, 
wouldn't I?) 

Any and all help is GREATLY appreciated!

.---------------------------------------------------------------------------.
| Spicoli (root@ft2)              |AMIGA3000/25Mhz/10M RAM/Two USRv.34 28.8s|
| Fast Times ][ UNIX SysAdmin     |AMIGA1200/50Mhz/10M RAM/One USRv.34 28.8 |
|                                 |AMIGA500/7.14Mhz/1M RAM                  |
| Spicoli@deeptht.armory.com      |C=PET/1Mhz/32k RAM                       |
`---------------------------------------------------------------------------'