Subject: Re: DS3100 kernel lockups
To: William O Ferry <WOFerry+@CMU.EDU>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: port-pmax
Date: 03/16/1997 16:58:13
To pass on a reminder from the NetBSD Core Group: this is a NetBSD
list, not an OpenBSD list.  It's not exactly clear that discussion of
how to fix OpenBSD problems is on-topic here.  I don't really care, as
long as the (primarily NetBSD) problems get fixed.

Just please be careful to discuss things that _are_ relevant to
NetBSD.  (NetBSD developers would of course prefer people run, and
work on, NetBSD, and they provide the hardware that runs the mailing
list.) I think the claim that OpenBSD's DECstation device support is
still the same as NetBSD's, and still has all the same bugs, speaks
for itself, but whatever...


>I've been running into lockup problems on my DS3100.  


The SII chip in the DECstation 3100s is well-known to be troublesome.
Sometime last year Keith Bostic sent me e-mail describing a couple of
gotchas with the chip that date to 4.4bsd/pmax days: CSRG, Ralph
Campbell and Rick Macklem never got it quite right.

I've since fixed what look like a couple of order-of-expression bugs;
I believe those are in the NetBSD CVS tree, but aren't in -current.  I
could try putting them in -current and see if they help.  But the chip
(or more accurately, the driver) may be just no good on fast SCSI-2
drives.  I know someone at Berkeley was using a 3100 happily for
several months in 1996, but with an older, slower drive.

The bottom line is I no longer have access to a 3100; and if/when I buy
a DECstation, I'd much rather find room for an ioasic-based machine,
to work on r4000 porting.

But if someone can do remote-diagnosis of a NetBSD kernel on a 3100,
I'll be glad to help in a week or so: I'm in the middle of final exams
right now.  One idea is to add a kernel-breakpoint facility into the
dc driver. (There's already #ifdef'ed out code in there to call
panic() if you hit <DO> on an LK-201).  Michael Hitch's r3000/r4000
development tree has a port of an in-kernel debugger; with both the
above, we could at least get a stack traceback in the afflicted
region.
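
To make the breakpoint idea concrete, here is a rough sketch of the
kind of hook I mean, written as standalone C rather than real driver
code.  Every name in it (HYPOTHETICAL_LK_DO, kbd_set_debug_entry,
kbd_check_break, kbd_break_requests) is made up for illustration, and
the keycode value is only a placeholder: the real <DO> code has to
come from the LK-201 header in the tree.  The keyboard receive path
would call kbd_check_break() on each keycode and, on a match, hand
control to whatever debugger entry point is registered, instead of
calling panic().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Placeholder keycode for the LK-201 <DO> key.  This value is an
 * assumption for the sketch; take the real one from the kernel's
 * LK-201 header.
 */
#define HYPOTHETICAL_LK_DO 0x73

/* Signature of a debugger entry point (hypothetical). */
typedef void (*debug_entry_fn)(const char *why);

/* Registered debugger entry; NULL means no debugger is configured. */
static debug_entry_fn debug_entry = NULL;

/* Number of break requests seen so far, handy when testing the hook. */
static unsigned int kbd_break_requests = 0;

/* Register (or clear, with NULL) the debugger entry point. */
void
kbd_set_debug_entry(debug_entry_fn fn)
{
    debug_entry = fn;
}

/*
 * Called with each keycode received from the keyboard.  Returns true
 * if the keycode was the break key and was consumed here, false if
 * the caller should process it as ordinary input.
 */
bool
kbd_check_break(int keycode)
{
    if (keycode != HYPOTHETICAL_LK_DO)
        return false;

    kbd_break_requests++;
    if (debug_entry != NULL)
        debug_entry("keyboard break");
    return true;
}
```

In a real kernel the registered entry would be the in-kernel
debugger's trap entry, and the check would have to be safe to make at
interrupt level.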

Another thought is to get Jason Thorpe's _splraise() fixes (if they're
done) and see if that makes the problem go away.

Anyone willing to help run such NetBSD kernels and report the results
should contact me, preferably sometime after March 20.