Subject: DS3100 ethernet spl problem fixed
To: None <port-pmax@NetBSD.ORG>
From: Arne H. Juul <arnej@pvv.unit.no>
List: port-pmax
Date: 12/31/1995 01:45:14
   Well, I've been studying the ipl levels and interrupt structures
on the original pmax (2100/3100) machine now, and I think I understand
what's going on.  First, the executive summary:  The main problem for me
is in machdep.c, probably a typo or paste-o.  Here's a fix:

--- machdep.c	Thu Dec 28 13:23:50 1995
+++ machdep.c.ahj	Sun Dec 31 01:19:43 1995
@@ -402,7 +402,7 @@
 		Mach_splbio = Mach_spl0;
 		Mach_splnet = Mach_spl1;
 		Mach_spltty = Mach_spl2;
-		Mach_splimp = Mach_spl2;
+		Mach_splimp = Mach_spl1;
 		Mach_splclock = Mach_spl3;
 		Mach_splstatclock = Mach_spl3;
 		Mach_clock_addr = (volatile struct chiptime *)


However, this fix isn't really complete, because of the slip and ppp
possibilities.  splimp should block 'everything that could change any
network structures', if I've understood things right.  Charles said:

[...]
> splimp() is supposed to block all of the things spltty(),
> splnet(), splbio(), and splsoftclock() would block.
[...]
> Since splimp() also needs to include splbio() and spltty() (for ccd,
> if_slip and if_ppp), running network drivers at splimp() is annoying,
> as it increases the latency for some higher priority interrupts.
[...]

Obviously splimp() must block network interrupts, and because of slip/ppp
also tty (serial) interrupts.  Blocking soft interrupts (softclock and
softnet) makes sense to me too - those are after all soft (low-pri).  However,
I don't understand why splimp() needs to block 'disk-type' (bio) interrupts
as well - "for ccd"?  I've looked a bit at the ccd driver and couldn't
find anything there.  Charles, could you please explain this a bit further?

[I really like your idea of splitting splimp into splimp and new splnet,
 btw.  Makes a lot of sense now that I understand things a bit more...]

If we really need to block tty, net, bio and softclock there's not much
left (hardclock of course).

Anyway, for kn01 (==3100) we need to make a new splimp() variant that
does either spl1+2 (net+tty) or spl0+1+2 (bio+net+tty), to get it really
right.  Since I don't use slip or ppp or ccd, I've put mine at just
spl1 (like the patch above) at the moment and I'm pleased to say that
the machine is now *rock solid*.
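
To make the two splimp() variants concrete, here is a rough sketch of the
idea as a small user-space model.  It is not the real machdep.c or locore
code: the bit values, the names (IRQ_*, irq_enabled, spl_raise, splx_model,
splimp_*) and the assumption that every variant also blocks the soft
interrupts are all made up for illustration.  The only thing taken from the
patch above is the level assignment (bio=0, net=1, tty=2, clock=3).

```c
#include <assert.h>

/* Hypothetical one-bit-per-source model; NOT the real MIPS status bits. */
#define IRQ_SOFT  0x01u  /* softclock and softnet */
#define IRQ_BIO   0x02u  /* level 0 on kn01: disk (bio) */
#define IRQ_NET   0x04u  /* level 1 on kn01: lance ethernet */
#define IRQ_TTY   0x08u  /* level 2 on kn01: serial lines */
#define IRQ_CLOCK 0x10u  /* level 3 on kn01: hardclock */
#define IRQ_ALL   0x1fu

/* A set bit means the source is enabled.  Stands in for the CPU mask. */
static unsigned irq_enabled = IRQ_ALL;

/* Disable the sources in 'block' and return the old mask for splx_model(). */
static unsigned spl_raise(unsigned block)
{
    unsigned old = irq_enabled;

    assert((block & ~IRQ_ALL) == 0);
    irq_enabled &= ~block;
    return old;
}

/* Restore a mask previously returned by spl_raise(). */
static void splx_model(unsigned old)
{
    irq_enabled = old;
}

/* What the patch above gives: net only (assumed to include soft). */
static unsigned splimp_net_only(void)
{
    return spl_raise(IRQ_SOFT | IRQ_NET);
}

/* "spl1+2": also safe for slip/ppp, which run off tty interrupts. */
static unsigned splimp_net_tty(void)
{
    return spl_raise(IRQ_SOFT | IRQ_NET | IRQ_TTY);
}

/* "spl0+1+2": additionally blocks bio; only hardclock is left. */
static unsigned splimp_bio_net_tty(void)
{
    return spl_raise(IRQ_SOFT | IRQ_BIO | IRQ_NET | IRQ_TTY);
}
```

The point of the model is just that the levels on this machine are separate
mask bits rather than a strict ordering, so a combined splimp() has to clear
several bits at once instead of simply picking the "highest" existing level.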

A related issue:  Currently, while network (lance) interrupts are being
serviced on the 3100, all other interrupts except hardclock are disabled.
I've experimented with processing and re-enabling the scsi interrupts first,
and this seems to have a positive effect on transfer times (for ftp and
suchlike that actually go to disk).  Does this sound like a generally good idea?
Should we also process serial interrupts first and re-enable them
as well, to get a better chance of getting decent serial throughput?
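
Roughly, the experiment looks like the toy model below.  Again this is only
a sketch and not the actual interrupt code: the names (IRQ_*, irq_pending,
dispatch, mask_during_lance and so on) are invented, and the "handlers" just
clear a pending bit and count.

```c
#include <assert.h>

/* Hypothetical source bits for the model; not real hardware values. */
#define IRQ_SCSI  0x1u
#define IRQ_LANCE 0x2u
#define IRQ_TTY   0x4u
#define IRQ_CLOCK 0x8u
#define IRQ_ALL   0xfu

static unsigned irq_enabled = IRQ_ALL;  /* set bit = source enabled */
static unsigned irq_pending;            /* set bit = source is asserting */
static unsigned mask_during_lance;      /* mask seen inside lance_intr() */
static int scsi_serviced;
static int lance_serviced;

/* Stand-in for the SCSI handler: acknowledge and count. */
static void scsi_intr(void)
{
    irq_pending &= ~IRQ_SCSI;
    scsi_serviced++;
}

/* Stand-in for the lance handler: record which sources could nest. */
static void lance_intr(void)
{
    mask_during_lance = irq_enabled;
    irq_pending &= ~IRQ_LANCE;
    lance_serviced++;
}

/*
 * Model of the dispatch path.  With scsi_first == 0 only hardclock stays
 * enabled during lance processing (the current behaviour as described
 * above).  With scsi_first != 0 a pending SCSI interrupt is handled first
 * and SCSI is re-enabled before the lance handler runs.
 */
static void dispatch(int scsi_first)
{
    unsigned saved = irq_enabled;

    irq_enabled = IRQ_CLOCK;
    if (scsi_first) {
        if (irq_pending & IRQ_SCSI)
            scsi_intr();
        irq_enabled |= IRQ_SCSI;
    }
    if (irq_pending & IRQ_LANCE)
        lance_intr();
    irq_enabled = saved;
}
```

In the scsi_first == 0 case the model simply leaves a pending SCSI interrupt
pending when dispatch() returns; it would only be taken once the lance
handler has finished, which is the extra latency I am trying to avoid.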

[If anyone wants me to, I can give a more detailed picture of the 3100
 interrupt structure as I currently understand it, but remember that the
 other machines of the family (5000/* and so on) are quite different.]


  -  Arne H. J.