Subject: splclock and setrunqueue clarification
To: None <tech-kern@NetBSD.ORG>
From: R.Gopalakrishnan <gopal@dworkin.wustl.edu>
List: tech-kern
Date: 07/28/1996 12:03:43
Hi
Some clarification of my earlier post about splclock() and setrunqueue()
is in order. First, I'm talking about the i386 (PC) port of NetBSD.
In the PC port, the spl functions do not touch the ICU. Instead they
just set the global integer variable "cpl", a bitmask of the interrupts
to block. On my system "cpl" takes on the following values:
splclock()    - 0xE000 0001
splnet()      - 0xE000 4C40
splimp()      - 0xE000 5CDA
spltty()      - 0xE000 5CDA
splhigh()     - 0xFFFF FFFF

Only the lower two bytes of "cpl" are relevant here, since there are only
16 irq_num levels (2 ICUs, each with 8 levels).
Now see the macro INTR (i386/isa/vector.s), which shows how the "cpl"
value determines which interrupts are taken and which are blocked.
It turns out that splclock() only blocks irq_num 0. All others are
allowed to go through. In contrast, splhigh() blocks all 16 irq_nums.
splnet() blocks irqs 6, 10, 11 and 14. splimp() (and spltty(), which has
the same mask here) blocks irqs 1, 3, 4, 6, 7, 10, 11, 12 and 14.
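
To make the mask arithmetic easy to check, here is a minimal user-space
sketch that decodes a mask value into the list of blocked irqs. It assumes,
as the values above suggest, that bit n of the low 16 bits corresponds to
irq n and that a set bit means "blocked". The names irq_blocked(),
blocked_irqs() and print_blocked() are made up for this illustration; they
are not kernel functions.

```c
#include <stdint.h>
#include <stdio.h>

/* A PC has two cascaded 8-input ICUs, hence 16 IRQ lines. */
#define NUM_IRQS 16

/* Nonzero if bit "irq" is set in the mask, i.e. that IRQ is blocked.
 * Assumption: bit n <-> irq n, set bit = blocked. */
int irq_blocked(uint32_t mask, int irq)
{
    return (int)((mask >> irq) & 1u);
}

/* Store the blocked IRQ numbers in out[] and return how many there are. */
int blocked_irqs(uint32_t mask, int out[NUM_IRQS])
{
    int n = 0;

    for (int irq = 0; irq < NUM_IRQS; irq++) {
        if (irq_blocked(mask, irq))
            out[n++] = irq;
    }
    return n;
}

/* Print a level name followed by the IRQs its mask blocks. */
void print_blocked(const char *name, uint32_t mask)
{
    int irqs[NUM_IRQS];
    int n = blocked_irqs(mask, irqs);

    printf("%-10s blocks irq", name);
    for (int i = 0; i < n; i++)
        printf(" %d", irqs[i]);
    printf("\n");
}
```

For example, print_blocked("splnet()", 0xE0004C40) lists irqs 6, 10, 11
and 14, and the splclock() value 0xE0000001 yields only irq 0.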

Given this, my earlier post makes more sense. In userret(), the call to
setrunqueue() is protected only by splclock(). While setrunqueue() is
modifying the "qs" variable and the "p_back" and "p_forw" fields of
process "p", an interrupt that splclock() leaves unmasked can come in.
Its handler can call wakeup(), which in turn calls setrunqueue(), with
disastrous results. In fact, I actually caught the "bio" interrupt
re-entering setrunqueue() yesterday.
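
Here is a rough user-space sketch of why re-entry is harmful. It is not the
kernel code: struct node stands in for the p_forw/p_back links, the
queue is a plain circular doubly linked list, and the "interrupt" is a
hypothetical callback (intr_hook) fired by hand in the middle of the
insert. All names are invented for the illustration.

```c
#include <stddef.h>

/* Simplified stand-in for the forward/back links of a run queue entry. */
struct node {
    struct node *forw;
    struct node *back;
};

/* Hypothetical hook standing in for an interrupt arriving mid-insert. */
typedef void (*intr_hook)(struct node *head);

/* Entry that the simulated interrupt handler tries to queue. */
struct node late_node;

void queue_init(struct node *head)
{
    head->forw = head->back = head;
}

/* Insert p at the tail. The hook fires after the old tail has been
 * read but before the new tail has been published. */
void insert_tail(struct node *head, struct node *p, intr_hook intr)
{
    p->forw = head;
    p->back = head->back;       /* snapshot of the current tail */
    if (intr != NULL)
        intr(head);             /* simulated interrupt window */
    head->back = p;             /* overwrites whatever the hook queued */
    p->back->forw = p;
}

/* Simulated handler: re-enters the insert, like wakeup() would. */
void nested_insert(struct node *head)
{
    insert_tail(head, &late_node, NULL);
}

/* Count entries reachable by walking the forward links. */
int queue_length(const struct node *head)
{
    int n = 0;

    for (const struct node *q = head->forw; q != head; q = q->forw)
        n++;
    return n;
}
```

If one entry is queued normally and a second is queued with nested_insert
as the hook, three inserts have been made but only two entries are
reachable from the head. The entry queued by the "interrupt" is silently
dropped, which in the kernel would be a runnable process that never runs.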

Solution: I went inside "remrq" and "setrunqueue" and bracketed their
access to the "qs" and proc structures with a cli and an sti. It seemed
to solve the problem *somewhat*. The system now runs much longer
without freezing up, but something else is still giving trouble.
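
The effect of the bracketing can be sketched in user space as follows. This
is only a simulation under stated assumptions: sim_cli() and sim_sti() are
made-up stand-ins for the real instructions, a raised "interrupt" is
modelled as a pointer in "pending" that is delivered as soon as interrupts
are unmasked, and struct sim_proc holds nothing but the queue links.

```c
#include <stddef.h>

/* Simplified stand-in for struct proc: only the run queue links. */
struct sim_proc {
    struct sim_proc *forw;
    struct sim_proc *back;
};

struct sim_proc runq;       /* head of one circular run queue */
struct sim_proc *pending;   /* entry a raised "interrupt" wants to queue */
int masked;                 /* nonzero while simulated interrupts are off */

void link_tail(struct sim_proc *p);

void runq_init(void)
{
    runq.forw = runq.back = &runq;
    pending = NULL;
    masked = 0;
}

/* Deliver the raised interrupt, unless interrupts are masked. */
void sim_deliver(void)
{
    if (masked || pending == NULL)
        return;             /* stays pending until sim_sti() */
    struct sim_proc *p = pending;
    pending = NULL;
    link_tail(p);           /* what wakeup() -> setrunqueue() would do */
}

void sim_cli(void) { masked = 1; }
void sim_sti(void) { masked = 0; sim_deliver(); }

/* Unprotected tail insert with an interrupt window in the middle. */
void link_tail(struct sim_proc *p)
{
    p->forw = &runq;
    p->back = runq.back;
    sim_deliver();          /* a real interrupt could land here */
    runq.back = p;
    p->back->forw = p;
}

/* The same insert bracketed by the simulated cli/sti pair. */
void guarded_link_tail(struct sim_proc *p)
{
    sim_cli();
    link_tail(p);
    sim_sti();
}

/* Count entries reachable by walking the forward links. */
int runq_length(void)
{
    int n = 0;

    for (struct sim_proc *q = runq.forw; q != &runq; q = q->forw)
        n++;
    return n;
}
```

With guarded_link_tail(), an interrupt raised during the insert is held
until the links are consistent, so every queued entry stays reachable.
One caveat with a bare cli/sti pair: the sti turns interrupts back on even
if the caller had them off. Saving the previous level with splhigh() and
restoring it with splx() avoids that, though I have not verified that it
changes the remaining freezes.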

-gopal