Subject: Re: Interrupt, interrupt threads, continuations, and kernel lwps
To: Bill Studenmund <wrstuden@netbsd.org>
From: Andrew Doran <ad@netbsd.org>
List: tech-kern
Date: 02/21/2007 21:12:19

Hi Bill,

On Wed, Feb 21, 2007 at 09:50:40AM -0800, Bill Studenmund wrote:

> There are two problems. Vax has both of them, and m68k has at least one of 
> them.
>
> One problem is that some systems, like vax, are modal. There's a 
> difference running something in interrupt handling context and not. Matt 
> noted that the vax has separate interrupt stacks. So interrupt code is 
> more than just code running quickly (low latency) at high priority.

From what I understand, with the vax it's an implementation detail that can
be overcome, albeit a troublesome one. I'm sure there is a way to signal an
external agent to handle the interrupt proper. My understanding is that we
can continue to control the interrupt priority level while that is
happening. Well, we must be able to on some level, otherwise the spl
interfaces wouldn't do anything.

> The other problem, which I know mac68k has too, is that you have to make 
> the hardware shut up as part of the interrupt handling. Otherwise once you 
> exit the interrupt, you'll just reenter it.

Ok, well we discussed this briefly offline, and my understanding is that the
problem is some m68k systems don't have a pic that we can use to control
interrupt sources. That's not a problem; as long as there is some limited
control over the interrupt priority level on the CPU itself, then we have a
way to mask interrupts temporarily.
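
As a rough illustration of what "limited control over the interrupt priority
level" buys us, here is a minimal user-space model. It is only a sketch: the
names sim_splraise(), sim_splx() and sim_interrupt() are hypothetical, and the
pending bitmask stands in for whatever the hardware does to hold off a masked
interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a CPU-level priority mask; not NetBSD code. */
#define NLEVELS 8

static int cur_ipl;            /* current CPU level, 0 = nothing masked */
static unsigned pending;       /* bit n set: a level-n interrupt is deferred */
static int handled[NLEVELS];   /* how many times each level's handler ran */

/* Run the handler for one level, at that level, then restore. */
static void run_handler(int level)
{
	int saved = cur_ipl;

	cur_ipl = level;
	handled[level]++;
	cur_ipl = saved;
}

/* Raise the level (never lowers it). Returns the old level for sim_splx(). */
static int sim_splraise(int level)
{
	int old = cur_ipl;

	if (level > cur_ipl)
		cur_ipl = level;
	return old;
}

/* Restore a saved level and run whatever was deferred above it. */
static void sim_splx(int level)
{
	cur_ipl = level;
	for (int l = NLEVELS - 1; l > cur_ipl; l--) {
		if (pending & (1u << l)) {
			pending &= ~(1u << l);
			run_handler(l);
		}
	}
}

/* The device asserts an interrupt: deliver it now, or defer it if masked. */
static bool sim_interrupt(int level)
{
	assert(level > 0 && level < NLEVELS);
	if (level <= cur_ipl) {
		pending |= 1u << level;   /* masked: remember it for later */
		return false;
	}
	run_handler(level);
	return true;
}
```

With the level raised to 5, a level 3 interrupt is held pending and only runs
once sim_splx() drops the level again, while a level 6 interrupt is delivered
immediately.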

> So you have to have interrupts remain disabled until this interrupt
> handling thread completes. That's not what is in my mind as a result of
> this discussion so far.

Partly as an implementation convenience, and to avoid priority inversion,
the interrupt priority level must remain high. (I described this briefly
back in December on tech-kern@.) It is not something that should present a
problem. In order to complete servicing the interrupt when the handler
blocks, we need to will the ISR's scheduling priority to the LWP that is
blocking it, and hand control of the CPU to that LWP in short order. Care
needs to be taken to ensure that does happen quickly, but it is not a huge
task.
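
To make the "willing" of priority concrete, here is a minimal sketch. The
types and function names are hypothetical and much simpler than anything in
the tree: one waiter per lock, no scheduler, and the assumption that a larger
number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the NetBSD structures. */
struct lwp {
	int base_pri;   /* priority the LWP normally runs at */
	int eff_pri;    /* effective priority, possibly lent by a waiter */
};

struct lock {
	struct lwp *owner;    /* NULL when the lock is free */
	struct lwp *waiter;   /* a single waiter, for brevity */
};

/* Returns 1 if the lock was acquired, 0 if the caller has to block. */
static int lock_try_enter(struct lock *lk, struct lwp *self)
{
	if (lk->owner == NULL) {
		lk->owner = self;
		return 1;
	}
	/* Will our priority to the owner so that it gets the CPU soon. */
	lk->waiter = self;
	if (self->eff_pri > lk->owner->eff_pri)
		lk->owner->eff_pri = self->eff_pri;
	return 0;
}

/* Release: give back any lent priority and pass the lock to the waiter. */
static void lock_exit(struct lock *lk)
{
	struct lwp *self = lk->owner;

	assert(self != NULL);
	self->eff_pri = self->base_pri;
	lk->owner = lk->waiter;
	lk->waiter = NULL;
}
```

If an interrupt LWP at priority 200 blocks on a lock held by an LWP at
priority 10, the holder runs at 200 until it calls lock_exit(), at which point
it drops back to 10 and the interrupt LWP owns the lock.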

> > > I think that hard interrupts should simply invoke the handler (at the  
> > > appropriate IPL), the handler will disable the cause of the  
> > > interrupt, optionally it may use a SPIN mutex to gain control over a  
> > > shared section, do something to the device, release the mutex, or it  
> > > may just schedule a continuation to run (either via software  
> > > interrupt or via a kernel lwp workqueue, the former can't sleep/ 
> > > block, the latter can).
> > 
> > I gave this a lot of thought too. As a general solution, I really don't like
> > it because it is unnecessarily expensive, both in terms of execution time
> > and (perhaps more importantly) the effort involved in converting all of our
> > drivers to work this way. Conversely, the changes I have to handle
> > interrupts using LWPs add 29 instructions to a typical interrupt chain on
> > x86, to swap stack and curlwp. It works, and it's a solution that can just
> > be "dropped in".
> 
> So how would this work on other architectures?

The same way. While each architecture has its own peculiarities, they should
follow essentially the same pattern. Vax seems pretty unusual in this
respect.
 
> > To reiterate, there are two reasons I want to use LWPs to handle interrupts:
> > significantly cheaper locking primitives on MP systems, and the ability to
> > eliminate the nasty deadlocks associated with interrupts/MP and interrupt
> > priority levels. The intent is *not* to rely heavily on blocking as the main
> > synchronization mechanism between the top and bottom halves. That's why in
> > the near term I want to preserve the SPL system for places where it really
> > does matter. I did a lot of profiling to see where we would need to do this,
> > and the network stack is one place.
> 
> I think a good model would be something like how the z8530tty driver works
> but dusted off. There is a hard interrupt handler that reads the chip. On
> receive, it stuffs characters into a ring buffer and then triggers a soft
> interrupt. Transmit, it stuffs characters into the chip.
> 
> Either way, the hard interrupt handler is small and just does pseudodma. A 
> software interrupt handling routine then comes along and does the heavy 
> lifting.
> 
> I really like the idea of the latter routine being a thread & using 
> mutexes. The former, though, I think should remain a fast little routine.

For PDMA or networking I like that, where you have interrupts occurring at a
very high rate or where there are chokepoints as you funnel work down. As a
general solution I don't like it, and I can't think of any other modern Unix
that works exclusively that way. As I mentioned, it introduces an additional
burden both on the CPU and on developers.
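
For reference, the split described above (a small hard handler doing
pseudo-DMA into a ring buffer, with a soft interrupt doing the heavy lifting)
could be sketched roughly like this. It is not the z8530tty code: the
structure and function names are made up, the soft_pending flag stands in for
triggering a real soft interrupt, and memory barriers are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64   /* power of two, so indices wrap with a mask */

/* Hypothetical receive ring shared by the hard and soft handlers. */
struct rx_ring {
	uint8_t buf[RING_SIZE];
	volatile unsigned head;       /* advanced only by the hard handler */
	volatile unsigned tail;       /* advanced only by the soft handler */
	volatile bool soft_pending;   /* stand-in for a soft interrupt request */
	unsigned overruns;            /* characters dropped because ring was full */
};

/* Hard interrupt side: store one byte read from the chip, request soft work. */
static void hard_rx(struct rx_ring *r, uint8_t ch)
{
	assert(r != NULL);
	if (r->head - r->tail == RING_SIZE) {
		r->overruns++;            /* ring full: drop the character */
		return;
	}
	r->buf[r->head & (RING_SIZE - 1)] = ch;
	r->head++;
	r->soft_pending = true;
}

/* Soft interrupt side: drain up to max bytes into out, return the count. */
static size_t soft_rx(struct rx_ring *r, uint8_t *out, size_t max)
{
	size_t n = 0;

	assert(r != NULL && out != NULL);
	r->soft_pending = false;
	while (r->tail != r->head && n < max) {
		out[n++] = r->buf[r->tail & (RING_SIZE - 1)];
		r->tail++;
	}
	if (r->tail != r->head)
		r->soft_pending = true;   /* more left: ask to be run again */
	return n;
}
```

The hard side never blocks and touches only the ring, so it stays short. The
soft side is where a thread and mutexes could be used instead.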

Andrew