Subject: Re: Interrupt, interrupt threads, continuations, and kernel lwps
To: Matt Thomas <matt@3am-software.com>
From: Andrew Doran <ad@netbsd.org>
List: tech-kern
Date: 02/21/2007 10:09:00

Hi Matt,

On Wed, Feb 21, 2007 at 12:08:36AM -0800, Matt Thomas wrote:

> After a great of pondering, I've concluded that interrupt threads are  
> an extremely bad idea.

Fair enough, but why so?

> I think that hard interrupts should simply invoke the handler (at the  
> appropriate IPL), the handler will disable the cause of the  
> interrupt, optionally it may use a SPIN mutex to gain control over a  
> shared section, do something to the device, release the mutex, or it  
> may just schedule a continuation to run (either via software  
> interrupt or via a kernel lwp workqueue, the former can't sleep/ 
> block, the latter can).

I gave this a lot of thought too. As a general solution, I really don't like
it because it is unnecessarily expensive, both in terms of execution time
and (perhaps more importantly) the effort involved in converting all of our
drivers to work this way. Conversely, the changes I have to handle
interrupts using LWPs add 29 instructions to a typical interrupt chain on
x86, to swap stack and curlwp. It works, and it's a solution that can just
be "dropped in".
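
For concreteness, the split model described in the quoted text (the hard
handler masks the interrupt source and defers the real work to a
continuation) can be sketched roughly as below. This is a userland
illustration only, not kernel code: fake_dev, cont_schedule(),
cont_run_all() and the fixed-size queue are all hypothetical names invented
for the example, and a real kernel would use a soft interrupt or workqueue
in place of the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; stands in for real hardware registers. */
struct fake_dev {
    bool intr_enabled;  /* is the interrupt source unmasked? */
    int  pending;       /* work items the "hardware" has queued */
    int  processed;     /* work items the continuation has handled */
};

typedef void (*cont_fn)(void *);

struct continuation {
    cont_fn fn;
    void   *arg;
};

/* Hypothetical deferred-work queue (stand-in for softint/workqueue). */
#define CONTQ_MAX 8
static struct continuation contq[CONTQ_MAX];
static size_t contq_len;

/* Queue a continuation; returns false if the queue is full. */
static bool cont_schedule(cont_fn fn, void *arg)
{
    assert(fn != NULL);
    if (contq_len == CONTQ_MAX)
        return false;
    contq[contq_len].fn = fn;
    contq[contq_len].arg = arg;
    contq_len++;
    return true;
}

/* Bottom half: does the real work, then unmasks the interrupt source. */
static void dev_continuation(void *arg)
{
    struct fake_dev *dev = arg;

    dev->processed += dev->pending;
    dev->pending = 0;
    dev->intr_enabled = true;
}

/* Top half: mask the source and defer everything else. */
static bool dev_hard_intr(struct fake_dev *dev)
{
    dev->intr_enabled = false;
    return cont_schedule(dev_continuation, dev);
}

/* Run all queued continuations; returns how many were run. */
static size_t cont_run_all(void)
{
    size_t n = contq_len;

    for (size_t i = 0; i < n; i++)
        contq[i].fn(contq[i].arg);
    contq_len = 0;
    return n;
}
```

Every driver would need to be restructured into these two halves, which is
the conversion effort referred to above.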

To reiterate, there are two reasons I want to use LWPs to handle interrupts:
significantly cheaper locking primitives on MP systems, and the ability to
eliminate the nasty deadlocks associated with interrupts/MP and interrupt
priority levels. The intent is *not* to rely heavily on blocking as the main
synchronization mechanism between the top and bottom halves. That's why in
the near term I want to preserve the SPL system for places where it really
does matter. I did a lot of profiling to see where we would need to do this,
and the network stack is one such place.

> While IPL will need to continue to exist for talking to hardware, I  
> think software priority levels will eventually disappear as they  
> currently exist.  They will transition to actual real-time priorities  
> used by the scheduler to run kernel lwps (< SPL_SCHED) or become  
> mutexes (>= SPL_SCHED).

To add a bit of information about what I am proposing: the interrupt
priority levels continue to exist, and map onto scheduling priorities.
Interrupts at or below IPL_VM are provided with LWP context on execution,
above that level things would work (for the most part) as they do now,
with spinlocks being used to provide MP atomicity where needed.
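
As a rough sketch of that split (illustrative only): the enum below is a
simplified, assumed ordering of interrupt levels rather than any port's real
definitions, and PRI_INTR_BASE, intr_gets_lwp_context() and
ipl_to_sched_pri() are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, assumed ordering of levels, lowest to highest. */
enum ipl {
    IPL_NONE = 0,
    IPL_BIO,
    IPL_NET,
    IPL_TTY,
    IPL_VM,
    IPL_CLOCK,
    IPL_SCHED,
    IPL_HIGH
};

/* Hypothetical base of the scheduler priority range used for interrupts. */
#define PRI_INTR_BASE 100

/*
 * Interrupts at or below IPL_VM run with LWP context; anything above
 * keeps running as a traditional handler, protected by spinlocks.
 */
static bool intr_gets_lwp_context(enum ipl ipl)
{
    return ipl > IPL_NONE && ipl <= IPL_VM;
}

/*
 * Map an interrupt level onto a scheduling priority. Only meaningful
 * for levels that are given LWP context; a higher IPL yields a higher
 * priority, so the relative ordering of the levels is preserved.
 */
static int ipl_to_sched_pri(enum ipl ipl)
{
    assert(intr_gets_lwp_context(ipl));
    return PRI_INTR_BASE + (int)ipl;
}
```

The point of the mapping is only that ordering is preserved: a handler at
IPL_VM still outranks one at IPL_BIO once both are LWPs.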

To contrast a bit with the FreeBSD implementation, since it has been said we
are going to endure the same kind of performance degradation they have seen:
in FreeBSD, interrupts involve a lengthy path through C code, after which
(from my reading) a thread is scheduled to handle the interrupt, and the
preempted thread yields the CPU via mi_switch(). That involves taking
multiple locks along the way, touching lots of additional cache lines,
making multiple context switches and so on.

Cheers,
Andrew