tech-kern: Re: Interrupt, interrupt threads, continuations, and kernel lwps

Subject: Re: Interrupt, interrupt threads, continuations, and kernel lwps
To: None <tech-kern@netbsd.org>
From: Pavel Cahyna <pavel@netbsd.org>
List: tech-kern
Date: 06/15/2007 23:20:42

On Thu, Jun 14, 2007 at 02:11:17PM +0100, Andrew Doran wrote:
> On Sat, May 05, 2007 at 02:16:17PM +0100, Andrew Doran wrote:
> > On Wed, Feb 21, 2007 at 10:09:00AM +0000, Andrew Doran wrote:
> > 
> > > On Wed, Feb 21, 2007 at 12:08:36AM -0800, Matt Thomas wrote:
> > 
> > > > I think that hard interrupts should simply invoke the handler (at the  
> > > > appropriate IPL), the handler will disable the cause of the  
> > > > interrupt, optionally it may use a SPIN mutex to gain control over a  
> > > > shared section, do something to the device, release the mutex, or it  
> > > > may just schedule a continuation to run (either via software  
> > > > interrupt or via a kernel lwp workqueue; the former can't
> > > > sleep/block, the latter can).
> > >
> > > To reiterate, there are two reasons I want to use LWPs to handle interrupts:
> > > significantly cheaper locking primitives on MP systems, and the ability to
> > > eliminate the nasty deadlocks associated with interrupts/MP and interrupt
> > > priority levels. The intent is *not* to rely heavily on blocking as the main
> > > synchronization mechanism between the top and bottom halves.
> > 
> > So I've given this more thought, and I now think that a hybrid approach is
> > the way to go. I think that minimizing the level of change to interrupt
> > handling on the various platforms is important, since it's particularly
> > tricky. Here's my updated proposal:
> > 
> > => hardware interrupts
> > 
> > Hardware interrupts would function as Matt describes, but with work still
> > being handed down via soft interrupt. We would need to reduce the amount of
> > work done in interrupt handlers. For example, calls to biodone() from
> > interrupt handlers would be replaced by biointr(), and a soft interrupt
> > handler would call biodone().
> > 
> > => software interrupts
> > 
> > Where it's possible, software interrupts would work as I described before.
> > They borrow the interrupted thread's VM context, and are able to block.
> > Where the machine is modal, or for bringup, or where there is a lack of time
> > or interest, software interrupts can be implemented in an MI way using
> > kthreads.
> > 
> > Soft interrupt handlers would be per-CPU. If a soft interrupt is triggered
> > on a CPU, it must occur on that CPU. On x86 at least, it is currently
> > possible for another CPU to snarf it and clear the pending status. In the
> > long run, we may want the ability to direct soft interrupts to other CPUs,
> > but only if the driver asks for it. Each LWP dedicated to handling a soft
> > interrupt would be bound to its home CPU, so if it blocks and needs to run
> > again, it would only run there.
> > 
> > The per-CPU requirement means it would be possible to hand work down to the
> > soft interrupt handler without using locks.
> > 
> > Software interrupts would not be able to:
> > 
> > - sleep using condition variables
> > - use lockmgr()
> > - wait for memory to become available (e.g. KM_SLEEP, PR_WAITOK, ...)
> > 
> > => primitives
> > 
> > Document mutex_spin_enter/mutex_spin_exit for device drivers, which avoids
> > a costly trip through mutex_enter/mutex_exit for spinlocks. All interrupt
> > levels would become able to use mutexes, which is not possible now for
> > serial interrupts or IPIs on x86. So what is called IPL_LOCK would be
> > replaced by IPL_HIGH. That's mostly used by the lockdebug and lockstat code.
> > 
> > => spl hierarchy and facilities
> > 
> > The soft interrupt levels would cease to exist - at least in the long run.
> > There are a few places we may still want the ability to block softnet until
> > we can fix the concurrency issues.
> > 
> > I propose that we then flatten the hierarchy to look like this:
> > 
> > o IPL_NONE
> > 
> >   Usual state of the system, no interrupts blocked.
> > 
> > o IPL_LOW
> > 
> >   Blocks all "low priority" hardware interrupts. Mostly equivalent to
> >   splvm/splimp, but with the additional guarantee that it will block
> >   anything that can take the kernel lock. By its nature, blocks soft
> >   interrupts from occurring.
> > 
> >   What interrupts at this level can do is restricted further. It would not
> >   be possible for them to send signals to processes or inspect any process
> >   state. That all needs to be deferred to a software interrupt. It would be
> >   possible to wake LWPs using cv_broadcast()/cv_signal().
> > 
> >   The VM system would run at this level, so it's still possible to
> >   allocate/free memory. Longer term I think it may be worthwhile restricting
> >   interrupt handlers' view of the VM system to eg: pool_get, pool_put.
> > 
> > o IPL_MID
> > 
> >   Blocks mid level interrupts, like the clock or (for example) audio
> >   interrupts, and also blocks everything at IPL_LOW. Similar to what
> >   IPL_SCHED does now.
> > 
> >   Handlers at this level would have essentially the same capabilities as
> >   IPL_LOW, but would not be able to make use of the VM system, and would not
> >   be able to take the kernel lock. The scheduler would run at this level.
> > 
> > o IPL_HIGH
> > 
> >   Blocks all high level interrupts, like: statclock, IPIs (x86), serial. 
> >   Also blocks everything at lower levels.
> > 
> >   Handlers at this level would be even further restricted in what they can
> >   do. The synchronization mechanisms available to them would be: scheduling
> >   a soft interrupt, using spin mutexes, and using the spl calls. They could
> >   not call e.g. cv_broadcast(), or acquire the kernel lock. By extension, it
> >   would not be possible for LWPs to sleep at IPL_HIGH.
> 
> I plan to implement this over the next couple of months. Some of the changes
> involved:
> 
> o Add a cpu_intr_p() that returns true if currently handling a hardware
>   interrupt. This would be used so (for example) biodone knows whether
>   or not to defer processing to a soft interrupt.
> 
> o Increase the number of available priority levels to 256 as discussed
>   earlier. The priority space is expanded to include real time and
>   (soft) interrupt threads.
> 
> o Pull in the less invasive changes from the vmlocking branch: those that
>   do not touch the vm or vfs system. A couple of these are: the ability
>   to create bound kthreads, and kthreads running as lwps in proc0.
> 
> o Flatten the spl hierarchy as described above.

What would this mean? Replacing splbio, splnet, etc. with
spllow/splmid/splhigh?
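
To make the question concrete, here is a minimal user-space sketch of one
possible reading. It is not taken from the proposal: the names spllow() and
splmid(), the splraise() helper, the cur_ipl variable, and the mapping of
splbio/splnet onto IPL_LOW are all assumptions made for illustration. Only
the four level names, the "mostly equivalent to splvm/splimp" remark, and
the "similar to what IPL_SCHED does now" remark come from the text quoted
above.

```c
#include <assert.h>

/* The four flattened levels from the proposal, lowest to highest. */
enum ipl { IPL_NONE = 0, IPL_LOW, IPL_MID, IPL_HIGH };

/* User-space stand-in for the current CPU's interrupt priority level. */
static int cur_ipl = IPL_NONE;

/*
 * Raise the level to at least 'level' and return the previous level.
 * Never lowers the level, matching the usual splfoo() convention.
 * (Hypothetical helper, not an existing kernel interface.)
 */
static int
splraise(int level)
{
	int old = cur_ipl;

	assert(level >= IPL_NONE && level <= IPL_HIGH);
	if (level > cur_ipl)
		cur_ipl = level;
	return old;
}

/* Restore a level previously returned by one of the raise calls. */
static void
splx(int old)
{
	assert(old >= IPL_NONE && old <= IPL_HIGH);
	cur_ipl = old;
}

/* New-style names (spllow and splmid are assumed spellings). */
#define spllow()	splraise(IPL_LOW)
#define splmid()	splraise(IPL_MID)
#define splhigh()	splraise(IPL_HIGH)

/* Old names kept as aliases; the bio/net mapping is an assumption. */
#define splbio()	spllow()
#define splnet()	spllow()
#define splvm()		spllow()	/* "mostly equivalent to splvm/splimp" */
#define splsched()	splmid()	/* "similar to what IPL_SCHED does now" */
```

Under this reading, existing driver code such as "s = splbio(); ...;
splx(s);" would keep compiling and would simply block everything at
IPL_LOW, rather than only block I/O interrupts.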

Pavel