tech-kern: Re: SMP re-entrancy in kernel drivers/"bottom half?"

Subject: Re: SMP re-entrancy in kernel drivers/"bottom half?"
To: Jason Thorpe <thorpej@wasabisystems.com>
From: Matt Fredette <fredette@theory.lcs.mit.edu>
List: tech-kern
Date: 12/19/2003 12:19:32

> On Dec 17, 2003, at 2:58 PM, Jonathan Stone wrote:
> 
> > Elementary: we have to maintain the invariant ``at most one CPU at or
> > above any given [hardware] priority level'' or we lose the
> > synchronization semantics of SPLs (higher SPLs than the hypothetical
> > SMP-safe interrupt-routine driver entrypoints).
> 
> I don't think that's the way we want to move the kernel, in general.  
> There's also the question of what "above" is.  Technically, splnet is 
> not "above" splbio, but it is allowed to be, by convention, in order to 
> allow network devices to have better interrupt latency than disk 
> controllers.
> 
> Think of it this way -- splbio and splnet lock two different sets of 
> data structures.  They are orthogonal, and there is no defined "locking 
> order" for moving between them.

Having the spl* unordered sounds appealing, but see below.

> We currently have a small set of interrupt-frobbing-simplelocks in the 
> kernel that are implemented in an ad hoc way:
> 
> 	s = splfoo();
> 	simple_lock(&foo_slock);
> 
> 	/* manipulate a data structure that foo_slock protects */
> 
> 	simple_unlock(&foo_slock);
> 	splx(s);
> 
> This is all usually hidden inside of macros.
> 
> The logic goes this way:
> 
> 	1. The data structure is actually protected by foo_slock.  It is not
> 	   actually protected by splfoo().
> 
> 	2. Because foo_slock protects the data structure, that prevents other
> 	   CPUs from getting at the data structure while we have it.
> 
> 	3. Because foo_slock can be acquired in interrupt context, we must
> 	   prevent *our* CPU from running that interrupt code path while we
> 	   acquire/hold the lock, otherwise deadlock could result.  Therefore,
> 	   we go do splfoo() before we acquire the lock, and drop ipl after
> 	   we release it.
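
For concreteness, here is roughly how I picture the pattern once it is
hidden inside macros.  This is only a user-space sketch: the *_mock
functions, IPL_FOO, FOO_LOCK/FOO_UNLOCK and foo_counter are names I made
up for illustration, not the kernel's actual spl or simple_lock code.

```c
#include <assert.h>

#define IPL_NONE 0
#define IPL_FOO  3	/* hypothetical level for the "foo" subsystem */

static int cur_ipl = IPL_NONE;	/* this CPU's current priority level */

/* Raise the ipl to at least `level'; return the old level for splx. */
static int
splraise_mock(int level)
{
	int old = cur_ipl;

	if (level > cur_ipl)
		cur_ipl = level;
	return old;
}

/* Restore a previously saved ipl. */
static void
splx_mock(int s)
{
	cur_ipl = s;
}

struct slock_mock {
	int locked;
};

static void
slock_acquire(struct slock_mock *l)
{
	/* A real spin lock would spin on an atomic test-and-set here. */
	assert(!l->locked);
	l->locked = 1;
}

static void
slock_release(struct slock_mock *l)
{
	assert(l->locked);
	l->locked = 0;
}

static struct slock_mock foo_slock;
static int foo_counter;		/* the data that foo_slock protects */

/* Raise ipl first, then take the lock; undo in the reverse order. */
#define FOO_LOCK(s) \
	do { (s) = splraise_mock(IPL_FOO); slock_acquire(&foo_slock); } while (0)
#define FOO_UNLOCK(s) \
	do { slock_release(&foo_slock); splx_mock(s); } while (0)

static int
foo_increment(void)
{
	int s, v;

	FOO_LOCK(s);
	assert(cur_ipl >= IPL_FOO);	/* foo interrupts blocked on this CPU */
	v = ++foo_counter;
	FOO_UNLOCK(s);
	return v;
}
```

The ordering inside the macros is the point: ipl goes up before the lock
is taken and comes back down only after it is released, which is item 3
of the logic quoted above.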

AFAICT this approach works great, but only if either the spl* are ordered,
or at most one CPU at a time can be in the top half.  Otherwise, can't you 
get this deadlock?

The top half on CPU #0 does splfoo(), acquires foo_slock, and starts
manipulating.  The top half on CPU #1 does splbar(), acquires bar_slock, and
starts manipulating.  Then CPU #0 takes a bar interrupt (splfoo() doesn't
block it) and its handler starts spinning on bar_slock, and CPU #1 takes a
foo interrupt (splbar() doesn't block it), which completes the deadlock by
spinning on foo_slock.

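If splbar() > splfoo(), this doesn't happen, right?  CPU #1 at splbar()
would have foo interrupts blocked too, so it can always finish and release
bar_slock.  Is it common practice, then, to simply have such an ordering?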
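
To check my own reasoning I wrote the scenario down as a toy model.  It
assumes each spl can be described as a bitmask of the interrupt sources
it blocks; SRC_FOO, SRC_BAR and cross_deadlock_possible() are names I
invented for this sketch, and it only covers the two-CPU, two-lock case
described above.

```c
#include <assert.h>

/* One bit per interrupt source (hypothetical). */
#define SRC_FOO 0x1u
#define SRC_BAR 0x2u

/*
 * foo_mask: sources blocked while at splfoo().
 * bar_mask: sources blocked while at splbar().
 *
 * Returns 1 if CPU #0 (holding foo_slock at splfoo) and CPU #1
 * (holding bar_slock at splbar) can each take the other's interrupt
 * and end up spinning on the lock the other CPU holds.
 */
static int
cross_deadlock_possible(unsigned foo_mask, unsigned bar_mask)
{
	int cpu0_takes_bar, cpu1_takes_foo;

	/* Each spl must at least block its own interrupt source. */
	assert(foo_mask & SRC_FOO);
	assert(bar_mask & SRC_BAR);

	/* CPU #0 sits at splfoo(): is a bar interrupt still deliverable? */
	cpu0_takes_bar = !(foo_mask & SRC_BAR);
	/* CPU #1 sits at splbar(): is a foo interrupt still deliverable? */
	cpu1_takes_foo = !(bar_mask & SRC_FOO);

	return cpu0_takes_bar && cpu1_takes_foo;
}
```

With unordered levels (foo_mask = SRC_FOO, bar_mask = SRC_BAR) the
function reports that the deadlock is possible.  With splbar() > splfoo()
(bar_mask = SRC_FOO | SRC_BAR) it does not, because CPU #1 can no longer
be interrupted into spinning on foo_slock.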

-- 
Matt Fredette