Subject: Re: splx() optimization [was Re: SMP re-entrancy in "bottom half" drivers]
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 06/06/2005 11:45:37
In message <1118040352.710193.1385.nullmailer@yamt.dyndns.org>,
YAMAMOTO Takashi writes:
>> Or, to put that another way: given where NetBSD is today, how would
>> you propose to get from the current state, to a kernel with
>> fine-grained SMP synchronization?
>
>tackle the highest IPL first.

Oh. I see. I had not grasped that part before.

That's certainly an internally-consistent strategy, but not at all
where my interest and focus lie (i.e., networking code). And it
quantizes improvements at the granularity of *all* code at a given IPL,
whereas the lock-per-IPL approach lets us proceed as in your patch
from 2003. The hierarchy of IPL-bound locks lets us do that without
flattening all device interrupts/IPLs to a single level (which is the
part I think you described to Stefan as the "hack" part of that patch).

I'm also skeptical that we have the developer resources to add explicit
SMP locking across such large volumes of code as the "unit of least
effort" or "least commit" whilst still maintaining the quality for
which NetBSD is known.

So how about a compromise:

Those who wish can work from highest IPL downward, in change quanta of
"all code at a given IPL"; while those who wish to follow the
lock-per-IPL approach can do that in whatever order makes sense.
(Assuming the bitmask-style issues and idempotent-spl-raising Bill
noted are fixed, or fixable.)

Naturally, any time the highest-IPL-down effort SMP-safes each and
every use of any given IPL, the implicit spinlock at that IPL
disappears.  That way, you're no worse off than we are now
(unless, maybe, the locking overhead of the per-IPL locks makes the
kernel slower than what we have now, in which case we have a real
problem with lock cost).
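
To make the lock-per-IPL idea concrete, here is a rough userspace
model in C. It is only a sketch: every name in it (xspl_raise,
xipl_retire, the XIPL_* levels, struct xcpu) is hypothetical and not
the NetBSD spl API, and it assumes that raising to a level takes each
implicit lock between the current level and the target, in ascending
order. Raising to a level at or below the current one is a no-op,
which is one possible reading of the idempotent-raising requirement.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical IPL numbering, lowest to highest; not NetBSD's constants. */
enum { XIPL_NONE = 0, XIPL_NET, XIPL_BIO, XIPL_CLOCK, XIPL_COUNT };

/* One implicit spinlock per IPL; true means held. Static atomics start false. */
static atomic_bool xipl_lock[XIPL_COUNT];

/* Set once every user of a level has its own explicit SMP locking.
 * (Plain bool for brevity; a real kernel would need care when flipping it.) */
static bool xipl_retired[XIPL_COUNT];

/* Model of per-CPU state: current level and which implicit locks we hold. */
struct xcpu {
    int  level;
    bool held[XIPL_COUNT];
};

/* Mark a level as fully SMP-safe, so its implicit lock is no longer taken. */
static void xipl_retire(int level)
{
    assert(level > XIPL_NONE && level < XIPL_COUNT);
    xipl_retired[level] = true;
}

/* Report whether the implicit lock for a level is currently held by anyone. */
static bool xipl_is_locked(int level)
{
    return atomic_load(&xipl_lock[level]);
}

/* Raise to `want`, taking each non-retired implicit lock on the way up.
 * Returns the previous level, to be handed back to xspl_restore(). */
static int xspl_raise(struct xcpu *cpu, int want)
{
    int old = cpu->level;

    assert(want >= XIPL_NONE && want < XIPL_COUNT);
    if (want <= old)
        return old;             /* already at or above: nothing to take */

    for (int l = old + 1; l <= want; l++) {
        if (xipl_retired[l])
            continue;           /* implicit lock at this level has disappeared */
        while (atomic_exchange_explicit(&xipl_lock[l], true,
                                        memory_order_acquire))
            ;                   /* spin until the holder releases it */
        cpu->held[l] = true;
    }
    cpu->level = want;
    return old;
}

/* Drop back to `old`, releasing the locks we took, highest level first. */
static void xspl_restore(struct xcpu *cpu, int old)
{
    if (old >= cpu->level)
        return;                 /* restoring never raises the level */

    for (int l = cpu->level; l > old; l--) {
        if (cpu->held[l]) {
            cpu->held[l] = false;
            atomic_store_explicit(&xipl_lock[l], false,
                                  memory_order_release);
        }
    }
    cpu->level = old;
}
```

In this model a driver keeps writing s = xspl_raise(cpu, XIPL_NET);
... xspl_restore(cpu, s); exactly as before. Once somebody calls
xipl_retire(XIPL_NET), the same driver code simply stops taking the
implicit lock at that level, so no driver change is needed at the
point where a level becomes fully SMP-safe.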

How does that grab you?  I think that gives us the best of both
worlds: individual drivers can be made SMP-safe without (too much)
consideration of other drivers at the same IPL or of other subsystems.

Yamamoto-san, I envisage few or *no* changes to individual drivers
in going from the approach I outline to the one you suggest.
If that's really so, where's the downside?