Subject: Re: splx() optimization [was Re: SMP re-entrancy in "bottom half" drivers]
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 06/02/2005 22:15:13
In message <1117765903.256905.1390.nullmailer@yamt.dyndns.org>,
YAMAMOTO Takashi writes:

>> So interrupt-handling can be per CPU (really, per device instance: if
>> you have more CPUs than SMP-safe drivers, the extra CPUs can't help
>> interrupts).  Whereas raising SPLs will acquire a (global) lock.
>
>can you please explain why do you think raising SPLs will acquire a lock?
>
>if you mean something like the following, please don't.
>please keep splxxx() functions cpu local.
>
>	int
>	splbio()
>	{ 
>		int s = _current_implementation_of_splbio();
>		spinlock(&splbio_lock);
>		return s;
>	}

Yamamoto-san,

I'd like very much to agree with you and abide by your request.
I regret sincerely that I foresee great difficulty in complying.

The way I see it, the entire point here is to emulate the existing
synchronization semantics of spl()s, for drivers and subsystems
which have not yet been modified to be SMP-safe.  For those drivers we
*need* to preserve the existing semantics, whereby raising spl to a
given level guarantees synchronized, race-free access to
data structures accessed at that SPL.  The code fragment above,
the one you ask us not to do, is very close to what I see as what we
*have* to do --- but only as a stop-gap, for drivers and code which
are not yet reworked to be SMP-safe.
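
To make the stop-gap concrete, here is a rough userspace model of what
I have in mind.  It is only a sketch: every name in it (cpu_model,
cpu_raise_ipl, legacy_splbio, legacy_splx, legacy_bio_lock, and the
numeric value of IPL_BIO) is made up for illustration, it models a
single level only, and it takes the lock only on the outermost raise
so that nested splbio() calls on one CPU do not deadlock against
themselves.

```c
/* Userspace model only: hypothetical names, not actual kernel code. */
#include <stdatomic.h>

#define IPL_NONE 0
#define IPL_BIO  3              /* hypothetical numeric level */

struct cpu_model {
	int ipl;                /* current interrupt priority level of this CPU */
};

/* One global spin lock standing in for "everything protected by splbio". */
static atomic_flag legacy_bio_lock = ATOMIC_FLAG_INIT;

static void
spin_acquire(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;               /* spin until the holder releases */
}

static void
spin_release(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* CPU-local part: raise the level (never lower it), return the old one. */
static int
cpu_raise_ipl(struct cpu_model *ci, int level)
{
	int old = ci->ipl;

	if (level > old)
		ci->ipl = level;
	return old;
}

/* Legacy-compatible splbio(): local raise plus cross-CPU exclusion. */
static int
legacy_splbio(struct cpu_model *ci)
{
	int s = cpu_raise_ipl(ci, IPL_BIO);

	if (s < IPL_BIO)        /* outermost raise on this CPU: take the lock */
		spin_acquire(&legacy_bio_lock);
	return s;
}

/* Matching splx(): drop the lock only when leaving the bio level. */
static void
legacy_splx(struct cpu_model *ci, int s)
{
	if (ci->ipl >= IPL_BIO && s < IPL_BIO)
		spin_release(&legacy_bio_lock);
	ci->ipl = s;
}
```

An unconverted driver would keep its usual "s = splbio(); ...; splx(s);"
pattern unchanged; only the implementation underneath differs.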

Again, the way I see it, SMP-safe drivers should be reworked so that
they do not require SPL synchronization, and instead use explicit
synchronization such as locks.  Your Dec 2003 changes, to make ipintrq
and friends use explicit locks, are one very good example; so are
the changes to fxp(4) to use per-device-instance locks.
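
As a contrast, a per-device-instance lock might look roughly like the
following userspace model.  This is not the fxp(4) code; dev_softc and
its fields, and the dev_init, dev_start and dev_intr functions, are
all hypothetical names chosen for the sketch.

```c
/* Userspace model; all names are hypothetical, not taken from fxp(4). */
#include <stdatomic.h>

struct dev_softc {
	atomic_flag sc_lock;    /* protects the fields below, this instance only */
	int sc_pending;         /* requests queued by the top half */
	int sc_completed;       /* requests retired by the interrupt handler */
};

static void
dev_init(struct dev_softc *sc)
{
	atomic_flag_clear(&sc->sc_lock);
	sc->sc_pending = 0;
	sc->sc_completed = 0;
}

static void
sc_enter(struct dev_softc *sc)
{
	while (atomic_flag_test_and_set_explicit(&sc->sc_lock,
	    memory_order_acquire))
		;               /* spin: another CPU holds this instance */
}

static void
sc_exit(struct dev_softc *sc)
{
	atomic_flag_clear_explicit(&sc->sc_lock, memory_order_release);
}

/* Top half: queue one request under the instance lock. */
static void
dev_start(struct dev_softc *sc)
{
	sc_enter(sc);
	sc->sc_pending++;
	sc_exit(sc);
}

/* Interrupt handler: retire everything pending, return how many. */
static int
dev_intr(struct dev_softc *sc)
{
	int n;

	sc_enter(sc);
	n = sc->sc_pending;
	sc->sc_completed += n;
	sc->sc_pending = 0;
	sc_exit(sc);
	return n;
}
```

Two instances never contend with each other, since each carries its own
lock.  In a real kernel the top half would typically also block the
device's interrupt on the local CPU while holding the lock, so that the
handler cannot spin against its own CPU; the model omits that detail.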
 
I understand a desire for local-CPU-only ways to block interrupts at a
specified level (or below). If you want that, I'd support it.  But if
we are, collectively, looking for evolutionary steps to a non-biglock
kernel, I think those functions should have new names, different from
the existing spl*() functions.  That way, an spl*() call is a marker
for code that still needs rework to use explicit, non-SPL synchronization.
Does that make sense?

I understand spl*() functions are no-ops in FreeBSD-5; maybe FreeBSD
also has prior art for local-CPU selective masking of interrupts.