Subject: Re: Interrupt, interrupt threads, continuations, and kernel lwps
To: Bill Studenmund <wrstuden@netbsd.org>
From: None <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 02/23/2007 08:51:36
In message <20070223042756.GA19996@netbsd.org>, Bill Studenmund writes:
>On Thu, Feb 22, 2007 at 04:49:40PM -0800, jonathan@dsg.stanford.edu wrote:


>> [...] I think
>> the problem is terminology.  I'm not used to seeing taking an
>> interrupt called a "context switch" instead of, well, taking an
>> interrupt or a trap.
>
>Well, it is. :-) It is one form of context switch, and all of the
>context-switchy things it does are a main part of its expense.

Hi Bill,

If someone says "context-switch" and "vax" in the same paragraph, I
can't help but think of "ldpctx/svpctx" instructions. (Ouch.)
Or about flushing virtually-indexed cache, for CPUs where that's
needed.  Those are "context-switchy" things, and it's vital that
we don't pay those penalties to handle interrupts.

The way I see it, we're really talking about a *stack*-switch,
*precisely* to defer those expensive parts of a context switch.
Andrew is leveraging the fact that the entire KVA is mapped into every
process.  So entering interrupt mode does the "right thing".  Except,
as Matt noted, for processors which have a special interrupt mode.
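
Roughly, here's a sketch of what I mean (hypothetical names, not any
port's actual entry stub):

    /*
     * Sketch only: intr_dispatch() is made up.  The point is what's
     * *absent*: no pmap switch, no cache or TLB flush on the way in.
     */
    void
    intr_entry(struct trapframe *tf)
    {
            struct cpu_info *ci = curcpu();

            ci->ci_idepth++;        /* now in interrupt context */
            /* The low-level stub already moved us onto the interrupt
               stack; since kernel VA is mapped in every process,
               nothing else has to change. */
            intr_dispatch(tf);      /* run the registered handlers */
            ci->ci_idepth--;        /* back to the interrupted lwp */
    }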


>> That's an unfortunate choice of terminology. It's inviting confusion
>> to say that a new approach does just what we did before, when it
>> doesn't, and the details matter.
>
>True.

>For the common case, we will have almost the same behavior as now, except
>that we can take mutexes in an interrupt handler.
>
>As I understand it, the difficulty comes in when the mutex is held by a
>thread that is not running. In that case, the interrupt handler has to
>block, and the interrupt has to be disabled/ignored until serviced. For
>systems with a PIC, we disable the interrupt and cope.

I think we *need* to make sure that those mutexes are always spinlocks.
If we don't, we make a mockery of interrupts being things that're
serviced with low latency.  I mean, consider the ARM case: to get
to the interrupt thread, we have to switch *away* from the context
which was active when the interrupt arrived (i.e., pay the flush penalty);
then wait until we run whatever thread holds the mutex blocking the
interrupt thread, so that the mutex-holding thread can release the mutex;
then we need to switch back to the interrupt thread.

There are cheaper cases (perhaps the last one can be a no-vmspace-change
switch; perhaps the active thread when the interrupt fired is already
holding the mutex needed by the interrupt thread).
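
Concretely, the shape I want interrupt handlers to keep is something
like this (using the mutex(9) names from the newlock2 work; "xx" is a
made-up driver):

    /* At attach time: passing an IPL other than IPL_NONE makes
     * this a spin mutex, never an adaptive one. */
    mutex_init(&sc->sc_intr_lock, MUTEX_DEFAULT, IPL_NET);

    int
    xx_intr(void *arg)
    {
            struct xx_softc *sc = arg;

            mutex_spin_enter(&sc->sc_intr_lock); /* spins, never sleeps */
            /* ack the device, pull the data off the hardware */
            mutex_spin_exit(&sc->sc_intr_lock);
            return 1;               /* the interrupt was ours */
    }

If handlers can only ever take spin mutexes, the three-way switch
above simply cannot happen.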

But as I see it, the bottom line is that NetBSD runs on, and is used
on, platforms where the blocking case inevitably adds painful, painful
latency.  (Bucky Katz's platform is one such.)  If I were in that
situation, I'd be deeply, deeply unhappy about the prospect of truly
terrible interrupt latency, even if it's a "rare" case.

The way I see it, comparison to the SunOS-5 threading model isn't
really useful, since Solaris doesn't support processors where the full
switch is painful (ARM, vax, perhaps MIPS r4k without l2 cache?) and
has no need or (AFAICT) desire to do so.

>As noted, this will hopefully be a rare occurrence, where a thread that
>holds an interrupt mutex is no longer on a processor. We can structure our
>code so that this is very unlikely, if not impossible.

I think we *need* to do that, to continue good performance on machines
where the full switch is prohibitive.  If we collectively buy that,
then that decision drives us into a "top half" / "bottom half" model.
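
In other words, something like the sketch below (the softint names
are illustrative; the exact soft-interrupt interface is still being
settled):

    /* At attach time, for a made-up "xx" driver:
     *         sc->sc_si = softint_establish(SOFTINT_NET, xx_soft, sc);
     */

    int
    xx_hard(void *arg)              /* top half: hard interrupt */
    {
            struct xx_softc *sc = arg;

            mutex_spin_enter(&sc->sc_intr_lock);
            /* minimal work: ack the device, queue the data */
            mutex_spin_exit(&sc->sc_intr_lock);
            softint_schedule(sc->sc_si);    /* kick the bottom half */
            return 1;
    }

    static void
    xx_soft(void *arg)              /* bottom half: soft interrupt */
    {
            /* Adaptive mutexes and blocking belong here, not above. */
    }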

>Note that in the case of the mutex being locked but the thread being on
>another processor, we will just spin-wait. That's a feature of the locking
>we copied from Solaris. I expect that this _will_ happen, but we then get
>the exact same behaviors as if we had spinlocks.

I think that's a good thing, though of course to scale well, we need
to restructure the kernel so that this is a very rare case.  (Anytime
this gets mentioned, I worry we'll go the same route everyone else
does when they do SMP for the first time: sprinkling per-subsystem
mutexes everywhere, leading to high contention, poor cache/memory
performance, etc., etc., etc.)

>Single and multi-CPU to be sure. But as for the other stuff, it's mostly
>what we have now. So I don't see how we will have radically-different
>results.

I anticipate radically different results, on machines like ARM,
if the blocking case is ever anything but very, very rare. Heck,
even one blocking event would totally ruin a serial driver's day.
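
To put rough numbers on it: a 16550-style UART with a 16-byte FIFO at
115200 baud, 8N1, sees a character about every 87us (ten bit times),
so the FIFO overruns roughly 1.4ms after it fills.  A handler that has
to wait out two heavyweight switches plus the running of whatever
thread holds the mutex can easily eat that budget.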


>We should test stuff over time to make sure we don't do something stupid,
>to be sure!

>I don't think that will matter for interrupt handlers. Yes, I think there's
>a lot of work to do for the networking stack. But I think it'll be
>different work.

>As above, trying to get the lock while a thread running on another CPU has
>it turns into a spin wait. That's not the slow case. So as long as we
>don't hold a mutex we take in interrupt context while we do something else
>that can block, we will NOT trigger the slow path in an interrupt handler.
>We don't sleep while holding SPL now (or we aren't supposed to), so a very
>direct 1:1 translation should be fine.

I'm not clear on what you mean. What happens to current SPL-protected
code (spl[soft]net)?  

To put that another way: what's the migration path for the rest of the
kernel? Do we still have system-wide SPLs, and use them in code which
isn't totally mutex-ified?  Or do we need a "flag day" to turn all SPL
synchronizations into mutex ops or locks?
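
For concreteness, I'm guessing the "1:1 translation" you have in mind
is roughly the following, with each spl level replaced by a spin
mutex initialized at the same IPL ("foo_lock" is hypothetical):

    /* today: */
    s = splnet();
    /* ... touch the protected data ... */
    splx(s);

    /* after, with foo_lock initialized at IPL_NET: */
    mutex_spin_enter(&foo_lock);
    /* ... touch the protected data ... */
    mutex_spin_exit(&foo_lock);

If that's right, the question still stands for the transition, when
converted and unconverted code have to share the same data.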