Subject: Re: how do I tell if I'm on the interrupt stack?
To: Greg Lehey <grog@lemis.com>
From: Matthew Jacob <mjacob@feral.com>
List: tech-kern
Date: 09/06/2001 21:57:22
> On Wednesday,  5 September 2001 at 20:28:57 -0700, Matt Jacob wrote:
> > On Wed, 5 Sep 2001, Jason R Thorpe wrote:
> >> On Thu, Sep 06, 2001 at 12:14:19PM +0930, Greg Lehey wrote:
> >>
> >>> All else being equal, given the choice between a variable or a
> >>> function, I'd use the variable any day.
> >>
> >> I'd rather restructure the entire code path so the test is
> >> never necessary :-)
> >
> > Then we need to implement ithreads as Solaris implements them- so you *can*
> > call tsleep from hardware interrupt level
> 
> That would definitely be overkill for this purpose.

No, I don't think so. If you don't make any test, then either you never can
call tsleep or you should be able to call it from interrupts.

> 
> > and as FreeBSD supposedly gropes toward.
> 
> Well, "supposedly" is wrong there, unless you mean the question of
> whether FreeBSD is intentionally imitating Sun.  I did the initial
> implementation of the threads long before I knew that Sun had
> something similar.  The real issue is whether the interrupt threads
> will be as useful as they promise to be.

"Supposedly" is reflective of the massive gulf between initial reported
architecture and day to day implementation experiences. I've given up even
venturing a guess about the FreeBSD SMP effort.

> 
> > I'm pretty convinced that lacking such freedom, knowing whether it
> > is safe to call tsleep is functionality worth having. The
> > implementation need not be hacky, and could, in fact, be SMP
> > aware. After all, it's per processor data (in some systems, at
> > least) as to whether you're running a context where it is safe to
> > sleep && be (re)schedulable. Despite what Bill says, it need not be
> > a costly test- but this is why I asked whether it would be
> > invasive. The answer I get is "Yes" and the implied (from Jason)
> > "You need to rewrite your driver"- which is somewhat annoying.
> 
> I'm neutral on this one.  Basically, it's a single instruction in the
> interrupt stub (increment interrupt nesting level) and another in the
> return (decrement), so it's not much overhead.  But it would probably
> result in a cache miss on entry, probably not on exit, and that's
> equivalent to about 100 instructions.

Relative to all of the cache misses you would take for the execution of a
moderate interrupt service routine? No- in this case, if time is of the
absolute essence, then you *would* code differently. If this were sparc, you'd
also run in the trap window and not even have a real stack to make any calls
with either.

But for your basic system interrupt handler where you let h/w priority do it's
job the overhead of testing one variable is almost certainly less than the
cost of the lock acquisition you have to do anyway.

> 
> > Frankly, the alternative is a less functional system. But you know-
> > I *really* don't need to argue this at all. I think it would be
> > useful. If others don't see that- that's fine too.
> 
> The real issue is whether it's always possible to find an alternative.
> In this case, one such might be to set your own variable when entering
> your interrupt routines, and resetting it on leaving them.

But that's the problem. I do- but that is not sufficient when a hardclock
driven routine is caused entry into a routine where I *can't* make that test.

The normal path was (roughly) expected to be:

isp_pci_intr
  isp->isp_onintstack = 1;
    isp_intr
    ...
    (SCSI midlayer)
      ...
      isprequest
	isp_start
	(exception condition causes)
	  isp_mboxcmd
	    if (isp->isp_onintstack)
	      POLL
	    else
	      tsleep
	  ...


The problem that broke me was:

isp_pci_intr
  isp->isp_onintstack = 1;
    isp_intr
    ...
    (SCSI midlayer- called with BUSY status on SCSI command)
      (kthread schedule of scsi_completion, which then does
      callout_reset so that in 1 second, essentially)
      ...
      isprequest
	isp_start
	(exception condition causes)
	  isp_mboxcmd
	    if (isp->isp_onintstack)
	      POLL
	    else
	      tsleep
	  ...

But because isp->isp_onintstack isn't set, I called tsleep. Boom.


-matt