Subject: Re: Race condition in generic soft interrupts code?
To: None <Richard.Earnshaw@arm.com>
From: Chris Gilbert <chris@dokein.co.uk>
List: port-arm
Date: 07/18/2003 00:02:33
On Thu, 17 Jul 2003 15:11:27 +0100
Richard Earnshaw <rearnsha@arm.com> wrote:

> > At Mon, 14 Jul 2003 22:50:08 +0100,
> > Richard Earnshaw wrote:
> > > It then enters the loop below that scans over hwpend bits.  While 
> > > processing the statclock interrupt the SPL is effectively raised
> > > to IPL_STAT_CLOCK with
> > > 
> > > 		current_spl_level |= iq->iq_mask;
> > > 		oldirqstate = enable_interrupts(I32_bit);
> > > 
> > > Interrupts are then enabled and the statclock handler code is
> > > called:
> > > 
> > > 		for (ih = TAILQ_FIRST(&iq->iq_list); ih != NULL;
> > > 		     ih = TAILQ_NEXT(ih, ih_list)) {
> > > 			(void) (*ih->ih_func)(ih->ih_arg ? ih->ih_arg : frame);
> > > 		}
> > 
> > I think all of lower priority interrupts should masked by ICU
> > hardware before handlers are invoked.  If they masked by hardware,
> > XXX_intr_dispatch is never called by lower interrupts, and the
> > following code become not needed.
> 
> I'm not sure I agree.   It takes quite a few cycles to block
> interrupts, especially if they are controlled by a memory-mapped
> device.  If we work on the principle that the a machine spends roughly
> 1% of the time in interrupt mode and that interrupt sources are
> statistically independent, then 99 out of every 100 interrupt calls
> would end up messing unnecessarily with the interrupt masks.  That
> sounds to me like an overhead that would be good to avoid.  The basic
> framework does this (modulo this problem under discussion).
> 
> I'm not entirely sure, but I think that if you did that, you'd also
> end up having to update the interrupt masks every time spl_xxx() or
> splx() was called.  That will significantly affect performance.

Yes you would, that is how the old code works.  This is one reason why
potr did his new version, it avoids the overhead of writing to hardware
when it's really not needed, and the number of spl*'s is fairly high, so
updating a mask in memory (and probably in cache) is beneficial.

> > > 		if (pcpl & ibit) {
> > > 			/*
> > > 			 * IRQ is masked; mark it as pending and check
> > > 			 * the next one.  Note: the IRQ is already disabled.
> > > 			 */
> > > 			ifpga_ipending |= ibit;
> > > 			continue;
> > > 		}
> > 
> > ixp12x0_intr_dispatch() in ixp12x0_intr.c does it.  It simply mask
> > the lower interrupts before calling handler and simply restore mask
> > before leaving intr_dispatch().  If lower interrupts are occurred
> > during handler running, ixp12x0_intr_dispatch() is immediately
> > called when mask is restored.
> 
> That shouldn't be necessary.
> 
> It seems that this "bug" has been found by several people porting this
> 
> code, and in each case a different hack has been used to work around
> it.  No-one else has posted about the issue here, which means that
> several ports may still be suffering quietly from the same problem. 
> That's a pity: we have this list for a reason... let's make better use
> of it!

Yes, I agree, actually I believe that potr's code is near enough generic
enough that it could be made generic for most ports, I think a few have
extra quirks, but the dispatcher routine should be the same.

It's my bad for not bringing it to the list when I first found the
problem.  I think I felt uncertain it was a real problem.  Actually I
must try out your version of the fix, it should be much nicer than mine,
as you don't have to take the interrupt twice... 8)

Chris