Subject: Re: stray interrupt ipl 0x7
To: khaqq <khaqq@free.fr>
From: Rui Paulo <rpaulo@NetBSD.org>
List: port-sparc
Date: 07/29/2005 20:12:30
On 2005.07.29 18:52:24 +0000, David Laight wrote:
| On Fri, Jul 29, 2005 at 01:18:03PM +0200, khaqq wrote:
| > > 
| > > Anyway, maybe hme(4) is missing some interrupts?
| > 
| > This happens here quite often under full network load. The CPU seems to
| > spend 70-80% of its cycles in "interrupt" according to top (interrupt handler?).
| > Transferring about 1GB through the box makes the error happen about 2 or 3
| > times.
| > That's on a SS5/110 with 32MB of RAM, never seems to swap, QFE 2.0,
| > NetBSD 2.0.
| > What would make it "miss" some interrupts?
| 
| You've got it backwards!
| 
| What actually happens is that the device requests an interrupt while the
| interrupt routine is active servicing a previous interrupt.
| 
| The ISR will process the event for the new interrupt, write to the
| hardware to clear the IRQ, and then exit.
| At this point we start a race between the hardware seeing the write,
| clearing the IRQ and the (now inactive) IRQ propagating to the CPU,
| and the CPU exiting from the interrupt handler and taking the interrupt.
| 
| If/when the CPU wins it (typically) fails to find an ISR that wants
| to service the interrupt and outputs the message.
| 
| On a sparc system the CPU puts writes into a FIFO (the store buffer)
| and will perform all the reads associated with the IRET before
| doing the final write(s) done inside the ISR - so the IRQ line
| is often still active if cleared at the end of the ISR.
| (Posted writes on PCI busses - especially if the actual device is
| behind a few PCI-PCI bridges - just make it more likely.)
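
The race described above can be sketched with a toy model in C. This is
only an illustration under stated assumptions: the struct and every
toy_* name are hypothetical, the "store buffer" holds a single entry,
and nothing here touches real hardware or comes from hme(4).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: one device register and a one-entry store buffer. */
struct toy_bus {
    uint32_t dev_irq_status;  /* non-zero means the IRQ line is asserted */
    bool     store_pending;   /* a write is still sitting in the store buffer */
    uint32_t store_value;     /* value that the pending write will deliver */
};

/* CPU store: queued in the buffer, not yet visible to the device. */
static void toy_write(struct toy_bus *b, uint32_t v)
{
    b->store_pending = true;
    b->store_value = v;
}

/* The buffered write finally reaches the device. */
static void toy_drain(struct toy_bus *b)
{
    if (b->store_pending) {
        b->dev_irq_status = b->store_value;
        b->store_pending = false;
    }
}

/* What the CPU samples when it re-enables interrupts. */
static bool toy_irq_asserted(const struct toy_bus *b)
{
    return b->dev_irq_status != 0;
}

/*
 * Models the end of an ISR. Returns true if the CPU would take an
 * interrupt that has nothing left to service by the time it looks.
 */
static bool toy_isr_exit_is_stray(struct toy_bus *b)
{
    toy_write(b, 0);                       /* ISR clears the IRQ (posted) */
    bool taken = toy_irq_asserted(b);      /* CPU returns before the write lands */
    toy_drain(b);                          /* the write lands too late */
    return taken && !toy_irq_asserted(b);  /* taken, but no longer pending */
}
```

Starting from a bus with dev_irq_status set to 1, toy_isr_exit_is_stray()
returns true: the CPU "wins" the race and sees an IRQ that the device has
already been told to drop.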

Thanks for your great explanation!

| The traditional fix is to perform a read-back of the written address
| to flush the write through the store buffer and PCI bridges.
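
A minimal sketch of that read-back idea in C, assuming the device
exposes a memory-mapped register that acknowledges the interrupt when
written. The helper name and its parameters are hypothetical and are not
taken from any NetBSD driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 'reg' stands in for a memory-mapped interrupt
 * acknowledge register, 'ack_bits' for whatever value clears the IRQ.
 */
static inline uint32_t ack_irq_and_flush(volatile uint32_t *reg,
                                         uint32_t ack_bits)
{
    *reg = ack_bits;  /* posted write: may sit in the store buffer or a bridge */
    return *reg;      /* read-back of the same address, intended to force
                         the write out to the device before the ISR exits */
}
```

In an actual NetBSD driver the same pattern would more likely be written
with bus_space_write_4(9) followed by bus_space_read_4(9) on the same
offset, rather than with a raw pointer.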

Does that also apply to SBus?

		-- Rui Paulo