Subject: Re: bge/ahd interrupt problems: partly resolved (hardware bug)
To: Frank van der Linden <>
From: Edgar Fuß <>
List: port-amd64
Date: 03/25/2007 23:36:09
> Let me know what you see.
OK, it may be a HARDWARE bug.

What I saw was that the "receipt" bit was set for pin 1 in ioapic1,
but Xintr_ioapic_level10 was not being called. It looked like a missing EOI.
All the vectors and idt entries looked reasonable.

Googling revealed that Linux has a strange workaround:

 * It appears there is an erratum which affects at least version 0x11
 * of I/O APIC (that's the 82093AA and cores integrated into various
 * chipsets).  Under certain conditions a level-triggered interrupt is
 * erroneously delivered as edge-triggered one but the respective IRR
 * bit gets set nevertheless.  As a result the I/O unit expects an EOI
 * message but it will never arrive and further interrupts are blocked
 * from the source.  The exact reason is so far unknown, but the
 * phenomenon was observed when two consecutive interrupt requests
 * from a given source get delivered to the same CPU and the source is
 * temporarily disabled in between.
 * A workaround is to simulate an EOI message manually.  We achieve it
 * by setting the trigger mode to edge and then to level when the edge
 * trigger mode gets detected in the TMR of a local APIC for a
 * level-triggered interrupt.  We mask the source for the time of the
 * operation to prevent an edge-triggered interrupt escaping meanwhile.
 * The idea is from Manfred Spraul.  --macro

 So (still with a breakpoint on Xintr_ioapic_level10) I set pin 1's mode
 to edge, and voila, I hit the breakpoint. I reset the mode to level,
 continued, and kept hitting the breakpoint. I deleted it and, believe
 it or not, the machine happily runs again.
 I have no idea whether the analysis in the Linux comment is correct,
 but the workaround succeeded.

 Anyone in a position to write a similar workaround for NetBSD?
 I will happily test.
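
To make the request concrete, here is roughly what I did by hand, written as
a C sketch. It is not NetBSD code and it does not touch hardware: sim_redlo[],
sim_read(), sim_write() and fake_eoi() are hypothetical names, and the register
is simulated in memory. The bit positions are those of the low dword of an
82093AA redirection table entry. The model assumes that switching a pin to edge
mode clears the pending "waiting for EOI" bit (Remote IRR), which is what the
workaround appears to rely on. Note that the Linux code decides when to do this
by looking at the TMR of the local APIC; this sketch simply checks the bit in
the redirection entry, so it would only be safe on a pin already known to be
stuck.

```c
#include <assert.h>
#include <stdint.h>

/* Low dword of an I/O APIC redirection table entry (82093AA layout). */
#define RED_REMOTE_IRR	(1u << 14)	/* level interrupt accepted, EOI pending */
#define RED_LEVEL	(1u << 15)	/* trigger mode: 1 = level, 0 = edge */
#define RED_MASK	(1u << 16)	/* 1 = pin masked */

#define NPINS		24

/* Simulated registers standing in for the real indirect register window. */
static uint32_t sim_redlo[NPINS];

/* Hypothetical accessor: a real kernel would use its own ioapic read routine. */
static uint32_t
sim_read(unsigned pin)
{
	assert(pin < NPINS);
	return sim_redlo[pin];
}

/* Hypothetical accessor modelling how the chip is assumed to react to a write. */
static void
sim_write(unsigned pin, uint32_t val)
{
	assert(pin < NPINS);
	/* Remote IRR is read-only for software: keep the current value ... */
	val = (val & ~RED_REMOTE_IRR) | (sim_redlo[pin] & RED_REMOTE_IRR);
	/* ... except that, in this model, edge mode clears it (assumption). */
	if ((val & RED_LEVEL) == 0)
		val &= ~RED_REMOTE_IRR;
	sim_redlo[pin] = val;
}

/*
 * Simulate the EOI that never arrived: mask the pin, flip it to edge,
 * flip it back to level, then restore the original mask state.
 * Returns 1 if the pin was level-triggered with an EOI pending, else 0.
 */
static int
fake_eoi(unsigned pin)
{
	uint32_t orig = sim_read(pin);

	if ((orig & RED_LEVEL) == 0 || (orig & RED_REMOTE_IRR) == 0)
		return 0;

	sim_write(pin, (orig | RED_MASK) & ~RED_LEVEL);	/* masked, edge */
	sim_write(pin, orig | RED_MASK);		/* masked, level again */
	sim_write(pin, orig);				/* original mask state */
	return 1;
}
```

Masking first follows the Linux comment: it keeps an edge-triggered interrupt
from escaping while the pin is temporarily in edge mode.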