Subject: Re: Bug in x86 ioapic interrupt code for devices with shared interrupts?
To: None <tls@rek.tjls.com>
From: None <jonathan@dsg.stanford.edu>
List: port-i386
Date: 03/03/2006 13:17:35
In message <20060303210219.GA13248@panix.com>,
Thor Lancelot Simon writes:

>One problem is that it's not at all clear to me what putting the bge
>hardware in "in interrupt handler mitigation mode" if it did not
>actually interrupt you will do.  That's what the next line of if_bge.c
>does.
>
>But I think there is another problem.  See attached diff -- we have
>been, according to the Linux tg3 driver, reading the wrong register,
>and in fact if you don't read the right one, if you're not using MSI
>(which we aren't), it's possible to start processing an interrupt
>while the status block for the chip is in an inconsistent state.  That
>might also be responsible for some of the chaos.
>
>I cannot test this right now.  I would appreciate it if someone else
>would.  The relevant part of the tg3 driver is tg3_interrupt() in
>tg3.c.


Oh, darn. I've spent many days reading the Linux drivers, distilling
out how to do TSO on bges and collecting it in my own head.

I don't recall exactly where, but I have a strong nagging feeling that
even your patch (if correct) will leave us exposed to similar races
with PCI-e attached devices.
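
For anyone following along, here is a toy model of the non-MSI race
under discussion. It is only a sketch: every name in it (fake_chip,
read_pcistate, SB_UPDATED, PCISTATE_INT_IDLE) is made up for
illustration and does not correspond to the actual if_bge.c or tg3.c
definitions. It assumes the usual PCI ordering behaviour, where a
register read from the device cannot complete until the device's
earlier posted DMA writes have landed in host memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; NOT the real bge/tg3 register layout. */
#define SB_UPDATED        0x1u  /* status block "updated" flag */
#define PCISTATE_INT_IDLE 0x2u  /* set when the chip is NOT asserting INTx */

/* Toy stand-in for the chip plus its host-memory status block. */
struct fake_chip {
    uint32_t status_block;   /* copy in host memory, written by chip DMA */
    uint32_t pending_dma;    /* status value still in flight on the bus */
    bool     dma_in_flight;  /* true while that write has not landed */
    bool     intx_asserted;  /* chip is driving the shared interrupt line */
};

/*
 * Model of a register read. The read forces any in-flight status block
 * write to land first, then reports whether the chip is interrupting.
 */
static uint32_t read_pcistate(struct fake_chip *c)
{
    if (c->dma_in_flight) {
        c->status_block = c->pending_dma;
        c->dma_in_flight = false;
    }
    return c->intx_asserted ? 0 : PCISTATE_INT_IDLE;
}

/* Shared-interrupt handler: returns 1 if the interrupt was ours, else 0. */
static int fake_intr(struct fake_chip *c)
{
    if (!(c->status_block & SB_UPDATED)) {
        /*
         * The status block looks stale. Either another device on the
         * shared line interrupted, or our status write is still in
         * flight. The register read settles both questions.
         */
        if (read_pcistate(c) & PCISTATE_INT_IDLE)
            return 0;   /* not ours; leave the chip alone */
    }
    /* Status block is now consistent; acknowledge and process. */
    c->status_block &= ~SB_UPDATED;
    c->intx_asserted = false;
    return 1;
}
```

The point of the sketch is the ordering. The handler only trusts a
"not updated" status block after a register read has both flushed the
pending write and confirmed that the chip is not the one interrupting,
so it never touches the chip for an interrupt that belongs to another
device on the shared line.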