Subject: Re: LKMized SBUS driver
To: David Laight <David.Laight@btinternet.com>
From: Don Yuniskis <auryn@gci-net.com>
List: port-sparc
Date: 11/26/2001 13:50:38
>David Laight commented:


>> > Well, I've started work on dbri* at sbus?, but if I call my dbri_init()
>> > function from dbri_attach_sbus(), I get a bunch of unhandled ipl 0x9
>> > messages (or something of the sort) on the screen soon followed by a
>> > 'panic: crazy interrupts'.
>>
>> "crazy interrupts" is triggered when you get more than 10 stray
>> interrupts in 10 seconds.
>
>One thing that causes 'unexpected interrupts' is when the device driver
>clears the pending interrupt request (by writing to the target hardware
>register) just before exiting the ISR.


I find it easier to develop the discipline of clearing the
source of the interrupt at the *top* of the service routine.
I also tend to reenable interrupts (globally) at this point (!)
so that other devices can get serviced (depends on how the
interrupt controller is designed, of course).

Then, just prior to restoring state and exiting (i.e. before
doing anything "expensive"), I poll the device's IRQ to see if
another interrupt has popped up while the previous one was being
serviced.
If so, the action I take depends on the system's design,
the type of device that I am servicing and the other types
of devices in the system.
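
A minimal sketch of that discipline in C, assuming a made-up
device with a single status word where bit 0 means "IRQ pending".
The names (fake_dev, dev_isr, service_device, demo_isr) are
hypothetical and the "hardware" is just a struct, so this only
shows the ordering of the steps, not a real driver:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEV_IRQ_PENDING 0x1u   /* hypothetical "interrupt pending" bit */

/* Simulated device: a plain struct standing in for hardware registers. */
struct fake_dev {
    uint32_t status;               /* pending bits as the device reports them */
    bool     rearm_during_service; /* simulation knob: raise a new IRQ mid-service */
    unsigned serviced;             /* number of events handled so far */
};

/* The "real work" of the ISR. In this simulation the device may raise a
 * fresh interrupt while we are still busy with the previous one. */
static void service_device(struct fake_dev *d)
{
    if (d->rearm_during_service) {
        d->status |= DEV_IRQ_PENDING;
        d->rearm_during_service = false;
    }
    d->serviced++;
}

/* Returns true if another interrupt was already pending at exit time. */
bool dev_isr(struct fake_dev *d)
{
    if ((d->status & DEV_IRQ_PENDING) == 0)
        return false;                  /* not our interrupt */

    d->status &= ~DEV_IRQ_PENDING;     /* clear the source at the *top* */
    /* a real ISR might re-enable interrupts globally at this point */

    service_device(d);

    /* poll just before the expensive restore-state-and-exit path */
    return (d->status & DEV_IRQ_PENDING) != 0;
}

/* Usage: run one ISR pass, with or without a simulated mid-service IRQ. */
bool demo_isr(bool rearm)
{
    struct fake_dev d = { DEV_IRQ_PENDING, rearm, 0 };
    return dev_isr(&d);
}
```

What the caller does with a "true" result (loop, defer, log) is
the system-dependent part described above.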

Of course, this depends on exactly *how* the system (hardware)
architecture handles IRQs and whether or not they can be nested, etc.
I don't think this "liberal" approach would work well in
a system that supports a wide variety of devices -- most
of the systems I work with have fixed I/O.

But, it can be useful because it lets you know when you
have exceeded the "real-time" capabilities of the system...
when a device interrupts its own ISR, it is obvious
that your system can't keep up with it IN THE FACE OF THE
OTHER COMPETING DEVICES IN THE SYSTEM. (sorry to shout)
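
One way to make that condition visible is a simple "in service"
guard. The sketch below is only an illustration: the names
(isr_state, guarded_isr, count_overruns) are invented, nested
arrivals are simulated with a plain function call, and a real
handler would need volatile or atomic state appropriate to the
platform:

```c
#include <stdbool.h>

/* Hypothetical per-device bookkeeping for overrun detection. */
struct isr_state {
    bool     in_service;  /* set while the ISR body is running */
    unsigned overruns;    /* times the device interrupted its own ISR */
    unsigned completed;   /* ISR passes that ran to completion */
};

/* nested_arrivals simulates how many times the same device interrupts
 * again while this pass is still running with interrupts re-enabled. */
void guarded_isr(struct isr_state *s, int nested_arrivals)
{
    if (s->in_service) {
        s->overruns++;        /* we could not keep up with the device */
        return;
    }

    s->in_service = true;
    for (int i = 0; i < nested_arrivals; i++)
        guarded_isr(s, 0);    /* simulated re-entry from the same device */
    s->completed++;
    s->in_service = false;
}

/* Usage: report how many overruns a given arrival pattern produces. */
unsigned count_overruns(int nested_arrivals)
{
    struct isr_state s = { false, 0, 0 };
    guarded_isr(&s, nested_arrivals);
    return s.overruns;
}
```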

>The write gets delayed (in all sorts of places, the first is a fifo
>before the data cache) - so the ISR exits before the hardware has
>dropped its IRQ line.
>
>The system then takes another interrupt - which the device driver
>isn't expecting (ie no IRQ bits are set in the hw register).
>Falling off the list of isrs causes the error message.
>
>The only architecture independent way of ensuring the write actually
>happens is to read back the same location.  Hardware interlocks ensure
>the read bus cycle follows the write cycle - all the way down to the
>physical chip on the IO card.


Yes.  And, using the approach I mentioned (polling the IRQ just
prior to exiting the ISR) often does this as a side-effect
(since the register to "clear" the IRQ is often located at
the *write* address while the status register is located at the
same *read* address).
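
To illustrate, here is a toy model in C of a posted write and the
read-back that forces it out. Everything in it is an assumption
made for the example: the device is a struct, the write buffer is
a single queued value, and the names (posted_dev, reg_write_clear,
reg_read_status, isr_exit_line_state) are hypothetical rather
than any real bus API:

```c
#include <stdbool.h>
#include <stdint.h>

#define IRQ_BIT 0x1u   /* hypothetical bit: device is asserting its IRQ line */

/* Toy model of a device behind a write buffer. */
struct posted_dev {
    uint32_t status;        /* device-side state; IRQ_BIT set = line asserted */
    bool     write_queued;  /* a "clear" write is sitting in the buffer */
    uint32_t queued_clear;  /* bits that write will clear when it lands */
};

/* Write to the "clear" register: posted, so it has not reached the device. */
void reg_write_clear(struct posted_dev *d, uint32_t bits)
{
    d->write_queued = true;
    d->queued_clear |= bits;
}

/* Read the status register at the same address: bus ordering makes the
 * queued write complete at the device before the read returns. */
uint32_t reg_read_status(struct posted_dev *d)
{
    if (d->write_queued) {
        d->status &= ~d->queued_clear;
        d->queued_clear = 0;
        d->write_queued = false;
    }
    return d->status;
}

/* Usage: is the IRQ line still asserted when the ISR returns? */
bool isr_exit_line_state(bool read_back)
{
    struct posted_dev d = { IRQ_BIT, false, 0 };

    reg_write_clear(&d, IRQ_BIT);       /* acknowledge the interrupt */
    if (read_back)
        (void)reg_read_status(&d);      /* poll status; flushes the write */

    return (d.status & IRQ_BIT) != 0;   /* true = a stray interrupt is coming */
}
```

Without the read, the model leaves the line asserted at exit,
which is the stray-interrupt case David describes; with it, the
line has dropped before the ISR returns.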

>(I suspect that on the sparc cpu the kernel interrupt code could check
>that the IRQ line is still active before outputting the 'unhandled
>interrupt' message.  Last time I looked at sparc low level ISR code
>it was all done in software....)