Port-sparc64 archive


Re: lost interrupt on Blade 100/150 with "newer" OBP



On Sat, 16 Aug 2008, doomwarrior wrote:

> Hi everybody,
> 
> I got a Blade 150 today and ran into the "well known" lost interrupt issue of
> the aceride controller.
> I started to dig around for what could be the reason for this. It is known that
> OpenBSD doesn't have this problem anymore(?). If I remember correctly, Michael
> Lorenz mentioned some time ago that the interrupt mapping with the newer OBP
> could be the error. So I started to compare the source code of ofw_machdep.
> There are several fixes in OpenBSD, but mainly targeting sun4v. I grabbed
> OpenBSD 3.3 to ensure that none of those minor changes are linked to the Blade
> interrupt problem.

Rather than look at the differences in code bases it might be easier to 
just debug the problem.

The way interrupts work on UltraSPARC machines is that each bus controller 
has an interrupt concentrator.  Each interrupt is wired to a separate 
signal in the interrupt concentrator.  When the interrupt concentrator 
detects an interrupt, it sends an interrupt packet (or "mondo") over the 
UPA bus to one of the CPUs for dispatch.  When the CPU gets the packet, it 
looks up the source of the interrupt in the table and generates a hardware 
interrupt at the appropriate level.  But you shouldn't care about that 
part.

The interrupt concentrator has an interrupt enable and interrupt clear 
register for each possible interrupt source (and a few extras) as well as 
an interrupt state register.  OpenBoot advertises the interrupt numbers 
in the device node "intr" or "interrupts" property, and also has 
one or more "interrupt-map" properties to map that number to a device 
phandle and interrupt controller interrupt value.  This is documented in 
one of the 1275 recommended practices documents.  OF_mapintr() uses this 
algorithm to try to figure out which mondo OpenBoot is telling us to use.
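
To make the lookup concrete, here is a rough sketch of the matching step 
as I understand it from the interrupt mapping recommended practice: mask 
the child's unit address and interrupt number, then search the map for an 
entry with the same values.  This is not the OF_mapintr() code.  The 
struct layout, the single-cell address and the function name imap_lookup() 
are simplifications made up for illustration; real map entries are 
variable-length cell arrays.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified "interrupt-map" entry (hypothetical layout): one cell for
 * the child unit address and one for the child interrupt number.
 */
struct imap_entry {
	uint32_t child_addr;	/* masked child unit address */
	uint32_t child_intr;	/* masked child interrupt number */
	uint32_t parent;	/* phandle of the interrupt parent */
	uint32_t parent_intr;	/* interrupt value at the parent */
};

/*
 * Apply the "interrupt-map-mask" values to the child's address and
 * interrupt number, then look for a matching entry.
 * Returns 0 and fills in *parent and *parent_intr on a match,
 * or -1 if no entry matches.
 */
int
imap_lookup(const struct imap_entry *map, size_t n,
    uint32_t addr_mask, uint32_t intr_mask,
    uint32_t addr, uint32_t intr,
    uint32_t *parent, uint32_t *parent_intr)
{
	uint32_t a = addr & addr_mask;
	uint32_t i = intr & intr_mask;
	size_t k;

	for (k = 0; k < n; k++) {
		if (map[k].child_addr == a && map[k].child_intr == i) {
			*parent = map[k].parent;
			*parent_intr = map[k].parent_intr;
			return 0;
		}
	}
	return -1;
}
```

If the masks or the cell counts are read wrongly from the newer OpenBoot, 
a lookup of this kind either fails or lands on the wrong entry, which 
would give the driver the wrong mondo.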

What you can do is trigger the interrupt from the IDE driver, then 
examine the interrupt diagnostic registers to see which interrupt is in 
the pending state.  This will tell you what the mondo for that interrupt 
really is.  Then by comparing that to the results of OF_mapintr() you can 
probably find and fix the bug.
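
Something along these lines could decode a snapshot of the state register 
once you have read it.  It is only a sketch: I am assuming two state bits 
per interrupt source with 0 = idle, 1 = received and 3 = pending, and the 
names istate_field() and first_pending() are made up.  Check the field 
layout against the manual for the actual controller before trusting it.

```c
#include <stdint.h>

/* Assumed 2-bit state encoding per interrupt source. */
#define ISTATE_IDLE	0x0
#define ISTATE_RECEIVED	0x1
#define ISTATE_PENDING	0x3

/* Extract the state field for source number src (0..31). */
unsigned
istate_field(uint64_t reg, unsigned src)
{
	return (unsigned)((reg >> (2 * src)) & 0x3);
}

/*
 * Return the lowest-numbered source that is in the pending state,
 * or -1 if none of the first nsrc sources is pending.
 */
int
first_pending(uint64_t reg, unsigned nsrc)
{
	unsigned s;

	for (s = 0; s < nsrc && s < 32; s++) {
		if (istate_field(reg, s) == ISTATE_PENDING)
			return (int)s;
	}
	return -1;
}
```

The source number you get back is only an index into the register.  It 
still has to be translated to the interrupt number for that bus 
controller before you compare it with what OF_mapintr() returned.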

I keep thinking I'll do this one of these days, but I just don't have the 
time.

Eduardo 

