Subject: Re: USB controller goes nuts on Alpha
To: Lennart Augustsson <lennart@augustsson.net>
From: Nathan J. Williams <nathanw@MIT.EDU>
List: current-users
Date: 02/08/2000 15:23:07
Lennart Augustsson <lennart@augustsson.net> writes:

> Yes, probably.  You could debug the ohci_intr1() routine if you want to
> see what happens.

I started by commenting out the code that complains about scheduling
overruns, so that the "block unprocessed interrupts" code would trigger
and I could see what happened with the overrun ignored. No complaints
on boot, but when I plugged in the videocam I got:

ohci_intr: sc=0xfffffe000003f000 intrs=65(40) eintr=40
ohci_rhsc: sc=0xfffffe000003f000 xfer=0xfffffe0000003900
hstatus=0x00000000
ohci_rhsc: change=0x02
SPLUSBCHECK failed 0x3!=0x4, ../../../../dev/usb/usbdi.c:778

Diagnostic checks failing always give me hope that the actual problem
is close at hand, but it turned out not to be the case. Rather, the
SPLUSBCHECK macro is making inappropriate assumptions about the return
value from spl*():

#define SPLUSBCHECK \
        do { int _s = splusb(), _su = splusb(); \
             extern int cold; \
             if (!cold && _s != _su) { \
                     printf("SPLUSBCHECK failed 0x%x!=0x%x, %s:%d\n", \
                            _s, _su, __FILE__, __LINE__); } \
             splx(_s); \
        } while (0)

Comparing the return values from spl calls isn't really
appropriate; I don't think historical practice has ever let you do
anything with them besides hand them back to splx(). 

On the Alpha, device interrupts can come in at IPL 3 or
4; the spl calls that block device interrupts set the IPL to 4. The
ohci interrupt happened to come in at 3, so _s got the value 3 and _su
got 4, even though further ohci interrupts would be blocked. 
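
To make the failure mode concrete, here is a small user-space model of
the behaviour described above. It is only a sketch, not kernel code:
cur_ipl, model_splusb(), model_splx(), naive_check_fires() and
device_blocked() are all made-up names, and the model assumes that an
spl call raises the level to at least 4, never lowers it, and returns
the previous level.

```c
#include <assert.h>

/* Hypothetical model of the interrupt priority level (IPL). */
static int cur_ipl;

#define MODEL_IPL_USB 4  /* level the model's splusb() raises to */
#define MODEL_IPL_MAX 7  /* highest level the model accepts */

/* Raise to MODEL_IPL_USB unless already higher; return the old level. */
static int model_splusb(void)
{
        int old = cur_ipl;

        if (cur_ipl < MODEL_IPL_USB)
                cur_ipl = MODEL_IPL_USB;
        return old;
}

/* Restore a level previously returned by model_splusb(). */
static void model_splx(int s)
{
        assert(s >= 0 && s <= MODEL_IPL_MAX);
        cur_ipl = s;
}

/*
 * Same logic as SPLUSBCHECK: call splusb twice and compare the two
 * return values.  Returns 1 if the diagnostic would print.
 */
static int naive_check_fires(int entry_ipl)
{
        int s, su;

        cur_ipl = entry_ipl;
        s = model_splusb();     /* level on entry */
        su = model_splusb();    /* always >= MODEL_IPL_USB here */
        model_splx(s);
        return s != su;
}

/*
 * Illustrative alternative (an assumption, not a proposed fix): ask
 * whether an interrupt arriving at dev_ipl is blocked at entry_ipl.
 */
static int device_blocked(int entry_ipl, int dev_ipl)
{
        return entry_ipl >= dev_ipl;
}
```

In this model naive_check_fires(4) returns 0 and naive_check_fires(0)
returns 1, which is what the diagnostic is after. But
naive_check_fires(3) also returns 1, even though device_blocked(3, 3)
is true, which matches the 0x3!=0x4 message above.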

I don't know what the right way to fix this is. But it doesn't cause
real breakage, so I'm not going to worry about it too hard.


When I plug in a device now (with the overrun interrupt blocked) I get
a large pile of debugging output (saved at
http://web.mit.edu/nathanw/www/usb-plug-spew), but the final result is:

uhub0: device problem, disabling port 1

I don't know if this is worth looking into - would disabling the
overrun interrupt break the ohci state enough that other things would
be expected to fail?

        - Nathan