Subject: kern/23819: USB controller occasionally (randomly) dies.
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <rkr@olib.org>
List: netbsd-bugs
Date: 12/21/2003 12:10:00
>Number:         23819
>Category:       kern
>Synopsis:       USB controller occasionally (randomly) dies.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 21 12:11:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Richard Rauch
>Release:        NetBSD/amd64-current, and NetBSD/i386-1.6
>Organization:
n/a
>Environment:
NetBSD socrates 1.6ZG NetBSD 1.6ZG (GENERIC) #6: Thu Dec 18 20:28:19 CST 2003  root@socrates:/usr/netbsd/current/src/sys/arch/amd64/compile/obj.amd64/GENERIC amd64
>Description:
Every so often, the USB controller dies on me without warning, at random intervals.  On an AMD64 machine it happens frequently enough (and I rely on a USB pointer device enough) that uptimes over a day are painful.  The only cure is to reset the machine.
On an i386, it is not so bad, but still happens from time to time (*far* less often; weeks or more can go by).  I do not know that the two failures are related, but thought that I'd mention it anyway.
The amd64 fails with the message "usbd_setup_pipe: failed to start endpoint, ...", in src/sys/dev/usb/usb_subr.c:usbd_setup_pipe().
Because of these problems, I am on a text console using lynx to file this PR, so I cannot readily provide full dmesg output.  However, the USB controller on the AMD64 is an "ohci" (there are ohci0 and ohci1; uhub0 is on the former, and uhidev0, the trackball, attaches to that).
For the i386, it is similar (ohci again; usb0, usb1; uhub0 at usb0; all devices on uhub0; it is *not* running GENERIC, but rather a custom kernel from standard 1.6 kernel sources.)
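In lieu of full dmesg output, a minimal sketch of how the USB-related lines could be pulled out later is given here.  The device-name patterns are simply the names mentioned in this report plus the error string; the function name usb_lines and the output file name are only illustrative, and other USB drivers on a given machine would need their names added.

```shell
# Keep only kernel message lines that begin with one of the USB device
# names mentioned in this report (ohci, usb, uhub, uhidev), or that
# contain the usbd_setup_pipe error string.
usb_lines() {
    grep -E '^(ohci|usb|uhub|uhidev)[0-9]+|usbd_setup_pipe'
}

# Possible use on the affected machine (the file name is hypothetical):
#   dmesg | usb_lines > usb-dmesg.txt
```

Run right after a failure, before the reset, this would save the attach lines and any usbd_setup_pipe messages that are still in the kernel message buffer.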

The error with the i386 is so rare that I do not know offhand whether it is accompanied by any console error messages.  Sorry.  Unlike the AMD64 box, the i386 box does not have a non-USB keyboard, so when its USB dies, it is useless for local access and generally gets reset.
>How-To-Repeat:
Use one of these two machines for a while until the USB subsystem dies.

Sorry, I can't be more specific; the behavior is random.  I cannot firmly associate it with busy USB ports, with system load, or with heavy activity in general.

For the AMD64, it would help if I knew why a "pipe" has to be set up for a USB device in the middle of normal USB use.  Then I might be able to think of some pattern that eludes me.
>Fix:
No solution known.
>Release-Note:
>Audit-Trail:
>Unformatted: