Subject: kern/23819: USB controller occasionally (randomly) dies.
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <rkr@olib.org>
List: netbsd-bugs
Date: 12/21/2003 12:10:00
>Number: 23819
>Category: kern
>Synopsis: USB controller occasionally (randomly) dies.
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Dec 21 12:11:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator: Richard Rauch
>Release: NetBSD/amd64-current, and NetBSD/i386-1.6
>Organization:
n/a
>Environment:
NetBSD socrates 1.6ZG NetBSD 1.6ZG (GENERIC) #6: Thu Dec 18 20:28:19 CST 2003 root@socrates:/usr/netbsd/current/src/sys/arch/amd64/compile/obj.amd64/GENERIC amd64
>Description:
Periodically, I have had the USB controller die on me without warning. It occurs at random intervals. On an AMD64 machine it happens frequently enough (and I rely on a USB pointer device enough) that uptimes over a day are painful. The only cure is to reset the machine.
On an i386, it is not so bad, but still happens from time to time (*far* less often; weeks or more can go by). I do not know whether the two failures are related, but thought that I'd mention the i386 case anyway.
The amd64 fails with the message "usbd_setup_pipe: failed to start endpoint, ...", in src/sys/dev/usb/usb_subr.c:usbd_setup_pipe().
Because of these problems, I am on a text console using lynx to file this PR, so I cannot provide full dmesg output readily. However, the USB controller is an "ohci" for the AMD64 (there are ohci0 and ohci1; uhub0 is on the former, and to that uhidev0---the trackball---attaches).
For the i386, it is similar (ohci again; usb0, usb1; uhub0 at usb0; all devices on uhub0; it is *not* running GENERIC, but rather a custom kernel from standard 1.6 kernel sources.)
The error with the i386 is so rare that I do not know offhand whether it is accompanied by any console error messages. Sorry. Unlike the AMD64 box, the i386 box does not have a non-USB keyboard, so when its USB dies, it is useless for local access and generally gets reset.
>How-To-Repeat:
Use one of these two machines for a while until the USB subsystem dies.
Sorry, I can't be more specific; the behavior is random. I cannot firmly associate it with heavy activity on the USB ports or with system load.
For the AMD64, it would help if I knew why a "pipe" has to be set up for a USB device in the middle of normal USB use. Then I might be able to think of some pattern that eludes me.
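One way to look for such a pattern might be to scan saved kernel message text for the failure and note where it appears relative to other messages. The following is only a minimal sketch: the helper name find_pipe_failures and the idea of feeding it dmesg output are assumptions for illustration; only the message prefix is taken from the report above.

```python
# Hypothetical helper (not part of NetBSD): locate the usbd_setup_pipe
# failure message in saved kernel message text.
import sys

# Message prefix as quoted in the Description; the text after the comma
# is elided in the report, so it is not matched here.
FAILURE_PREFIX = "usbd_setup_pipe: failed to start endpoint"


def find_pipe_failures(log_text):
    """Return (line_number, line) pairs for lines containing the failure message."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if FAILURE_PREFIX in line:
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Read message text from standard input and print each matching line
    # prefixed with its line number.
    for number, line in find_pipe_failures(sys.stdin.read()):
        print(f"{number}: {line}")
```

Assuming the script is saved under a name such as find_failures.py (a hypothetical file name), it could be run as "dmesg | python3 find_failures.py" from the text console after a failure.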
>Fix:
No solution known.
>Release-Note:
>Audit-Trail:
>Unformatted: