Subject: ucom read error restart
To: None <tech-kern@netbsd.org>
From: Darrin B. Jewell <dbj@netbsd.org>
List: tech-kern
Date: 07/28/2004 01:29:50
--=-=-=


For reasons I have not yet been able to diagnose, uplcom at uhci
on my machine gets spurious CRCTO|STALLED errors on its bulk-in endpoint
when idle.  Unfortunately, the ucom.c ucomreadcb function does a
rather poor job of error recovery and when this happens a device close
and reopen is required to reset it.

The following patch tries an immediate restart.  Does this look ok?
I'm worried that in some situations the restart without delay could
cause it to spin.  Is that a likely scenario?  Would it lock anything,
or just gratuitously use cpu cycles until the device is closed?

This patch allows uplcom to be usable on my system.  I'd like to
commit it.


--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=ucom.c.diff
Content-Description: restart on ucom read error

Index: ucom.c
===================================================================
RCS file: /cvsroot/src/sys/dev/usb/ucom.c,v
retrieving revision 1.52
diff -u -r1.52 ucom.c
--- ucom.c	24 Nov 2003 19:47:07 -0000	1.52
+++ ucom.c	28 Jul 2004 05:23:18 -0000
@@ -1050,7 +1050,7 @@
 
 	DPRINTFN(5,("ucomreadcb: status=%d\n", status));
 
-	if (status == USBD_CANCELLED || status == USBD_IOERROR ||
+	if (status == USBD_CANCELLED || status == USBD_NOT_STARTED ||
 	    sc->sc_dying) {
 		DPRINTF(("ucomreadcb: dying\n"));
 		/* Send something to wake upper layer */
@@ -1061,10 +1061,11 @@
 		return;
 	}
 
-	if (status) {
+	if (status != USBD_NORMAL_COMPLETION) {
+		DPRINTF(("ucomreadcb: status %s\n", usbd_errstr(status)));
 		usbd_clear_endpoint_stall_async(sc->sc_bulkin_pipe);
-		/* XXX we should restart after some delay. */
-		return;
+		/* XXX should we delay before restart to avoid spinning? */
+		goto restart;
 	}
 
 	usbd_get_xfer_status(xfer, NULL, (void *)&cp, &cc, NULL);
@@ -1089,6 +1090,7 @@
 	}
 	splx(s);
 
+ restart:
 	err = ucomstartread(sc);
 	if (err) {
 		printf("%s: read start failed\n", USBDEVNAME(sc->sc_dev));

--=-=-=


The machine also gets the one or two of these same errors on
its uplcom interrupt endpoint per second, even when its not idle.
I'd like to figure out what's causing that, but that's a separate
problem.

Thanks,
Darrin

--=-=-=--