Subject: kern/25041: USB isochronous transfers
To: None <tech-kern@netbsd.org>
From: Iain Hibbert <plunky@rya-online.net>
List: tech-kern
Date: 05/03/2006 17:16:09
  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--0-126104382-1145398884=:8548
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-ID: <Pine.NEB.4.63.0604212117002.28464@localhost.>

Hi,
   I find when attempting to do USB isochronous transfers, the machine is
locking up in an interrupt loop. Well, I found the particular trouble, not
sure what to suggest about it, however.

in dev/usb/usbdi.c, function usb_transfer_complete() is called when a USB
transfer is complete, it does some cleanup, then calls the upper driver
callback

	if (xfer->callback)
		xfer->callback(xfer, xfer->priv, xfer->status);

then it calls the lower device method done routine, to do some cleanup.

	pipe->methods->done(xfer);

but in most cases so far as I can see (ugen and uaudio for isoc), the
callback function has already recycled the xfer. I wonder then, is it
supported to recycle the xfer in the callback function? There are several
PRs that seem to relate to this,

   1. http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=22218
      "USB isoc locks"

	this relates to ohci.c but its the same issue, that the xfer has
been recycled. I guess that this PR should be closed, since the suggested
fix was actioned (ohci.c 1.150-1.151) some time ago.

   2. http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=25041
      "Isochronous USB transfers from ugen hang on DIAGNOSTIC"

	this is the exact problem I have and so far as I can see, the
internals of the xfer structure are not relevant at this time (its on
queue rather than busy because its been requeued) so it makes no point to
act differently inside DIAGNOSTIC. My fix is attached, to remove that
check entirely.

   3. http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=25960
      "usb ugen(4) fix/enhancement"

	this is where the fix for pr/22218 above came from.

   4. http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=24636
      "potential invalid memory access"

	Another one that could probably be closed? the changes mentioned
here have already been actioned and he mentions the issue of the done
method touching the xfer though not quite in this context.

Looking deeper, ehci.c and ohci.c do nothing in the ->done function so
its not an issue there, but uhci.c uses it to cancel the interrupt, which
I guess is ok as its only queued at this stage and not actually active. It
definitely needs cancelling even though the xfer is queued again as when
it is not, the system locks up in an interrupt loop.

Now, I dont know if it can happen that the transfer can have passed
through the queue and be busy again at this point. If there is a USB
expert then maybe they could consider that?

I also wonder if it would be better to switch around the ->callback and
->done calls so that the upper layer is not called until the hardware is
finished? I dont know if that would be safe however, there may be other
issues.

iain
--0-126104382-1145398884=:8548
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME=diff
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.NEB.4.63.0604182321240.8548@localhost.>
Content-Description: uhci.c diff
Content-Disposition: ATTACHMENT; FILENAME=diff

SW5kZXg6IHVoY2kuYw0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KUkNTIGZp
bGU6IC9jdnNyb290L3NyYy9zeXMvZGV2L3VzYi91aGNpLmMsdg0KcmV0cmll
dmluZyByZXZpc2lvbiAxLjE5NA0KZGlmZiAtdSAtcjEuMTk0IHVoY2kuYw0K
LS0tIHVoY2kuYwkzMSBNYXIgMjAwNiAxNzozMToxMyAtMDAwMAkxLjE5NA0K
KysrIHVoY2kuYwkxOCBBcHIgMjAwNiAyMTo1Njo0OCAtMDAwMA0KQEAgLTI2
NTYsNyArMjY1Niw4IEBADQogew0KIAl1aGNpX2ludHJfaW5mb190ICppaSA9
ICZVWEZFUih4ZmVyKS0+aWluZm87DQogDQotCURQUklOVEZOKDQsICgidWhj
aV9pc29jX2RvbmU6IGxlbmd0aD0lZFxuIiwgeGZlci0+YWN0bGVuKSk7DQor
CURQUklOVEZOKDQsICgidWhjaV9pc29jX2RvbmU6IGxlbmd0aD0lZCwgYnVz
eV9mcmVlPTB4JTA4eFxuIiwNCisJCQl4ZmVyLT5hY3RsZW4sIHhmZXItPmJ1
c3lfZnJlZSkpOw0KIA0KIAlpZiAoaWktPnhmZXIgIT0geGZlcikNCiAJCS8q
IE5vdCBvbiBpbnRlcnJ1cHQgbGlzdCwgaWdub3JlIGl0LiAqLw0KQEAgLTI2
NjYsMTIgKzI2NjcsNiBAQA0KIAkJcmV0dXJuOw0KIA0KICNpZmRlZiBESUFH
Tk9TVElDDQotCWlmICh4ZmVyLT5idXN5X2ZyZWUgIT0gWEZFUl9CVVNZKSB7
DQotCQlwcmludGYoInVoY2lfZGV2aWNlX2lzb2NfZG9uZTogeGZlcj0lcCBu
b3QgYnVzeSAweCUwOHhcbiIsDQotCQkgICAgICAgeGZlciwgeGZlci0+YnVz
eV9mcmVlKTsNCi0JCXJldHVybjsNCi0JfQ0KLQ0KICAgICAgICAgaWYgKGlp
LT5zdGRlbmQgPT0gTlVMTCkgew0KICAgICAgICAgICAgICAgICBwcmludGYo
InVoY2lfZGV2aWNlX2lzb2NfZG9uZTogeGZlcj0lcCBzdGRlbmQ9PU5VTExc
biIsIHhmZXIpOw0KICNpZmRlZiBVSENJX0RFQlVHDQo=

--0-126104382-1145398884=:8548--