NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
kern/48151: USB race condition can lead to panic
>Number: 48151
>Category: kern
>Synopsis: USB race condition can lead to panic
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Aug 23 16:45:00 +0000 2013
>Originator: Aymeric Vincent
>Release: -current 6.99.23
>Organization:
>Environment:
NetBSD z.local 6.99.23 NetBSD 6.99.23 (Z) #12: Thu Aug 22 16:04:56 CEST 2013
vincent@z.local:/home/vincent/netbsd/src/sys/arch/amd64/compile/Z amd64
>Description:
usb_transfer_complete() releases the pipe lock to call the callback associated
with a xfer.
Such callbacks may (see usbd_bulk_transfer_cb()) cv_broadcast() on the xfer's
cv.
This will lead to usbd_transfer() being awoken and returning, which in turn
will lead to the xfer being freed (and reused to confuse debugging).
All this while the xfer is still being used by usb_transfer_complete() and a
crash occurs when it tries to use it again because the xfer is in an
inconsistent state (being freed/reallocated/reused).
>How-To-Repeat:
Transfer lots of data for several minutes through USB (in my case using ugen),
and notice a panic is triggered.
After enabling some debug output, notice that usually, ehci_idone() finishes
before usbd_bulk_transfer() resumes:
ehci_idone: xfer=0xfffffe810ff7b8a8, pipe=0xfffffe810cde7d50 ready
ehci_idone: len=1024, actlen=1024, status=0x8d00
usb_freemem: large free
ehci_idone: ex=0xfffffe810ff7b8a8 done
usbd_bulk_transfer: transferred 1024
... except just before the crash, where ehci_idone() does not finish before the
same xfer gets reused:
ehci_idone: xfer=0xfffffe810ff7b8a8, pipe=0xfffffe810cde7d50 ready
ehci_idone: len=1024, actlen=1024, status=0x8d00
usb_freemem: large free
usbd_bulk_transfer: transferred 1024
panic: uskbedr_nbeull kd_itargannossfteirc: asstsaerrtt iotnr a"ncsvf_iesr_ va1
l0i2d4( cbv)" tfeasi
led: ufslbe_ a"l.l.o/c.m.e/m.:. /l.a.r/gkee ranl/lkoecr n1_0c2o4n
dvar.c", line 340
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x136
kern_assert() at netbsd:kern_assert+0x48
cv_broadcast() at netbsd:cv_broadcast+0x53
usb_transfer_complete() at netbsd:usb_transfer_complete+0x253
ehci_idone() at netbsd:ehci_idone+0x19d
ehci_softintr() at netbsd:ehci_softintr+0x40
usb_soft_intr() at netbsd:usb_soft_intr+0x24
softint_dispatch() at netbsd:softint_dispatch+0xd9
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe80a21b7d70
Xsoftintr() at netbsd:Xsoftintr+0x4f
>Fix:
I submit this PR because it's not clear to me what is the right solution. The
cv_broadcast/wakeup is done in two places :
in the callbacks called by usb_transfer_complete()
right after, directly in usb_transfer_complete()
Removing one of the two is required, but I don't know which is intended
conceptually. Removing any of the two should fix the race because the only
reason the xfer is reused in usb_transfer_complete() is to do this
cv_broadcast()...
Home |
Main Index |
Thread Index |
Old Index