tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: usbdi(9) improvements
New patch to fix abort races and help USB device drivers -- with a
less intrusive change to aborting pipes for now. Since this still
changes the usbdi(9) API and ABI, though, it will require a kernel
version bump.
Review, testing, and comments welcome!
Changes to usbdi(9):
- The following functions which previously returned a usbd_status but
never failed in practice now return void:
. usbd_close_pipe
. usbd_abort_pipe
. usbd_abort_default_pipe
- usbd_transfer no longer races with usbd_abort_pipe to detect when
the pipe is currently being aborted.
- New functions usbd_suspend_pipe and usbd_resume_pipe.
. usbd_suspend_pipe fails any new and future xfers with
USBD_CANCELLED.
. usbd_resume_pipe allows xfers on a suspended pipe again.
The existing function usbd_abort_pipe is now the same as doing
usbd_suspend_pipe and then usbd_resume_pipe -- no change in its
semantics.
For many drivers, the difference between usbd_abort_pipe and
usbd_suspend_pipe is immaterial because they only
open/transfer/abort/close, in that order.
But there are some drivers like ucom(4) which (a) open pipes on
attach, (b) abort on pipes on close, (c) expect to be able to use
the pipes again on reopen, and (d) only close pipes on detach.
These drivers need the effect of usbd_abort_pipe to be transient.
There are also some drivers that need usbd_suspend_pipe semantics
(persistently cancel xfers on the pipe), and _not_ usbd_abort_pipe
semantics (cancel current xfers, allow future xfers). Specifically,
drivers that issue _synchronous_ transfers like usbd_sync_transfer
or usbd_bulk_transfer need usbd_suspend_pipe to avoid a race between
I/O and abort:
/* I/O thread */
if (sc->sc_dying)
return EIO;
// (*)
err = usbd_bulk_transfer(...);
/* abort thread */
sc->sc_dying = true;
usbd_abort_pipe(...);
wait_for_io_thread(...);
usbd_destroy_xfer(...);
If the abort thread runs at the time of (*) in the I/O thread, this
is a deadlock. Using usbd_suspend_pipe instead of usbd_abort_pipe
resolves this deadlock -- usbd_bulk_transfer will fail with
USBD_CANCELLED in this case.
=> The previous patch series made the effect of usbd_abort_pipe
itself persistent, but that changed the semantics in a way that
would break drivers, and I don't want to have to audit all the
drivers -- and, e.g., change ucom(4) to open pipes on open
instead of attach, and close pipes on close instead of detach --
before netbsd-10.
=> Another alternative is to give usbd_bulk_transfer an interlock
like cv_wait, so sc_dying and the transfer/abort can be done
under the interlock to avoid a race. The usbd_suspend_pipe
semantics seems like a simpler way to solve this problem.
Changes to host controller interface API:
- struct usbd_pipe_methods::upm_transfer is no longer responsible for
calling usb_insert_transfer to insert the transfer in the pipe's
queue. usbd_transfer takes care of that itself.
This was necessary to fix the usbd_transfer / usbd_abort_pipe race
mentioned above.
- struct usbd_pipe_methods::upm_transfer and upm_start are now called
under the pipe lock.
The logic in usbd_transfer is now basically:
usbd_lock_pipe(pipe);
if (pipe->up_aborting) fail with USBD_ECANCELLED;
SIMPLEQ_INSERT(pipe->up_queue, xfer);
pipe->up_methods->upm_transfer(xfer);
// usually calls upm_start
usbd_unlock_pipe(pipe);
This fixes racy access to the pipe's queue, which could interact
badly with abort (and might explain any of various obscure USB crash
PRs we have open).
- struct usbd_pipe_methods::upm_transfer is forbidden from returning
USBD_NORMAL_COMPLETION for asynchronous xfers. This would imply
that the xfer has completed, which must not happen synchronously,
because the caller may hold locks that the callback needs to take,
and doing that synchronously would lock-against-self.
This rule was already implicit in the logic, but there is now a
KASSERT to detect it.
- struct usbd_bus_methods::ubm_abortx is no longer responsible for
calling usb_transfer_complete -- usbd_xfer_abort now does that.
Note that .ubm_abortx MUST NOT release the pipe (bus) lock; this was
a bug in xhci(4) abort.
- struct usbd_bus_methods::ubm_rhctrl is now called under the bus
lock.
(I committed the xhci(4) abort/clear-endpoint-stall bug fix that was
included in the previous patch series -- it does not depend on any of
this, and it may be pullable-up to netbsd-9 this way.)
Home |
Main Index |
Thread Index |
Old Index