tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Locking strategy for device deletion (also see PR kern/48536)



   Date: Wed, 8 Jun 2016 19:47:38 +0800 (PHT)
   From: Paul Goyette <paul%whooppee.com@localhost>

   For some reason, the device's open() routine is being called twice, but 
   the close() routine is only called once.  So every iteration leaves the 
   refcount with a net increment of one.

Correct!  But, counterintuitively, this is actually not relevant to
your interests.  There are two cases to consider: typical bdev/cdev
instances (most of them), and cloning devices (a handful: rnd, drm,
putter, &c.).

Typical bdev/cdev instances have no state associated with each open.
Hence there's no need to do additional reference counting, and nothing
special about fooopen/fooclose.  Suppose someone has an open file
descriptor for /dev/rwd1e when wd_modcmd does devsw_detach, and tries
to do xyz at about the same time:

- If a cdev_xyz operation is *currently* in flight, calling wdxyz,
then devsw_detach must wait for control to return from wdxyz back to
cdev_xyz -- hence my advice to do cdevsw_lookup_acquire/cdevsw_release
in cdev_xyz.

- If a cdev_xyz operation is attempted *after* devsw_detach, then
cdevsw_lookup(_acquire) will return NULL and cdev_xyz will fail with
ENXIO.

xyz may be any cdev operation here: open, close, read, write, ioctl,
frobnosticate, &c.  This means that processes may still be trying to
use file descriptors for the device that just vanished -- but that's
OK, they'll just get ENXIO, and you wouldn't want them to block
unloading the module[*].

Cloning devices do have state associated with each open -- and their
fo_close method is called when *that* open is closed.  So you still
don't need to do any additional reference counting in general for that
state.  (Of course, specific cases may refer to some shared state, and
in such cases the module fini must deal with that shared state too.)

In brief, for bdev/cdev instances, the only additional reference
counting you need to do -- in the general sense including global
reference counts, passive references, per-CPU reference counts, &c. --
is for each individual bdev_*/cdev_* operation.


Note that this is all independent of autoconf device state -- i.e.,
device_lookup/device_lookup_private -- but essentially the same
considerations apply: we need to

1. replace device_lookup by device_lookup_acquire,
2. add device_release when done to each caller, and
3. teach config_cfdriver_detach to
   (a) cause device_lookup_acquire calls to fail and then
   (b) wait until all acquired handles have been released.


[*] Such processes may have pending I/O operations in flight too --
not just actively fooread or foowrite, but (e.g.) asynchronous USB
transfers, if we had them.

In that case, before devsw_detach, you may need -- in the specific
case of your driver -- to (a) set a flag (say, uwotsit_dying = 1) that
prevents new I/O from being issued, (b) cancel all pending I/O
operations, wake any condvars that are normally woken only from device
interrupts, and then (c) wait for any threads or workqueues or
whatever associated with these I/O operations to have complete/drain.

Once again, it's another instance of the pattern of (a) prevent new
users, (b) (cancel users and) wait for existing users to drain, and
then (c) close up shop.  Much like the checkout queue at a grocery
store near closing time!


Home | Main Index | Thread Index | Old Index