tech-kern archive
Re: Locking strategy for device deletion (also see PR kern/48536)
Date: Wed, 8 Jun 2016 17:34:33 +0000
From: David Holland <dholland-tech%netbsd.org@localhost>
On Tue, Jun 07, 2016 at 06:28:11PM +0800, Paul Goyette wrote:
> Can anyone suggest a reliable way to ensure that a device-driver module can
> be _really_ safely detached?
There's a pserialize scheme for this; see e.g. an old thread called
"kicking everybody out of the softc".
Skimming that thread prompted me to write a sketch -- attached -- of
the per-CPU counts I've mentioned several times in this thread, which
is a bit simpler than dyoung@'s proposal back in 2010.
The reason I use per-CPU counts here instead of psref(9) or per-CPU
reference lists is to allow the caller of device_lookup_acquire to
migrate to another CPU before device_release, which is fairly likely
for operations that wait for I/O, but which is forbidden for psref(9).
The catch for arbitrary drivers is that you need to do the enter
business using the softc *before* using [cb]devsw to jump into the
driver, but the current way all of that works is the other way
around. This really needs to be restructured...
This is not a catch -- this is just the distinction between autoconf
device state (device_lookup) and userland-exposed device nodes
(bdev/cdev). Detach both before unmapping the module code and you're
safe.
I think the order doesn't even matter: if you detach the autoconf
device first, then xyzread's call to device_lookup will fail; if you
detach the devsw first, then no more calls to xyzread can happen. But
there may be other constraints that imply some order, which I don't
have in my head at the moment.
struct device {
	...
	struct percpu	*dv_usecounts;		/* int64_t */
	int64_t		*dv_usecount_total;
	bool		dv_dying;
	...
};
/*
 * device_diediedie:
 *
 *	Prevent new references to dev and wait for all existing
 *	references to drain.  Caller must hold alldevs_mtx, which will
 *	be released during the wait.
 *
 *	XXX To be called from config_devunlink, I think, as the first
 *	thing it does.
 */
void
device_diediedie(device_t dev)
{
	int64_t total = 0;

	KASSERT(mutex_owned(&alldevs_mtx));
	KASSERT(!dev->dv_dying);

	/* Mark it dying and record where the total tally goes. */
	dev->dv_usecount_total = &total;
	dev->dv_dying = true;

	/*
	 * Count up the total on all CPUs.  The cross-call doubles as a
	 * global memory barrier to ensure visibility of dv_dying =
	 * true on all CPUs before the cv_wait; xc_wait ensures every
	 * CPU's delta has been added before we inspect the total.
	 */
	mutex_exit(&alldevs_mtx);
	xc_wait(xc_broadcast(0, &device_usecount_xc, dev, NULL));
	mutex_enter(&alldevs_mtx);

	KASSERTMSG((0 <= total), "negatively referenced device: %p, %"PRId64,
	    dev, total);
	while (total != 0)
		cv_wait(&device_release_cv, &alldevs_mtx);

	/* Cause subsequent attempts to add to the total to crash. */
	dev->dv_usecount_total = NULL;
}
static void
device_usecount_xc(void *cookie0, void *cookie1 __unused)
{
	device_t dev = cookie0;
	int64_t *usecountp;

	mutex_enter(&alldevs_mtx);
	usecountp = percpu_getref(dev->dv_usecounts);
	*dev->dv_usecount_total += *usecountp;
	percpu_putref(dev->dv_usecounts);
	mutex_exit(&alldevs_mtx);
}
/*
 * device_lookup_acquire:
 *
 *	Look up a device instance for a given driver and add a
 *	reference to it.
 *
 *	If found, return the device private, and store the device
 *	pointer in *devp, which the caller must later release with
 *	device_release.  Otherwise, return NULL and store NULL in
 *	*devp.
 *
 *	XXX In principle, this could be done with pserialize(9) and
 *	involve no interprocessor synchronization at all, by changing
 *	the management of cd->cd_ndevs and cd->cd_devs.
 */
void *
device_lookup_acquire(cfdriver_t cd, int unit, device_t *devp)
{
	device_t dev;

	mutex_enter(&alldevs_mtx);
	if (unit < 0 || unit >= cd->cd_ndevs) {
		dev = NULL;
	} else if ((dev = cd->cd_devs[unit]) == NULL) {
		/* nothing attached at this unit */
	} else if (dev->dv_del_gen != 0 || dev->dv_dying) {
		dev = NULL;
	} else {
		int64_t *usecountp = percpu_getref(dev->dv_usecounts);
		*usecountp += 1;
		percpu_putref(dev->dv_usecounts);
	}
	mutex_exit(&alldevs_mtx);

	*devp = dev;
	return dev ? device_private(dev) : NULL;
}
/*
 * device_release:
 *
 *	Release a device acquired with device_lookup_acquire.  If dev
 *	is not in the process of dying, involves no interprocessor
 *	synchronization.
 */
void
device_release(device_t dev)
{
	int64_t *usecountp;
	bool dying;

	/*
	 * percpu_getref disables preemption, so the cross-call in
	 * device_diediedie cannot run on this CPU between the
	 * dv_dying check and the adjustment of the local delta.
	 */
	usecountp = percpu_getref(dev->dv_usecounts);
	dying = dev->dv_dying;
	if (__predict_true(!dying)) {
		/* Fast path: adjust only the local delta. */
		*usecountp -= 1;
	}
	percpu_putref(dev->dv_usecounts);

	if (__predict_false(dying)) {
		/*
		 * Slow path: the cross-call may already have counted,
		 * or may yet count, this CPU's delta, so touch only
		 * the global total; adjusting both would double-count
		 * this release when the delta is summed.
		 *
		 * Broadcast unconditionally, not conditional on the
		 * local delta, because our reference may have
		 * migrated from another CPU.
		 */
		mutex_enter(&alldevs_mtx);
		*dev->dv_usecount_total -= 1;
		cv_broadcast(&device_release_cv);
		mutex_exit(&alldevs_mtx);
	}
}
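For completeness, here is roughly how a driver entry point would use the
pair; this is illustrative kernel pseudocode, not part of the sketch
above, and the xyz names, softc layout, and helper are made up:

```c
/* Illustrative only: a hypothetical character-device read routine. */
static int
xyzread(dev_t devno, struct uio *uio, int ioflag)
{
	struct xyz_softc *sc;
	device_t dev;
	int error;

	/*
	 * Take a per-CPU reference before touching the softc; this
	 * fails once device_diediedie has set dv_dying.
	 */
	sc = device_lookup_acquire(&xyz_cd, minor(devno), &dev);
	if (sc == NULL)
		return ENXIO;

	/* Do the I/O; this may sleep and migrate to another CPU. */
	error = xyz_do_read(sc, uio, ioflag);

	/* Drop the reference, possibly on a different CPU. */
	device_release(dev);
	return error;
}
```

The point is that the acquire/release pair brackets the whole entry
point, so device_diediedie cannot complete while any thread is inside
the driver.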