tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Locking strategy for device deletion (also see PR kern/48536)



Thanks, everyone, for clarifying this confusing-to-me situation!

For now, I'm just going to maintain a zero/non-zero "active" flag in the driver, setting it on open() and clearing on close().

Unloading of the module will first call devsw_detach(), and then check the flag. If the flag is set, call devsw_attach() to reinsert the device, and fail the unload operation.

All accesses to the flag (set, clear, and check) will be protected by a mutex.


If Taylor's device_lookup_acquire() gets implemented, then I can revisit the synchronization for ipfilter.


Skimming that thread prompted me to write a sketch -- attached -- of
the per-CPU counts I've mentioned several times in this thread, which
is a bit simpler than dyoung@'s proposal back in 2010.

BTW, reading Taylor's sketch, it looks like there are some very small windows where the mutex is released (within the xcall?) after the CPU's count has been added to the total. Is it possible for someone to sneak in and add a new reference before everything is finished?

Also, I'm not quite sure I understand why we have the pointer-to-total and why we have the possibility of setting to NULL (and creating an intentional chance for a crash?)

	/* Cause subsequent attempts to add to the total to crash.  */

Are these two observations related? And would it be reasonable to do a KASSERT(dev->dv_usecount_total != NULL) within the xcall() routine so we would at least get a meaningful panic message? :)


The reason I use per-CPU counts here instead of psref(9) or per-CPU
reference lists is to allow the caller of device_lookup_acquire to
migrate to another CPU before device_release, which is fairly likely
for operations that wait for I/O, but which is forbidden for psref(9).

  The catch for arbitrary drivers is that you need to do the enter
  business using the softc *before* using [cb]devsw to jump into the
  driver, but the current way all of that works is the other way
  around. This really needs to be restructured...

This is not a catch -- this is just the distinction between autoconf
device state (device_lookup) and userland-exposed device nodes
(bdev/cdev).  Detach both before unmapping the module code and you're
safe.

I think even the order doesn't even matter: if you detach the autoconf
device first, then xyzread's call to device_lookup will fail; if you
detach the devsw first, then no more calls to xyzread can happen.  But
there may be other constraints that imply some order, which I don't
have in my head at the moment.


Just wondering here - for pseudo-devices, is it really necessary to remove the autoconf stuff at all? It's not clear to me where it even gets added in the case of modularly-loaded pseudo-devices...



+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+


Home | Main Index | Thread Index | Old Index