Re: Locking strategy for device deletion (also see PR kern/48536)

To: David Young <dyoung%pobox.com@localhost>
Subject: Re: Locking strategy for device deletion (also see PR kern/48536)
From: Paul Goyette <paul%whooppee.com@localhost>
Date: Wed, 8 Jun 2016 05:53:55 +0800 (PHT)

On Tue, 7 Jun 2016, David Young wrote:

On Tue, Jun 07, 2016 at 06:28:11PM +0800, Paul Goyette wrote:

Can anyone suggest a reliable way to ensure that a device-driver
module can be _really_ safely detached?

The module could theoretically maintain an open/ref counter, but
making this MP-safe is "difficult"!  Even if the module were to
provide a mutex to control increment/decrement of its counter,
there's still a problem:

Thread 1 initiates a module-unload, which takes the mutex

Thread 2 attempts to open the device (or one of its units), attempts to
grab the mutex, and waits

Back in thread 1, the driver's module unload code determines that it
is safe to unload (no current activities queued, no current opens),
so it goes forward and unmaps the module - including the mutex!


I think that what's missing is a flag on the module that says it is
unloading, and module entrance/exit counters.  I think it could work
sort of like this---the devil is in the details:

Thread 1 initiates a module unload:
	1) Acquires mutex
	2) Sets the module's unloading flag
	3) Unlinks module entry points---that is, they're still mapped,
	   but there are no more globally-visible pointers to them
	4) While module entrances > exits, sleeps on module condition
	   variable C, thus temporarily releasing mutex
	5) Releases mutex
	6) Unmaps module

Thread 2 attempts to open the device
	1) Increases module-entrance count
	2) Acquires mutex
	3) Examines unloading flag
		a) Finding it set, signals condition variable C,
		b) OR, finding it NOT set, performs open
	4) Increases module-exit count
	5) Releases mutex
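
The two step lists above can be sketched in user-space C with POSIX
threads. This is only an illustration under stated assumptions, not
kernel code: `struct modgate`, `modgate_init`, `modgate_unload`,
`modgate_open` and `demo_open` are hypothetical names, the `mapped`
flag merely stands in for the module's mapping, and the bookkeeping
is assumed to live outside the module so that the mutex is not
unmapped along with it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical bookkeeping, assumed to live in the kernel proper so
 * that it stays valid after the module itself has been unmapped.
 */
struct modgate {
	pthread_mutex_t lock;
	pthread_cond_t  drained;	/* condition variable C */
	bool            unloading;
	bool            mapped;		/* stand-in for the module mapping */
	int           (*entry)(void);	/* globally visible entry point */
	atomic_ulong    entrances;
	atomic_ulong    exits;
};

/* Placeholder for the driver's real open routine. */
int
demo_open(void)
{
	return 0;
}

void
modgate_init(struct modgate *g, int (*entry)(void))
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->drained, NULL);
	g->unloading = false;
	g->mapped = true;
	g->entry = entry;
	atomic_init(&g->entrances, 0);
	atomic_init(&g->exits, 0);
}

/* Thread 1: module unload, steps 1-6. */
void
modgate_unload(struct modgate *g)
{
	pthread_mutex_lock(&g->lock);			/* 1 */
	assert(g->mapped);
	g->unloading = true;				/* 2 */
	g->entry = NULL;				/* 3: unlink entry point */
	while (atomic_load(&g->entrances) > atomic_load(&g->exits))
		pthread_cond_wait(&g->drained, &g->lock);	/* 4 */
	pthread_mutex_unlock(&g->lock);			/* 5 */
	g->mapped = false;				/* 6: "unmap" */
}

/* Thread 2: open.  Returns 0 on success, -1 if the module is unloading. */
int
modgate_open(struct modgate *g)
{
	int error;

	atomic_fetch_add(&g->entrances, 1);		/* 1 */
	pthread_mutex_lock(&g->lock);			/* 2 */
	if (g->unloading) {				/* 3 */
		pthread_cond_signal(&g->drained);	/* 3a */
		error = -1;
	} else {
		error = g->entry();			/* 3b */
	}
	atomic_fetch_add(&g->exits, 1);			/* 4 */
	pthread_mutex_unlock(&g->lock);			/* 5 */
	return error;
}
```

In this sketch the open runs with the mutex held, as in the step
list, so opens are serialized; whether that is acceptable for a real
driver is a separate question.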

The module entrance/exit counts can be per-CPU variables that you
increment using non-interlocked atomic instructions, which are not very
expensive.

Now, I am trying to remember if/why counting entrances and exits
separately is necessary.  ISTM that to avoid races, you want to add up
exits across all CPUs, first, then add up entrances, and compare.
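
One possible reading of the summing order, as a sketch only: every
thread bumps an entrance count before it bumps an exit count, so if
the exits are totalled first, each exit seen should already have its
matching entrance recorded by the time the entrances are totalled.
The names `percpu_counts`, `count_enter`, `count_exit`,
`module_drained` and the `NCPU` value are hypothetical, and the
memory barriers a real MP kernel would need are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU 4	/* hypothetical CPU count for the sketch */

/* One entrance and one exit counter per CPU. */
struct percpu_counts {
	unsigned long entrances[NCPU];
	unsigned long exits[NCPU];
};

/* Record an entrance on the CPU the caller is running on. */
void
count_enter(struct percpu_counts *pc, size_t cpu)
{
	assert(cpu < NCPU);
	pc->entrances[cpu]++;
}

/* Record an exit; the thread may have migrated to another CPU. */
void
count_exit(struct percpu_counts *pc, size_t cpu)
{
	assert(cpu < NCPU);
	pc->exits[cpu]++;
}

static unsigned long
sum(const unsigned long v[NCPU])
{
	unsigned long total = 0;

	for (size_t i = 0; i < NCPU; i++)
		total += v[i];
	return total;
}

/*
 * Total the exits first, then the entrances, and compare.  Only the
 * totals are meaningful: a thread can enter on one CPU and exit on
 * another, so a single CPU's pair of counters need not balance.
 */
bool
module_drained(const struct percpu_counts *pc)
{
	unsigned long exits = sum(pc->exits);		/* first */
	unsigned long entrances = sum(pc->entrances);	/* second */

	return entrances == exits;
}
```

Totalling in the opposite order could, in principle, pick up an exit
whose entrance was recorded after the entrance total was taken, which
may be why the two counts are kept separately.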

This is not necessarily the best or only way to handle this, and I feel
sure that I've overlooked a fatal flaw in this first draft.


Some comments:

* Thread 2 doesn't know that the device's code resides in a module
* I assume that the module-entrance and module-exit counts would each be
  a single per-CPU variable, not specific to each module?
* It "feels" rather complicated.  Of course, that might be a "good
  thing" (tm) since MP-safe interlocking is complicated.  :)


+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
