tech-kern archive


Re: Locking strategy for device deletion (also see PR kern/48536)



On Tue, Jun 07, 2016 at 06:28:11PM +0800, Paul Goyette wrote:
> Can anyone suggest a reliable way to ensure that a device-driver
> module can be _really_ safely detached?
> 
> The module could theoretically maintain an open/ref counter, but
> making this MP-safe is "difficult"!  Even if the module were to
> provide a mutex to control increment/decrement of its counter,
> there's still a problem:
> 
> Thread 1 initiates a module-unload, which takes the mutex
> 
> Thread 2 attempts to open the device (or one of its units), attempts to
> grab the mutex, and waits
> 
> Back in thread 1, the driver's module unload code determines that it
> is safe to unload (no current activities queued, no current opens),
> so it goes forward and unmaps the module - including the mutex!

I think that what's missing is a flag on the module that says it is
unloading, and module entrance/exit counters.  I think it could work
sort of like this---the devil is in the details:

Thread 1 initiates a module unload:
	1) Acquires mutex
	2) Sets the module's unloading flag
	3) Unlinks module entry points---that is, they're still mapped,
	   but there are no more globally-visible pointers to them
	4) While module entrances > exits, sleeps on module condition
	   variable C, thus temporarily releasing mutex
	5) Releases mutex
	6) Unmaps module

Thread 2 attempts to open the device
	1) Increases module-entrance count
	2) Acquires mutex
	3) Examines unloading flag
		a) Finding it set, signals condition variable C,
		b) OR, finding it NOT set, performs open
	4) Increases module-exit count
	5) Releases mutex
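
The two step lists above can be modeled in userspace roughly as follows.
This is only a sketch under stated assumptions, not kernel code: it uses
POSIX threads and C11 atomics in place of kernel primitives, the names
modstate, module_open and module_unload are hypothetical, the state is
assumed to live outside the module's own mapping, and step 3 of the
unload (unlinking the entry points) is not modeled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical framework-side state.  It is assumed to live outside the
 * module's mapping, so the mutex survives the unmap. */
struct modstate {
	pthread_mutex_t lock;
	pthread_cond_t  cv;         /* condition variable C */
	bool            unloading;  /* the "unloading" flag */
	bool            mapped;     /* stands in for the real mapping */
	atomic_ulong    entrances;
	atomic_ulong    exits;
	unsigned long   opens;      /* successful opens, for illustration */
};

static struct modstate mod = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.cv = PTHREAD_COND_INITIALIZER,
	.mapped = true,
};

/* Thread 2: returns 0 if the open was performed, -1 if it was refused. */
static int
module_open(void)
{
	int error = -1;

	atomic_fetch_add(&mod.entrances, 1);    /* 1) count the entrance */
	pthread_mutex_lock(&mod.lock);          /* 2) acquire mutex */
	if (mod.unloading) {                    /* 3) examine the flag */
		pthread_cond_signal(&mod.cv);   /* 3a) wake the unloader */
	} else {
		mod.opens++;                    /* 3b) perform the open */
		error = 0;
	}
	atomic_fetch_add(&mod.exits, 1);        /* 4) count the exit */
	pthread_mutex_unlock(&mod.lock);        /* 5) release mutex */
	return error;
}

/* Thread 1: waits until every counted entrance has a matching exit. */
static void
module_unload(void)
{
	pthread_mutex_lock(&mod.lock);          /* 1) acquire mutex */
	assert(mod.mapped);                     /* unloading twice is a bug */
	mod.unloading = true;                   /* 2) set the flag */
	/* 3) unlinking the entry points is not modeled in this sketch */
	for (;;) {                              /* 4) wait for stragglers */
		unsigned long exits = atomic_load(&mod.exits);
		unsigned long entrances = atomic_load(&mod.entrances);
		if (entrances <= exits)
			break;
		pthread_cond_wait(&mod.cv, &mod.lock);
	}
	pthread_mutex_unlock(&mod.lock);        /* 5) release mutex */
	mod.mapped = false;                     /* 6) "unmap" the module */
}
```

Because the exit count is bumped before the mutex is released, the
unloader cannot reacquire the mutex and re-check the counts until the
signalling thread's exit is already visible.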

The module entrance/exit counts can be per-CPU variables that you
increment using non-interlocked atomic instructions, which are not very
expensive.

Now, I am trying to remember if/why counting entrances and exits
separately is necessary.  ISTM that to avoid races, you want to add up
exits across all CPUs, first, then add up entrances, and compare.
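
One possible reading of why separate counts help: a thread may enter on
one CPU and exit on another, so a single net counter per CPU can sum to
zero mid-scan while some thread is still inside.  With separate counts,
every exit seen in the first pass has its entrance already recorded by
the time the second pass runs.  A minimal model of that summation order
follows; NCPU, the array and the function names are hypothetical, C11
atomics stand in for per-CPU increments, and the cost of the increments
is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU 4	/* hypothetical CPU count for the model */

/* One pair of counters per CPU; each CPU only ever bumps its own pair. */
struct percpu_count {
	atomic_ulong entrances;
	atomic_ulong exits;
};

static struct percpu_count counts[NCPU];

/* Record an entrance on the CPU the caller is currently running on. */
static void
count_enter(size_t cpu)
{
	assert(cpu < NCPU);
	atomic_fetch_add(&counts[cpu].entrances, 1);
}

/* Record an exit; the CPU may differ from the one used for the entrance. */
static void
count_exit(size_t cpu)
{
	assert(cpu < NCPU);
	atomic_fetch_add(&counts[cpu].exits, 1);
}

/* True if no counted entrance is still missing its exit. */
static bool
counts_idle(void)
{
	unsigned long exits = 0, entrances = 0;
	size_t i;

	/* Sum the exits first ... */
	for (i = 0; i < NCPU; i++)
		exits += atomic_load(&counts[i].exits);
	/* ... then the entrances, so every exit seen above is matched. */
	for (i = 0; i < NCPU; i++)
		entrances += atomic_load(&counts[i].entrances);
	return entrances == exits;
}
```

For example, a thread that calls count_enter(0), migrates, and later
calls count_exit(1) leaves both per-CPU pairs unbalanced on their own,
yet the totals still compare equal once it has left.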

This is not necessarily the best or only way to handle this, and I feel
sure that I've overlooked a fatal flaw in this first draft.

Dave

-- 
David Young
dyoung%pobox.com@localhost    Urbana, IL    (217) 721-9981

