tech-kern archive


Re: A small improvement for LOCKDEBUG



On Sat, Apr 04, 2009 at 09:06:43AM +0100, David Laight wrote:
> On Sat, Apr 04, 2009 at 06:50:25AM +0000, Andrew Doran wrote:
> > We have a set of adaptive mutexes and rwlocks in the kernel that are subject to
> > constraints that are not well documented. Some examples:
> > 
> > - can be acquired by the VM system, like proc::p_lock, bufcache_lock.
> > - can be acquired from soft interrupts.
> > - must not be held "long term" for other reasons.
> > 
> > So allocating memory or waiting long term with these locks held can cause
> > deadlock.
> > 
> >     p = curproc;
> >     mutex_enter(p->p_lock);
> >     /* this can recurse on p_lock or stall callout processing, etc */
> >     x = kmem_alloc(sizeof(*x), KM_SLEEP);
> >     mutex_exit(p->p_lock);
> > 
> > We don't have good runtime checks to catch this. An idea that occurred to me
> > was to add a mutex type (and rwlock type) that says one must not sleep long
> > term with the lock held. The type would only be of interest to the LOCKDEBUG
> > code, which would then watch out for ugly situations like the above.
> 

Sounds like a good idea to me, Andrew.

> That might work, if we:
> 1) Just count the number of such locks an lwp holds
> 2) Add a check to places that might sleep (long term) but don't usually
>    (eg kmem_alloc) as well as checking when an lwp is actually asleep

I thought the idea was for an lwp to say "I'm holding a lock of type
please-dont-hold-me-for-too-long and I've been sleeping too long"
rather than annotating the call sites of functions that could
potentially cause long sleeping?

Or was your idea to be more preemptive and basically say "you have the
potential to sleep a long time here, be safe and bomb now"?
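
To make the "bomb now" reading concrete, here is a minimal userland
sketch of points 1 and 2 above: a per-lwp count of held locks of the new
type, plus a check placed at call sites that might sleep long term. It is
only a model under my own assumptions. The names lwp_model,
nolongsleep_enter, nolongsleep_exit and may_sleep_long are hypothetical
and are not existing kernel interfaces, and the check returns false
where real LOCKDEBUG code would presumably report or panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userland model; none of these names are kernel APIs. */
struct lwp_model {
	unsigned nolongsleep_held;	/* "no long sleep" locks held */
};

/* Point 1: just count the locks of the new type that the lwp holds. */
static void
nolongsleep_enter(struct lwp_model *l)
{
	l->nolongsleep_held++;
}

static void
nolongsleep_exit(struct lwp_model *l)
{
	assert(l->nolongsleep_held > 0);	/* catch unbalanced exits */
	l->nolongsleep_held--;
}

/*
 * Point 2: called from places that might sleep long term (for
 * example a KM_SLEEP allocation), whether or not they actually
 * sleep this time. Returns true if the call is allowed.
 */
static bool
may_sleep_long(const struct lwp_model *l, const char *where)
{
	if (l->nolongsleep_held != 0) {
		fprintf(stderr,
		    "lockdebug model: %s with %u no-long-sleep lock(s) held\n",
		    where, l->nolongsleep_held);
		return false;
	}
	return true;
}
```

With this shape, the kmem_alloc() example from the original mail would
trip the check on every call, not only on the rare occasion when the
allocation really blocks.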

> 3) Error if a 'normal' lock is acquired when the count is non-zero
> 4) Mark some lwps (like the soft int ones) as having a non-zero count.
> 
> Perhaps this sort of lock should be the default!
> So we'd have some 'sleep' locks that can be held across condvar wait,
> whereas it would be illegal to wait with the adaptive/rwlock held.
> 

But shouldn't it be OK to sleep with an adaptive/rwlock held as long as
the time you sleep for is less than some heuristically chosen value?
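
A rough sketch of what I mean by that, again as a userland model and not
kernel code: note the time when the lwp goes to sleep, and on wakeup
complain only if a lock of the new type is held and the sleep exceeded a
threshold. Everything here is an assumption for illustration. The names
sleep_model, sleep_begin and sleep_end_too_long are hypothetical, and
LONG_SLEEP_TICKS is an arbitrary placeholder, since nobody in this thread
has proposed an actual value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary placeholder threshold, in ticks; not a proposed value. */
#define LONG_SLEEP_TICKS 100u

/* Hypothetical userland model; none of these names are kernel APIs. */
struct sleep_model {
	unsigned nolongsleep_held;	/* "no long sleep" locks held */
	uint64_t sleep_start;		/* tick at which the lwp slept */
};

/* Record the time at which the lwp goes to sleep. */
static void
sleep_begin(struct sleep_model *l, uint64_t now)
{
	l->sleep_start = now;
}

/*
 * On wakeup: report only if a lock of the new type is held and
 * the sleep lasted longer than the threshold.
 */
static bool
sleep_end_too_long(const struct sleep_model *l, uint64_t now)
{
	assert(now >= l->sleep_start);
	if (l->nolongsleep_held == 0)
		return false;
	return (now - l->sleep_start) > LONG_SLEEP_TICKS;
}
```

The obvious cost is that this only fires when the long sleep actually
happens, so a rarely taken slow path could go unnoticed for a long time,
whereas a check at the call site fires every time.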

