tech-kern: Re: newlock

Subject: Re: newlock
To: Jason Thorpe <thorpej@shagadelic.org>
From: Andrew Doran <ad@netbsd.org>
List: tech-kern
Date: 09/08/2006 14:45:38
Jason,

On Thu, Sep 07, 2006 at 01:23:51PM -0700, Jason Thorpe wrote:

> First of all, I am assuming that you also picked up my turnstile  
> implementation and are going to bring it along with the rest of this  
> stuff?

Yup. The turnstiles don't do priority inheritance yet, and that's something
that can be looked at later.

> I do, however, think that we should NOT get the *_high() variants.   
> Instead, we should fix the code that needs to release spin locks in  
> the wrong order.  This might mean that code delays the transition, but  
> I think it's better to force fixing it than it is to add a wart to the  
> API.  Note that this might mean defining a strict hierarchy of  
> interrupt priority levels... right now, that hierarchy is loose,  
> except for a few special cases.

I agree that it's ugly but I also think we'd be shooting ourselves in the
foot by raising the cost of getting this up and running. I'd prefer have
mutex_link()/exit_linked() available as a compromise. If we have the
resources to alter the underlying implementation to deal with this correctly
later, then at near zero cost we can stub those out and leave them there for
a while for kernel modules that still know about them.

>      54  * MUTEX_DRIVER is always used by drivers.  mutex_init()  
> converts this to
>      55  * either MUTEX_ADAPTIVE or MUTEX_SPIN depending on the  
> iblock cookie.

It's there for compatibility and equates to MUTEX_SPIN. Our model is quite
different, so it's not documented it in the manual page for that reason. I
can see how it could be useful if we were ever to do interrupts as threads
or similar, so I'll document it. Please note I'm not suggesting that!

> The documentation implies that holding an adaptive mutex across a  
> sleep is not allowed.  Actually, it is allowed, but of course leads to  
> increased contention for the mutex.  That said, some things might need  
> to hold other things off while it sleeps waiting for some condition,  
> so it should be officially allowed by the API.

In general it's something that you really want to avoid and that's what the
paragraph was intended to convey. I'll re-word it.

> I don't think we should have adaptive vs. spin-only rwlocks.  Solaris  
> only has sleeper-style rwlocks, and we ourselves only have spinlockmgr 
> () for a few special cases.

I'd like to do that, it would remove a lot of complication. Looking at it
now, the proclist lock is easy enough to deal with, but some pmaps also use
spinlockmgr() and I'm not sure how to deal with those yet.

> >Note that in the non-DIAGNOSTIC / MULTIPROCESSOR case, the spin locks
> >basically boil down to at most splraiseipl()/splx(). The adaptive  
> >locks
> >remain, however. I don't see that as a big problem:
> 
> Not only is it not a problem, it's a requirement so that LKMs can work  
> in UP -or- MP kernels :-)

Right, that was one of my goals. For LOCKDEBUG I'm tempted to argue but I
can think of a couple of ways around it.

Thanks,
Andrew