tech-kern: Re: newlock

Subject: Re: newlock
To: Andrew Doran <ad@NetBSD.org>
From: Garrett D'Amore <garrett_damore@tadpole.com>
List: tech-kern
Date: 09/08/2006 07:32:51
Andrew Doran wrote:
> Jason,
>
> On Thu, Sep 07, 2006 at 01:23:51PM -0700, Jason Thorpe wrote:
>
>   
>> First of all, I am assuming that you also picked up my turnstile  
>> implementation and are going to bring it along with the rest of this  
>> stuff?
>>     
>
> Yup. The turnstiles don't do priority inheritance yet, and that's something
> that can be looked at later.
>   

That should be easy to fix, though.  :-)

>   
>> I do, however, think that we should NOT get the *_high() variants.   
>> Instead, we should fix the code that needs to release spin locks in  
>> the wrong order.  This might mean that code delays the transition, but  
>> I think it's better to force fixing it than it is to add a wart to the  
>> API.  Note that this might mean defining a strict hierarchy of  
>> interrupt priority levels... right now, that hierarchy is loose,  
>> except for a few special cases.
>>     
>
> I agree that it's ugly but I also think we'd be shooting ourselves in the
> foot by raising the cost of getting this up and running. I'd prefer have
> mutex_link()/exit_linked() available as a compromise. If we have the
> resources to alter the underlying implementation to deal with this correctly
> later, then at near zero cost we can stub those out and leave them there for
> a while for kernel modules that still know about them.
>   

Can they be stubbed?   If so, then I agree.   But I'd hate to carry
around baggage from a temporary API that we knew was going to temporary
at start.

OTOH, my first gut instinct is that if a framework needs these warts,
then the framework should be fixed/improved before it is committed.

>   
>>      54  * MUTEX_DRIVER is always used by drivers.  mutex_init()  
>> converts this to
>>      55  * either MUTEX_ADAPTIVE or MUTEX_SPIN depending on the  
>> iblock cookie.
>>     
>
> It's there for compatibility and equates to MUTEX_SPIN. Our model is quite
> different, so it's not documented it in the manual page for that reason. I
> can see how it could be useful if we were ever to do interrupts as threads
> or similar, so I'll document it. Please note I'm not suggesting that!
>   
Please make it have a different run-time value.  It can be treated as
MUTEX_SPIN, but by having a different value we don't have to break an
ABI later.  (This will be one less thing to fix when we move to a real LKM.)

>   
>> The documentation implies that holding an adaptive mutex across a  
>> sleep is not allowed.  Actually, it is allowed, but of course leads to  
>> increased contention for the mutex.  That said, some things might need  
>> to hold other things off while it sleeps waiting for some condition,  
>> so it should be officially allowed by the API.
>>     
>
> In general it's something that you really want to avoid and that's what the
> paragraph was intended to convey. I'll re-word it.
>   

Certainly while these locks are spin locks, that is true.  If we have
real adaptive mutexes and priority inheritance, then it is much less of
an issue.  Of course, its _always_ a good idea to reduce lock contention
in your design.

>   
>> I don't think we should have adaptive vs. spin-only rwlocks.  Solaris  
>> only has sleeper-style rwlocks, and we ourselves only have spinlockmgr 
>> () for a few special cases.
>>     
>
> I'd like to do that, it would remove a lot of complication. Looking at it
> now, the proclist lock is easy enough to deal with, but some pmaps also use
> spinlockmgr() and I'm not sure how to deal with those yet.
>   

I agree wholeheartedly here.  rwlocks are an optimization, and they
shouldn't be used in spin form.  There is another problem with rwlocks
in Solaris, which is that they don't support priority inheritance.  As a
result, I generally try not use them in my designs unless the
application really calls for a lot of readers and almost never a writer.


-- 
Garrett D'Amore, Principal Software Engineer
Tadpole Computer / Computing Technologies Division,
General Dynamics C4 Systems
http://www.tadpolecomputer.com/
Phone: 951 325-2134  Fax: 951 325-2191