Subject: Re: lock-free data structures
To: Garrett D'Amore <garrett_damore@tadpole.com>
From: Jason Thorpe <thorpej@shagadelic.org>
List: tech-kern
Date: 01/03/2006 21:23:52

On Jan 3, 2006, at 12:34 PM, Garrett D'Amore wrote:

> In my opinion, one of the most scalable kernels around is Solaris
> (proven on 100+ CPU SMP systems), and if getting locking "right" is
> key, it would definitely benefit NetBSD if some developers examined
> the Solaris design.

Yah, I've studied it, and even implemented a clone of it on the
"newlock" branch.  I never had time to finish it, though.  (It needed
some changes to the scheduler that were tricky, time-consuming, and
not something I really had time to do...)

> Of course, Solaris uses a hardware test-and-set operation, but they
> also use something called a barrier which basically ensures a
> consistent view of memory between processors when a boundary is
> crossed.  There is also some use of atomic math operations, but less
> so.

It can be done with something other than test-and-set.  The Solaris
model leaves the low-level primitive to machine-dependent code, so you
can use whatever method works for your CPU.  You could even use
restartable atomic sequences on platforms that are both uniprocessor
and lacking a suitable atomic operation (atomicity is still necessary
on a uniprocessor, to handle preemption in the kernel).
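
To make the "whatever method works for your CPU" point concrete, here
is a rough sketch of a lock primitive built on compare-and-swap rather
than test-and-set.  This is not the Solaris or "newlock" code; the
names md_lock_t, md_lock_init, md_lock_try and md_lock_release are
made up for illustration, and C11 <stdatomic.h> stands in for the
machine-dependent layer.  The acquire/release orderings play the role
of the barrier mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 means unowned, otherwise the owner's id. */
typedef struct {
	_Atomic uintptr_t owner;
} md_lock_t;

static inline void
md_lock_init(md_lock_t *l)
{
	atomic_init(&l->owner, 0);
}

/*
 * One acquisition attempt using compare-and-swap.  Succeeds only if
 * the lock word was 0.  Acquire ordering on success keeps the critical
 * section from being reordered before the lock is taken.
 */
static inline bool
md_lock_try(md_lock_t *l, uintptr_t self)
{
	uintptr_t expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->owner,
	    &expected, self, memory_order_acquire, memory_order_relaxed);
}

/* Release ordering publishes the critical section's stores. */
static inline void
md_lock_release(md_lock_t *l)
{
	assert(atomic_load_explicit(&l->owner, memory_order_relaxed) != 0);
	atomic_store_explicit(&l->owner, 0, memory_order_release);
}
```

The machine-independent code would only ever call something shaped
like md_lock_try(); a port without a usable CAS could supply a
test-and-set or restartable-sequence version behind the same interface.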

> The main thing I think that Solaris gets right is lots and lots of
> little locks, coupled with software designed so that very rarely do
> you ever fail to get a lock (unless as a reader for a reader/writer
> lock).  The locks themselves are optimized so that the common case of
> uncontended locks is very fast.

Yes, it is very fast indeed.

> One other thing I like about Solaris is that the interrupt masking
> behavior of certain kinds of locks is hidden behind the interface --
> so callers needn't normally worry about setting the processor mask
> explicitly.  (The exception is for "high level" locks, which block
> out the timer interrupt.  Typically hi level locks are only used for
> certain hotplug devices (PCMCIA) and for serial ports.)

Yah, you can either have an "adaptive mutex" or a "spinning mutex",  
the latter also performing an implicit spl operation (based on the  
IPL value that is provided at mutex initialization).
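
Roughly, the shape of that interface could look like the sketch below.
Everything here is hypothetical and runs in user space: cur_ipl,
splraise_sim and splx_sim merely simulate the interrupt priority level,
and mtx_init_sim / mtx_enter_sim / mtx_exit_sim are invented names, not
the Solaris or NetBSD API.  The adaptive case is reduced to a plain
spin; a real adaptive mutex would put the caller to sleep when the
owner is not running.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simulated current interrupt priority level (assumption). */
static int cur_ipl = 0;

/* Raise the simulated IPL (never lowers it); returns the old level. */
static int
splraise_sim(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return old;
}

/* Restore a previously saved simulated IPL. */
static void
splx_sim(int ipl)
{
	cur_ipl = ipl;
}

enum mtx_type { MTX_ADAPTIVE, MTX_SPIN };

struct mtx {
	atomic_bool	locked;
	enum mtx_type	type;
	int		ipl;		/* IPL given at initialization */
	int		saved_ipl;	/* level to restore on exit */
};

static void
mtx_init_sim(struct mtx *m, enum mtx_type type, int ipl)
{
	atomic_init(&m->locked, false);
	m->type = type;
	m->ipl = ipl;
	m->saved_ipl = 0;
}

static void
mtx_enter_sim(struct mtx *m)
{
	int s = 0;

	/* Spin mutexes raise the IPL before touching the lock word. */
	if (m->type == MTX_SPIN)
		s = splraise_sim(m->ipl);

	/* Spin until the previous value was "unlocked". */
	while (atomic_exchange_explicit(&m->locked, true,
	    memory_order_acquire))
		continue;

	/* Only the lock holder writes saved_ipl. */
	m->saved_ipl = s;
}

static void
mtx_exit_sim(struct mtx *m)
{
	int s = m->saved_ipl;

	atomic_store_explicit(&m->locked, false, memory_order_release);

	/* Drop back to the caller's level only after releasing. */
	if (m->type == MTX_SPIN)
		splx_sim(s);
}
```

The point is that the caller never calls an spl function directly; the
IPL handed to the init routine is all the mutex needs to do it for you.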

-- thorpej