Subject: Re: lock-free data structures
To: Garrett D'Amore <garrett_damore@tadpole.com>
From: Jason Thorpe <thorpej@shagadelic.org>
List: tech-kern
Date: 01/03/2006 21:23:52
On Jan 3, 2006, at 12:34 PM, Garrett D'Amore wrote:
> In my opinion, one of the most scalable kernels around is Solaris
> (proven on 100+ CPU SMP systems), and if getting locking "right" is
> key, it would definitely benefit NetBSD if some developers examined
> the Solaris design.
Yah, I've studied it, and even implemented a clone of it on the
"newlock" branch. Never had time to finish it, though. (Needed some
changes to the scheduler which were tricky, time consuming, and not
something I really had time to do...)
> Of course, Solaris uses a hardware test-and-set operation, but they
> also use something called a barrier which basically ensures a
> consistent view of memory between processors when a boundary is
> crossed. There is also some use of atomic math operations, but less
> so.
It can be done with something other than test-and-set. The Solaris
model leaves the low-level primitive to machine-dependent code, so you
can use whatever method works for your CPU. You could even use
restartable atomic sequences on platforms that are both uniprocessor
and lacking a suitable atomic operation (some such mechanism is
necessary even on uniprocessor to handle preemption in the kernel).
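As a rough sketch of that split (every name here is made up for illustration; this is not the Solaris code or the "newlock" branch), the machine-independent mutex only ever calls three machine-dependent hooks. This version backs the hooks with C11 atomic_flag, which is a test-and-set, but a port could supply the same three hooks using compare-and-swap, load-locked/store-conditional, or a restartable atomic sequence without the upper layer changing:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Machine-dependent layer (hypothetical names). Only these three
 * functions know which hardware primitive is in use.
 */
typedef struct {
	atomic_flag held;
} md_lock_t;

static void
md_lock_init(md_lock_t *l)
{
	atomic_flag_clear(&l->held);
}

/* Returns true if the lock was free and now belongs to the caller. */
static bool
md_lock_try(md_lock_t *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire);
}

static void
md_lock_release(md_lock_t *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Machine-independent layer (also hypothetical). It never touches the
 * hardware primitive directly, only the md_lock_*() hooks above.
 */
typedef struct {
	md_lock_t md;
} simple_mutex_t;

void
simple_mutex_init(simple_mutex_t *m)
{
	assert(m != NULL);
	md_lock_init(&m->md);
}

bool
simple_mutex_tryenter(simple_mutex_t *m)
{
	assert(m != NULL);
	return md_lock_try(&m->md);
}

void
simple_mutex_enter(simple_mutex_t *m)
{
	assert(m != NULL);
	while (!md_lock_try(&m->md))
		continue;	/* spin until the holder releases */
}

void
simple_mutex_exit(simple_mutex_t *m)
{
	assert(m != NULL);
	md_lock_release(&m->md);
}
```

The acquire/release orderings on the flag stand in for the memory barriers mentioned above; a real kernel would pick whatever barrier instructions its CPU needs.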
> The main thing I think that Solaris gets right is lots and lots of
> little locks, coupled with software designed so that very rarely do
> you ever fail to get a lock (unless as a reader for a reader/writer
> lock). The locks themselves are optimized so that the common case of
> uncontended locks is very fast.
Yes, it is very fast indeed.
> One other thing I like about Solaris is that the interrupt masking
> behavior of certain kinds of locks is hidden behind the interface --
> so callers needn't normally worry about setting the processor mask
> explicitly. (The exception is for "high level" locks, which block
> out the timer interrupt. Typically high level locks are only used
> for certain hotplug devices (PCMCIA) and for serial ports.)
Yah, you can either have an "adaptive mutex" or a "spinning mutex",
the latter also performing an implicit spl operation (based on the
IPL value that is provided at mutex initialization).
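To make the spin-mutex case concrete, here is a small user-space sketch. It assumes hypothetical names throughout (mtx_init, MTX_SPIN, the IPL_* values, and a plain global standing in for the CPU's interrupt priority level); it is not the actual Solaris or NetBSD interface. The point it shows is that the IPL is recorded once at initialization, so callers of enter/exit never do their own spl calls:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical priority levels; real values are machine-dependent. */
enum { IPL_NONE = 0, IPL_BIO = 3, IPL_HIGH = 7 };
enum mtx_type { MTX_ADAPTIVE, MTX_SPIN };

/* Stand-in for the CPU's current interrupt priority level. */
int cur_ipl = IPL_NONE;

/* Raise the simulated IPL (never lower it); return the old level. */
static int
splraise_sim(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return old;
}

/* Restore a previously saved simulated IPL. */
static void
splx_sim(int ipl)
{
	cur_ipl = ipl;
}

typedef struct {
	atomic_flag   held;
	enum mtx_type type;
	int           ipl;		/* fixed at init time */
	int           saved_ipl;	/* level to restore on exit */
} kmutex_sketch_t;

void
mtx_init(kmutex_sketch_t *m, enum mtx_type type, int ipl)
{
	atomic_flag_clear(&m->held);
	m->type = type;
	m->ipl = ipl;
	m->saved_ipl = IPL_NONE;
}

void
mtx_enter(kmutex_sketch_t *m)
{
	int s = IPL_NONE;

	/* Spin mutex: the implicit spl happens here, not in the caller. */
	if (m->type == MTX_SPIN)
		s = splraise_sim(m->ipl);

	/*
	 * A real adaptive mutex would spin only while the owner is
	 * running and otherwise put the caller to sleep. This sketch
	 * has no scheduler, so both types simply spin.
	 */
	while (atomic_flag_test_and_set_explicit(&m->held,
	    memory_order_acquire))
		continue;

	m->saved_ipl = s;	/* safe: we hold the mutex now */
}

void
mtx_exit(kmutex_sketch_t *m)
{
	/* Read the saved level before giving the mutex away. */
	int s = m->saved_ipl;

	atomic_flag_clear_explicit(&m->held, memory_order_release);
	if (m->type == MTX_SPIN)
		splx_sim(s);
}
```

With this shape, entering a mutex initialized as mtx_init(&m, MTX_SPIN, IPL_HIGH) raises the simulated level to IPL_HIGH and mtx_exit() puts it back, while an adaptive mutex leaves the level alone.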
-- thorpej