Subject: Re: newlock
To: Charles M. Hannum <mycroft@MIT.EDU>
From: David Laight <david@l8s.co.uk>
List: tech-kern
Date: 09/02/2006 19:18:21
On Sat, Sep 02, 2006 at 11:56:01AM -0400, Charles M. Hannum wrote:
> On Sat, Sep 02, 2006 at 04:22:15PM +0100, Andrew Doran wrote:
> > For an example of memory
> > usage, on i386 a mutex is 4 bytes and a RW lock is 16 bytes.
> 
> But you have got to stop putting mutexes in other data structures
> like this.  It causes terrible cache behavior.  (Do I actually need
> to explain this?)

yes...

If the mutex is in the data, then acquiring the mutex brings (some of)
the locked data into the local cache - one vote in favour.

So any issues must be when the lock is 'contended'.

If we contend for the lock by spinning on a locked memory cycle (an atomic
read-modify-write) then we f*ck the memory bus whenever the lock contends...

If we loop on a normal memory read then normally the cache line will stay
valid in the caches of both (all) of the cpus.
A cache snoop will only happen when the cpu holding the mutex writes to
the cache line.  Since this ought to be a 'read' snoop, it should only
(really) slow down the cpu that is trying to acquire the lock.
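
A rough sketch of the two ways of spinning, using C11 atomics.  The names
(spin_lock, spin_lock_try, ...) are made up for illustration and are not
the newlock code; treat it as one possible shape, not the implementation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock: 0 = free, 1 = held. */
struct spin_lock {
	atomic_int locked;
};

static void
spin_lock_init(struct spin_lock *l)
{
	atomic_init(&l->locked, 0);
}

/* One locked cycle: an atomic exchange needs the cache line exclusively. */
static bool
spin_lock_try(struct spin_lock *l)
{
	return atomic_exchange_explicit(&l->locked, 1,
	    memory_order_acquire) == 0;
}

static void
spin_lock_acquire(struct spin_lock *l)
{
	for (;;) {
		if (spin_lock_try(l))
			return;
		/*
		 * Contended: spin on an ordinary read.  The line can stay
		 * shared in our cache until the holder writes to it.
		 */
		while (atomic_load_explicit(&l->locked,
		    memory_order_relaxed) != 0)
			continue;
	}
}

static void
spin_lock_release(struct spin_lock *l)
{
	/* Diagnostic check only: releasing a free lock is a bug. */
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Calling spin_lock_try() in a tight loop would be the 'locked cycle' case;
spin_lock_acquire() only retries the exchange once a plain read has seen
the lock free.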

OTOH if the 'lock' is a pointer to another cache line, then it is a wasted
cache line fetch - which may displace some useful data.
Additionally, if two locks that aren't often contended end up in the same
cache line, that line itself becomes a memory hot-spot.
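
To make the layouts concrete, here is a small sketch.  The 64 byte line
size, the softc_* structures and the same_line() helper are all assumptions
for illustration (the 4 byte lock matches the i386 mutex size quoted above),
and offsets are taken relative to a line-aligned structure:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE	64	/* assumed line size, varies by cpu */

/* Stand-in for a 4 byte mutex. */
struct lock {
	uint32_t word;
};
static_assert(sizeof(struct lock) == 4, "lock is expected to be 4 bytes");

/* Lock embedded in the data: taking it pulls in sc_flags and sc_count. */
struct softc_embedded {
	struct lock sc_lock;
	uint32_t sc_flags;
	uint32_t sc_count;
};

/* Lock elsewhere: taking it costs a fetch of some other line. */
struct softc_indirect {
	struct lock *sc_lock;
	uint32_t sc_flags;
	uint32_t sc_count;
};

/* Two unrelated locks allocated next to each other. */
struct lock_pair {
	struct lock a;
	struct lock b;
};

/* Do two byte offsets from a line-aligned base fall in the same line? */
static bool
same_line(size_t off_a, size_t off_b)
{
	return off_a / CACHE_LINE == off_b / CACHE_LINE;
}
```

With these definitions, sc_lock and sc_count of softc_embedded land in one
line, and so do the two locks of lock_pair, which is the hot-spot case.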

Of course, if the 'lock' does point elsewhere, it keeps the DEBUG and
DIAGNOSTIC code from changing the layout of the driver structures.

	David

-- 
David Laight: david@l8s.co.uk