Subject: lock performance on i386
To: None <email@example.com>
From: David Laight <firstname.lastname@example.org>
Date: 09/19/2002 19:30:05
The spin lock code is currently:
static __inline void
__cpu_simple_lock(__cpu_simple_lock_t *alp)
{
	int __val = __SIMPLELOCK_LOCKED;

	do {
		__asm __volatile("xchgl %0, %2"
			: "=r" (__val)
			: "0" (__val), "m" (*alp));
	} while (__val != __SIMPLELOCK_UNLOCKED);
}
This means that if the lock is contended, the cpu sits in a loop
doing continuous, expensive, locked operations on the bus -
which, IIRC, force a cache snoop cycle on all the cpus.
It would be much more efficient to use something like the following
(in asm, but not __asm...), assuming %eax holds __SIMPLELOCK_LOCKED
and %edx points at the lock:

1:	xchgl	(%edx),%eax		# one locked cycle to try the lock
	cmpl	$__SIMPLELOCK_UNLOCKED,%eax
	je	3f			# got it
2:	pause				# for P4
	cmpl	(%edx),%eax		# just read until lock available
	je	2b			# still locked, keep spinning
	jmp	1b			# looks free, try the xchg again
3:
(as a real function, rather than inlined, you'd want the 'ret' at
the end...)
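The same test-and-test-and-set idea can be sketched in portable C
atomics (a sketch only - the type and names here are illustrative,
not NetBSD's, and C11 <stdatomic.h> postdates this mail):

```c
#include <stdatomic.h>

typedef atomic_int simple_lock_t;	/* illustrative, not NetBSD's type */
#define SL_UNLOCKED	0
#define SL_LOCKED	1

static void
simple_lock(simple_lock_t *alp)
{
	for (;;) {
		/* One locked bus operation to try to take the lock. */
		if (atomic_exchange_explicit(alp, SL_LOCKED,
		    memory_order_acquire) == SL_UNLOCKED)
			return;
		/* Contended: spin on plain reads, which are satisfied
		 * from the local cache and generate no locked cycles. */
		while (atomic_load_explicit(alp,
		    memory_order_relaxed) != SL_UNLOCKED)
			;	/* the x86 'pause' hint would go here */
	}
}

static void
simple_unlock(simple_lock_t *alp)
{
	atomic_store_explicit(alp, SL_UNLOCKED, memory_order_release);
}
```

The point is the same as in the asm: only the xchg (here,
atomic_exchange) is a locked operation; the inner wait loop is
ordinary loads.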
Also, under debug, the lower loop could be made a function call -
that function could do some checks for lock contention, e.g. report
a lock that spins for too long.
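A sketch of that debug idea in C (the hook name and the spin limit
are made up for illustration):

```c
#include <stdatomic.h>
#include <stdio.h>

#define SPIN_LIMIT	1000000UL	/* arbitrary "too long" threshold */

/* Hypothetical debug hook: report a lock that has spun too long. */
static void
lock_spun_too_long(void *alp, unsigned long spins)
{
	fprintf(stderr, "lock %p: %lu spins\n", alp, spins);
}

static unsigned long
simple_lock_debug(atomic_int *alp)
{
	unsigned long spins = 0;

	while (atomic_exchange_explicit(alp, 1, memory_order_acquire) != 0) {
		/* The "lower loop": plain reads plus a contention check. */
		while (atomic_load_explicit(alp, memory_order_relaxed) != 0) {
			if (++spins == SPIN_LIMIT)
				lock_spun_too_long(alp, spins);
		}
	}
	return spins;	/* for the caller's statistics */
}
```

Since the check lives in the contended path only, the uncontended
lock stays as cheap as before.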
David Laight: email@example.com