Subject: Re: Restartable Atomic Sequences
To: Gregory McGarry <g.mcgarry@ieee.org>
From: Simon Burge <simonb@wasabisystems.com>
List: tech-kern
Date: 08/27/2002 12:42:18
On Thu, Jul 04, 2002 at 01:06:35PM +1200, Gregory McGarry wrote:
> Since I don't have an R4000+ cpu to compare the performance, I tested
> on a DX4 (i386). I compared three approaches:
>
> 1) kernel acquire and release system calls using sysarch(2)
>
> 2) RAS using explicit registration with a system call using sysarch(2)
>
> 3) __cpu_simple_lock() from /usr/include/machine/lock.h. This
> function is inlined.
>
> Here are the run times (in seconds) for 1E6 iterations of locking and
> unlocking a lock using the three approaches:
>
> 1) 21.570237
> 2) 1.046555
> 3) 0.946119
>
> Nothing really surprising here I suppose. Don't use system calls.
> Actually, I did trim an instruction off the lock code which improves
> the RAS result here. You can see the improvement below.
Here are results for a few MIPS CPUs:
  bcm1250 (mips64 500MHz)
    syscall lock:       time= 7.174320
    RAS lock:           time= 0.220124
    test-and-set lock:  time= 0.450196

  au1000 (mips32 396MHz)
    syscall lock:       time=12.292677
    RAS lock:           time= 0.227705
    test-and-set lock:  time= 0.531112

  r4400 (mips3 120MHz)
    syscall lock:       time=48.880966
    RAS lock:           time= 1.048473
    test-and-set lock:  time= 1.090512
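
For anyone wanting to reproduce the test-and-set figures, a timing loop of this general shape should do. This is only a sketch, not the harness used for the numbers above: tas_lock(), tas_unlock() and time_tas_lock() are made-up names, and a C11 atomic_flag stands in for the machine-dependent __cpu_simple_lock().

```c
#define _POSIX_C_SOURCE 199309L
#include <stdatomic.h>
#include <time.h>

/* Assumed stand-in for __cpu_simple_lock(): spin on a C11 atomic flag. */
static inline void tas_lock(atomic_flag *l)
{
    /* test-and-set returns the previous value; loop while it was set */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;
}

static inline void tas_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Time `iters` uncontended lock/unlock pairs; returns elapsed seconds. */
static double time_tas_lock(long iters)
{
    atomic_flag lock = ATOMIC_FLAG_INIT;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        tas_lock(&lock);
        tas_unlock(&lock);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) +
        (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_tas_lock(1000000) gives a figure comparable in kind to the "test-and-set lock" lines, since the quoted test used 1E6 iterations. The syscall and RAS variants would need the kernel support discussed in this thread and are not shown.
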
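So RAS is faster in all cases: about twice as fast as test-and-set on
the bcm1250 and au1000, but only about 4% faster on the r4400. I'm
thinking of something like: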
    if (sysctl(hw.ncpus) == 1) {
        RAS lock
    } else {
        if (sysctl(machdep.llsc)) {
            test-and-set lock
        } else {
            /* Eek! SMP without ll/sc */
            syscall lock
        }
    }
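
In C the selection above might come out roughly as follows. This is a sketch only: the enum and function names are invented for illustration, and the two inputs are assumed to have been fetched already from the sysctl nodes named in the pseudocode (machdep.llsc is a proposed name, not something that is known to exist).

```c
#include <stdbool.h>

/* Hypothetical tags for the three lock implementations compared above. */
enum lock_impl {
    LOCK_RAS,           /* restartable atomic sequence */
    LOCK_TEST_AND_SET,  /* ll/sc based test-and-set */
    LOCK_SYSCALL        /* kernel acquire/release system calls */
};

/*
 * ncpu:     CPU count (the hw.ncpus query in the pseudocode).
 * has_llsc: true if the CPU has ll/sc (the proposed machdep.llsc query).
 */
static enum lock_impl choose_lock_impl(int ncpu, bool has_llsc)
{
    if (ncpu == 1)
        return LOCK_RAS;          /* uniprocessor: RAS was fastest */
    if (has_llsc)
        return LOCK_TEST_AND_SET; /* SMP with ll/sc */
    return LOCK_SYSCALL;          /* Eek! SMP without ll/sc */
}
```

The idea would be to make this choice once at startup and cache the result, so the sysctl lookups stay out of the lock path.
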
Simon.
--
Simon Burge <simonb@wasabisystems.com>
NetBSD Development, Support and Service: http://www.wasabisystems.com/