Subject: Re: copyin/out
To: Ben Harris <bjh21@netbsd.org>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm
Date: 08/13/2002 10:13:16
> On Sun, 11 Aug 2002, Ben Harris wrote:
> 
> > I've just run lmbench with the old and new (including my LDM/STM changes)
> > code, on each of my three CPUs.  I seem to be having problems with the
> > ctxsw benchmark, so the results aren't complete, but they're still
> > interesting.  "old" is the current code; "new" is Allen's with my hacks.:
> 
> Now the same thing again, but with the right timing overhead for
> ARM610.old (I used "make rerun" inappropriately).
> 

OK, so I've blanked out the 'good' numbers and left only those where there is a degradation:

> Processor, Processes - times in microseconds - smaller is better
> ----------------------------------------------------------------
> Host                 OS  Mhz null null      open selct sig  sig  fork exec sh
>                              call  I/O stat clos TCP   inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
> ARM610.ol   NetBSD 1.6E   30                965.
> ARM610.ne   NetBSD 1.6E   30                981.
> ARM710a.o   NetBSD 1.6E   40                695.
> ARM710a.n   NetBSD 1.6E   40                724.
> SA-110.ol   NetBSD 1.6E  233                32.5
> SA-110.ne   NetBSD 1.6E  233                57.9
> 
> *Local* Communication latencies in microseconds - smaller is better
> -------------------------------------------------------------------
> Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
>                         ctxsw       UNIX         UDP         TCP conn
> --------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
> ARM610.ol   NetBSD 1.6E       
> ARM610.ne   NetBSD 1.6E       
> ARM710a.o   NetBSD 1.6E       
> ARM710a.n   NetBSD 1.6E       
> SA-110.ol   NetBSD 1.6E                                          4205
> SA-110.ne   NetBSD 1.6E                                          8461
> 
> File & VM system latencies in microseconds - smaller is better
> --------------------------------------------------------------
> Host                 OS   0K File      10K File      Mmap    Prot    Page
>                         Create Delete Create Delete  Latency Fault   Fault
> --------- ------------- ------ ------ ------ ------  ------- -----   -----
> ARM610.ol   NetBSD 1.6E                                       27.6   14.5K
> ARM610.ne   NetBSD 1.6E                                       67.6   15.6K
> ARM710a.o   NetBSD 1.6E                                       32.8   14.7K
> ARM710a.n   NetBSD 1.6E                                       47.6   15.0K
> SA-110.ol   NetBSD 1.6E                                       29.4   14.7K
> SA-110.ne   NetBSD 1.6E                                       43.6   15.2K
> 
> *Local* Communication bandwidths in MB/s - bigger is better
> -----------------------------------------------------------
> Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
>                              UNIX      reread reread (libc) (hand) read write
> --------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
> ARM610.ol   NetBSD 1.6E 
> ARM610.ne   NetBSD 1.6E 
> ARM710a.o   NetBSD 1.6E 
> ARM710a.n   NetBSD 1.6E 
> SA-110.ol   NetBSD 1.6E                         39.4
> SA-110.ne   NetBSD 1.6E                         39.0
> 

Now it would be interesting to know why those numbers have degraded with this change, particularly the protection fault timings.  Those, and the TCP conn timings on the SA-110, are significantly worse.  I suspect the "mmap reread" difference (39.4 vs 39.0 MB/s) may just be a statistical glitch, but you never know...
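
To put rough sizes on those regressions, here is a minimal sketch that computes the relative change for the figures singled out above. The numbers are copied from the quoted tables; the helper name pct_change and the dictionary layout are only illustrative, not part of lmbench.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# (old, new) pairs taken from the quoted lmbench output.
prot_fault_us = {
    "ARM610": (27.6, 67.6),
    "ARM710a": (32.8, 47.6),
    "SA-110": (29.4, 43.6),
}
tcp_conn_us = {"SA-110": (4205.0, 8461.0)}
mmap_reread_mb_s = {"SA-110": (39.4, 39.0)}

for label, table in [
    ("Prot fault (us, smaller is better)", prot_fault_us),
    ("TCP conn (us, smaller is better)", tcp_conn_us),
    ("Mmap reread (MB/s, bigger is better)", mmap_reread_mb_s),
]:
    for cpu, (old, new) in table.items():
        print(f"{label:38s} {cpu:8s} {old:8.1f} -> {new:8.1f}  {pct_change(old, new):+7.1f}%")
```

On these figures the protection fault latency rises by roughly 45% to 145% depending on the CPU and the SA-110 TCP connection latency roughly doubles, while the mmap reread bandwidth moves by only about 1%.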

R.