Subject: Re: kern/2572: kernel could use memcpy/memmove/memset to its own advantage
To: None <jtc@slave.cygnus.com>
From: John M Vinopal <banshee@gabriella.abattoir.com>
List: port-i386
Date: 08/23/1996 14:24:25
I took the advice of this bug report and compiled a kernel with bcopy et al
#defined in /sys/sys/systm.h.  I then added a module with wrappers for
memcpy et al that simply call the equivalent bXXX function.  To compare,
I compiled a kernel normally and compiled another using pgcc, the Pentium
optimizing C compiler based on gcc 2.7.2.  All were compiled at -O2 using
my own savory blend of kernel options, including DIAGNOSTIC and DUMMY_NOPS.
My hardware is an ASUS T2P4, P133, 60ns RAM, 512k burst cache.
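
For readers unfamiliar with the setup, here is a minimal user-space sketch
of what memcpy-style wrappers over bXXX routines might look like.  It is
not the module used for these tests: every sketch_* name is hypothetical,
and sketch_bcopy/sketch_bzero are simple byte-loop stand-ins for the
kernel's real bcopy/bzero.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for bcopy(src, dst, len).  Note the BSD argument order
 * (source first) and that overlapping regions must be handled. */
void sketch_bcopy(const void *src, void *dst, size_t len)
{
    const unsigned char *s = src;
    unsigned char *d = dst;

    if ((uintptr_t)d < (uintptr_t)s) {
        while (len--)
            *d++ = *s++;            /* forward copy */
    } else if ((uintptr_t)d > (uintptr_t)s) {
        s += len;
        d += len;
        while (len--)
            *--d = *--s;            /* backward copy for overlap */
    }
}

/* Stand-in for bzero(dst, len). */
void sketch_bzero(void *dst, size_t len)
{
    unsigned char *d = dst;

    while (len--)
        *d++ = 0;
}

/* memcpy-style wrapper: swap the arguments and return dst. */
void *sketch_memcpy(void *dst, const void *src, size_t len)
{
    sketch_bcopy(src, dst, len);
    return dst;
}

/* memmove-style wrapper: the bcopy stand-in already handles overlap. */
void *sketch_memmove(void *dst, const void *src, size_t len)
{
    sketch_bcopy(src, dst, len);
    return dst;
}

/* memset-style wrapper: bzero only covers the fill-with-zero case,
 * so any other fill byte falls back to a plain loop. */
void *sketch_memset(void *dst, int c, size_t len)
{
    unsigned char *d = dst;

    if (c == 0) {
        sketch_bzero(dst, len);
    } else {
        while (len--)
            *d++ = (unsigned char)c;
    }
    return dst;
}
```

The points that are easy to get wrong are the swapped (src, dst) order
and the return value: the memXXX functions return the destination
pointer, while the bXXX functions return nothing.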

pgcc produced slightly smaller code (about 12k smaller, per the file sizes
listed below).  The memcpy kernel, excluding the memcpy->bcopy module, was
only very slightly smaller.

I then ran lmbench a few times with each kernel on a totally dead silent
system: no ether, no processes like named or routed.

The results of lmbench indicate that there is little improvement to be
had using these techniques.  A kernel compiled with pgcc using memcpy
would certainly be smaller than a normal kernel but would not run 
substantially faster.  Switching to memcpy over bcopy is not likely to
produce results commensurate with the editing effort.

------

cc / memcpy	711493 Aug 23 02:57 ../GABRIELLA-MEM/netbsd*
pgcc		699144 Aug 23 02:27 ../GABRIELLA-PGCC/netbsd*
cc		711432 Aug 23 02:22 ../GABRIELLA/netbsd*


                L M B E N C H  1 . 0   S U M M A R Y
                ------------------------------------

                  Comparison to best of the breed
                  -------------------------------

		(Best numbers are starred, i.e., *123)


        Processor, Processes - factor slower than the best
        --------------------------------------------------
Host                 OS  Mhz    Null    Null  Simple /bin/sh Mmap 2-proc 8-proc
                             Syscall Process Process Process  lat  ctxsw  ctxsw
--------- ------------- ---- ------- ------- ------- ------- ---- ------ ------
netbsd-a  NetBSD 1.2_BE  133     1.2     1.0  *12.3K  *24.7K  1.0    1.1    *46
netbsd-a. NetBSD 1.2_BE  133     1.2     1.0     1.0     1.0  1.0    1.1    1.1
netbsd-me NetBSD 1.2_BE  133     1.2     1.0     1.0     1.0  1.0    1.0    1.1
netbsd-me NetBSD 1.2_BE  133      *5     1.0     1.0     1.0 *129    1.1    1.1
netbsd-me NetBSD 1.2_BE  133      *5     1.0     1.0     1.0 *129    1.1    1.0
netbsd-pg NetBSD 1.2_BE  133     1.2   *2.7K     1.0     1.0  1.0    *30    1.0
netbsd-pg NetBSD 1.2_BE  133      *5     1.0     1.0     1.0  1.0    1.0    1.0
netbsd-pg NetBSD 1.2_BE  133     1.2     1.0     1.0     1.0  1.0    *30    1.0

        *Local* Communication latencies - factor slower than the best
        -------------------------------------------------------------
Host                 OS  Pipe       UDP    RPC/     TCP    RPC/
                                            UDP             TCP
--------- ------------- ------- ------- ------- ------- -------
netbsd-a  NetBSD 1.2_BE     1.0     1.0     1.0     1.1     1.0
netbsd-a. NetBSD 1.2_BE     1.1    *228     1.1     1.1     1.1
netbsd-me NetBSD 1.2_BE    *115     1.0     1.0     1.0     1.0
netbsd-me NetBSD 1.2_BE     1.0     1.0     1.0     1.0     1.0
netbsd-me NetBSD 1.2_BE     1.0     1.0     1.1     1.1     1.1
netbsd-pg NetBSD 1.2_BE     1.1     1.0    *367    *300    *488
netbsd-pg NetBSD 1.2_BE     1.0     1.0     1.0     1.1     1.0
netbsd-pg NetBSD 1.2_BE     1.0     1.0     1.1     1.1     1.0

        *Local* Communication bandwidths - percentage of the best
        ---------------------------------------------------------
Host                 OS Pipe  TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                                  reread reread (libc) (hand) read write
--------- ------------- ---- ---- ------ ------ ------ ------ ---- -----
netbsd-a  NetBSD 1.2_BE  99%  99%    99%    99%    98%    98%  99%   *84
netbsd-a. NetBSD 1.2_BE  99%  98%     *3    98%    98%    99%  99%   *84
netbsd-me NetBSD 1.2_BE  96%  99%    98%    98%    99%    98%  *75   99%
netbsd-me NetBSD 1.2_BE  98%  98%     *3    98%    99%    *37  99%   *84
netbsd-me NetBSD 1.2_BE  *19  97%    99%    97%    *40    97%  99%   99%
netbsd-pg NetBSD 1.2_BE  99%  98%    98%    *53    99%    98%  99%   99%
netbsd-pg NetBSD 1.2_BE  99%  99%    99%    98%    99%    99%  99%   99%
netbsd-pg NetBSD 1.2_BE  *19  *11    99%    *53    99%    99%  99%   99%

            Memory latencies in nanoseconds - factor slower than the best
		    (WARNING - may not be correct, check graphs)
            -------------------------------------------------------------
Host                 OS   Mhz  L1 $   L2 $    Main mem    Guesses
--------- -------------   ---  ----   ----    --------    -------
netbsd-a  NetBSD 1.2_BE   133    *7    *81        *181
netbsd-a. NetBSD 1.2_BE   133    *7    1.2        *181
netbsd-me NetBSD 1.2_BE   133    *7    1.2        *181
netbsd-me NetBSD 1.2_BE   133    *7    1.1        *181
netbsd-me NetBSD 1.2_BE   133    *7    *81        *181
netbsd-pg NetBSD 1.2_BE   133    *7    1.2        *181
netbsd-pg NetBSD 1.2_BE   133    *7    1.1        *181
netbsd-pg NetBSD 1.2_BE   133    *7    1.1        *181