Subject: Re: Some kernel profiling on a 5000/133 with -current
To: port-pmax <port-pmax@netbsd.org>
From: Ken Wellsch <kwellsch@tampabay.rr.com>
List: port-pmax
Date: 08/31/2001 09:13:43
I was curious about the calling profile of memcpy(), i.e. I wanted
to better understand what is calling it, not how it does its job.

So rather than struggle with something fancy that builds a
linked-list count data structure of length and alignment data,
I cheated and used function-call signatures and a profiling
kernel.

Since I've learned that empty function calls have negligible
overhead, this seemed like a quick and easy approach.  B^)

So I chose which signature function to call based on src/dst
alignment.  Since I'm using the portable C version of memcpy(),
I did not bother distinguishing src from dst alignment; I doubt
it matters, but who knows.
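
A minimal sketch of the alignment test described above, in C.  The
helper name both_word_aligned() and the use of sizeof(long) as the
word size are assumptions for illustration, not the code actually
used in the kernel:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero when both pointers sit on a
 * word boundary.  OR-ing the two addresses means a single mask test
 * covers src and dst together, so the two are not distinguished.
 * The word size is assumed to be sizeof(long).
 */
static int both_word_aligned(const void *dst, const void *src)
{
    uintptr_t bits = (uintptr_t)dst | (uintptr_t)src;

    return (bits & (sizeof(long) - 1)) == 0;
}
```

A copy counts as "aligned" only when this returns nonzero; anything
else lands in the "unaligned" group.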

I then called an appropriately named function that would tell me
how many times I needed to divide "length" by two to make it
zero.  Thus each range below includes the lower power of two but
excludes the higher power of two at the top end of the range.

(i.e. I kept dividing "length" by two until it became zero B^)
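
The halving trick can be sketched in C roughly as follows.  The
function names halvings_to_zero() and bucket_low() are hypothetical,
and folding lengths 0 and 1 into one bucket is an assumption made to
match the [0 - 2) row in the summary:

```c
#include <stddef.h>

/* Count how many times len must be halved before it reaches zero. */
static unsigned halvings_to_zero(size_t len)
{
    unsigned n = 0;

    while (len != 0) {
        len >>= 1;      /* divide by two, dropping the remainder */
        n++;
    }
    return n;
}

/*
 * Hypothetical helper: lower bound of the range that len falls in.
 * Lengths 0 and 1 share the first bucket (assumed); otherwise the
 * bound is the largest power of two not exceeding len.  The upper
 * bound of the range is the next power of two and is excluded.
 */
static size_t bucket_low(size_t len)
{
    unsigned n = halvings_to_zero(len);

    return n <= 1 ? 0 : (size_t)1 << (n - 1);
}
```

For example, a length of 12 takes four halvings (12, 6, 3, 1, 0), so
it is counted in the [8 - 16) range.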

The profiling sample was taken last night, running bonnie six
times in sequence with a file size of 500MB each.  Each run
of bonnie took a tad over 52 minutes of real time.

Here is the distribution summary, for what it's worth (how many
times was memcpy() called with a given aligned/unaligned src/dst
and a given range/size for the "length" of data to copy):

      6007         aligned  [0 - 2)
         0         aligned  [2 - 4)
   1842263  20.7%  aligned  [4 - 8)
   4301667  48.4%  aligned  [8 - 16)
    411176   4.6%  aligned  [16 - 32)
       165         aligned  [32 - 64)
       202         aligned  [64 - 128)
        53         aligned  [128 - 256)
        23         aligned  [256 - 512)
         0         aligned  [512 - 1024)
         7         aligned  [1024 - 2048)
       579         aligned  [2048 - 4096)
         8         aligned  [4096 - 8192)
   2334155  26.2%  aligned  [8192 - 16384)

   8896305  95.6%  aligned memcpy

     18017   4.4%  unaligned  [0 - 2)
         0         unaligned  [2 - 4)
    305123  73.8%  unaligned  [4 - 8)
     90311  21.8%  unaligned  [8 - 16)
        16         unaligned  [16 - 32)
         5         unaligned  [32 - 64)
         0         unaligned  [64 - 128)
         0         unaligned  [128 - 256)
         4         unaligned  [256 - 512)

    413476   4.4%  unaligned memcpy

   9309781         memcpy

Cheers,

-- Ken