Subject: Re: bcopy
To: None <port-alpha@NetBSD.ORG>
From: Chris G Demetriou <Chris_G_Demetriou@BALVENIE.PDL.CS.CMU.EDU>
List: port-alpha
Date: 08/12/1995 20:55:36
> This one performs within 5 percent of the
> OSF1 libc bcopy. I tested it very thoroughly for correctness.
I didn't bother testing for correctness (if it's fast enough, it
doesn't matter if it works right, right? 8-), but i did do some
performance comparisons, and i thought i'd share them with everybody
on the list.
Each test was 10240 non-overlapping 8k bcopy()s, with varying
alignments, and the tests on each type of CPU were done with identical
binaries.
The OSF/1 libc bcopy was tested under OSF/1, and the NetBSD bcopy()s
(and Trevor's) were tested under NetBSD.
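
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing loop along the lines described above. It is not the harness that produced the numbers in this message. The names copy_under_test(), copy_fn and time_copies() are made up for illustration, memmove() is only a stand-in for whichever bcopy() is being measured, and it assumes malloc() returns memory aligned to at least 8 bytes.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

enum { NCOPIES = 10240, COPYLEN = 8192, MAXOFF = 8 };

/* Stand-in for the routine under test (hypothetical name); it takes
 * arguments in bcopy()'s (src, dst, len) order. */
static void copy_under_test(const void *src, void *dst, size_t len)
{
    memmove(dst, src, len);
}

/* Calling through a volatile pointer keeps the compiler from collapsing
 * the NCOPIES identical copies into one. */
static void (*volatile copy_fn)(const void *, void *, size_t) = copy_under_test;

/*
 * Time NCOPIES copies of COPYLEN bytes, with the source and destination
 * offset by srcoff and dstoff bytes from the start of their buffers.
 * Returns CPU seconds, or -1.0 on bad offsets, allocation failure,
 * or a copy that did not reproduce the source.
 */
static double time_copies(size_t srcoff, size_t dstoff)
{
    if (srcoff >= MAXOFF || dstoff >= MAXOFF)
        return -1.0;

    /* Separate buffers, so source and destination never overlap. */
    unsigned char *src = malloc(COPYLEN + MAXOFF);
    unsigned char *dst = malloc(COPYLEN + MAXOFF);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, COPYLEN + MAXOFF);
    memset(dst, 0, COPYLEN + MAXOFF);

    clock_t start = clock();
    for (int i = 0; i < NCOPIES; i++)
        copy_fn(src + srcoff, dst + dstoff, COPYLEN);
    clock_t end = clock();

    int ok = memcmp(src + srcoff, dst + dstoff, COPYLEN) == 0;
    free(src);
    free(dst);
    return ok ? (double)(end - start) / CLOCKS_PER_SEC : -1.0;
}
```

Calling time_copies(0, 0) gives a "same alignment" case and time_copies(1, 3) a "different alignment" one. clock() reports CPU time at fairly coarse resolution, so several runs per offset pair are worth doing before trusting any difference.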
With a 233MHz 21064A processor:
                          src & dst         src & dst
                          same alignment    different alignment
  OSF/1 libc bcopy        ~0.219s           ~0.346s
  NetBSD kernel bcopy     ~5.503s *1        ~5.503s *1
  NetBSD libc 'C' bcopy   ~0.561s           ~7.34s
  Trevor's asm bcopy      ~0.223s           ~0.350s
With a 266MHz 21164 processor:
                          src & dst         src & dst
                          same alignment    different alignment
  OSF/1 libc bcopy        ~0.125s           ~0.285s
  NetBSD kernel bcopy     *2                *2
  NetBSD libc 'C' bcopy   ~0.46s            ~8.5s *3
  Trevor's asm bcopy      ~0.125s           ~0.266s
*1 -- very stable; i never saw more than 0.005s difference between
_any_ of the runs with this code
*2 -- weird. very weird. in general:
Time per 10240 copies ~= 6.4s + (srcoff + (7 - dstoff)) * 0.19s
*3 -- highly variable; some cases were repeatably as much as a second
more or less than this, and the machine was otherwise idle.
consistently longer than the 21064A, though!
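
To put numbers on the *2 formula, here is a small sketch that just evaluates it. The function name is made up, and it assumes srcoff and dstoff are byte offsets in the range 0..7, which the formula suggests but which is not spelled out above.

```c
/*
 * Evaluate the empirical fit from footnote *2 for the NetBSD kernel
 * bcopy on the 21164: seconds per 10240 copies of 8k.
 * Assumes srcoff and dstoff are byte offsets in 0..7 (an assumption);
 * returns -1.0 for anything outside that range.
 */
static double kernel_bcopy_21164_estimate(unsigned srcoff, unsigned dstoff)
{
    if (srcoff > 7u || dstoff > 7u)
        return -1.0;
    return 6.4 + (double)(srcoff + (7u - dstoff)) * 0.19;
}
```

Under that assumption the best case is srcoff = 0, dstoff = 7, which gives 6.4s, and the worst case is srcoff = 7, dstoff = 0, which gives about 9.06s. Every case is slower than the ~5.503s the same code takes on the 21064A.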
it's interesting to note that:
(1) the NetBSD libc 'C' and NetBSD kernel bcopy()s, in this set
of tests, seem to tickle the 21164 in a less-than-friendly
way. I don't understand why this is.
(2) Trevor's bcopy() appears to do _better_ on the 21164 than
the OSF/1 bcopy -- but not by much. 8-)
My hat's off to Trevor, for a job well done. Now if only somebody
would do the same thing for the division and modulus routines...
8-)
chris