Subject: Re: bzero
To: None <port-powerpc@netbsd.org>
From: Martin J. Laubach <mjl@emsi.priv.at>
List: port-powerpc
Date: 11/04/2001 04:29:59
So I've been entertaining myself by writing several bzero
implementations in assembler. Before I place one of them into
libc and libkern, could you please give them a good beating?
I have placed them for download at
http://rhubarb.emsi.priv.at/download/bzero.tar.gz
Please unpack the archive, then run ./compile and tell me
the results. On my G4, the output looks like this:
celery:198 [/tmp] % ./compile
Compiling
Running regression tests
ok algorithm 0 (Original C)
ok algorithm 1 (Simple byte)
ok algorithm 2 (Simple word)
ok algorithm 3 (Cache block)
ok algorithm 4 (Cache block 2)
Running speed tests
Running algorithm 0 (Original C): run time: 13979 msec
Running algorithm 1 (Simple byte): run time: 5687 msec
Running algorithm 2 (Simple word): run time: 1491 msec
Running algorithm 3 (Cache block): run time: 832 msec
Running algorithm 4 (Cache block 2): run time: 822 msec
Running algorithm 5 (libc): run time: 852 msec
[Note: I'm obviously already running a faster libc]
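As you can see, the difference is substantial (roughly a factor
of 17 between the original C version and the fastest cache-block
variant on this machine), but I'm not sure it will be that way
on all PPC variants.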
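For anyone reading along without the archive, here is a rough C
sketch of the idea behind the "Simple byte" and "Simple word"
variants. It is only an illustration: the versions in the tarball
are written in assembler, and the names bzero_byte and bzero_word
are made up for this example. The cache-block variants are left out
because they depend on the processor (presumably via an instruction
such as dcbz, though that is a guess).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: clears one byte per loop iteration. */
static void bzero_byte(void *dst, size_t len)
{
    unsigned char *p = dst;

    while (len--)
        *p++ = 0;
}

/*
 * Hypothetical name: clears one 32-bit word per loop iteration.
 * Leading bytes are cleared singly until the pointer is 4-byte
 * aligned, and leftover tail bytes are cleared singly as well.
 * The word-sized stores are libc-style code; a strict compiler
 * may want -fno-strict-aliasing for them.
 */
static void bzero_word(void *dst, size_t len)
{
    unsigned char *p = dst;
    uint32_t *w;

    /* Head: byte stores until p sits on a 4-byte boundary. */
    while (len > 0 && ((uintptr_t)p & 3) != 0) {
        *p++ = 0;
        len--;
    }

    /* Body: aligned 32-bit stores. */
    w = (uint32_t *)p;
    while (len >= 4) {
        *w++ = 0;
        len -= 4;
    }

    /* Tail: at most three remaining bytes. */
    p = (unsigned char *)w;
    while (len--)
        *p++ = 0;
}
```

The word variant does a quarter of the stores of the byte variant
on large buffers, which is the kind of gap the speed test is meant
to measure; how it maps to real numbers depends on the CPU.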
Also, PPC-savvy people are invited to throw bricks at me over
the assembler style; feel free.
mjl