Subject: Re: bzero
To: None <port-powerpc@netbsd.org>
From: Martin J. Laubach <mjl@emsi.priv.at>
List: port-powerpc
Date: 11/04/2001 04:29:59
  So I've been entertaining myself by writing several bzero
implementations in assembler. Before I place this into libc
and libkern, could you please give it a good beating? I have
placed it for download at

  http://rhubarb.emsi.priv.at/download/bzero.tar.gz

  Please unpack the archive, then run ./compile and send me
the results. On my G4, the output looks like this:

	celery:198 [/tmp] % ./compile
	Compiling
	Running regression tests
	ok algorithm 0 (Original C)
	ok algorithm 1 (Simple byte)
	ok algorithm 2 (Simple word)
	ok algorithm 3 (Cache block)
	ok algorithm 4 (Cache block 2)
	Running speed tests
	Running algorithm 0 (Original C): run time: 13979 msec
	Running algorithm 1 (Simple byte): run time: 5687 msec
	Running algorithm 2 (Simple word): run time: 1491 msec
	Running algorithm 3 (Cache block): run time: 832 msec
	Running algorithm 4 (Cache block 2): run time: 822 msec
	Running algorithm 5 (libc): run time: 852 msec

	[Note: I'm obviously already running a faster libc]

  As you can see, the difference is substantial -- but I'm not
sure it will be that way on all PPC variants.
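
  For anyone who wants a feel for what a "Simple word" style
variant does without unpacking the archive, here is a rough C
sketch. It is an illustration only, not the assembler from
bzero.tar.gz: the names word_bzero and word_bzero_selftest are
made up for this example, and it assumes 32-bit words and a
buffer that may legally be written as words (true for malloc'd
memory).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; NOT the routine shipped in bzero.tar.gz. */
void word_bzero(void *dst, size_t len)
{
    unsigned char *p = dst;

    /* Clear leading bytes until p sits on a 32-bit boundary. */
    while (len > 0 && ((uintptr_t)p & (sizeof(uint32_t) - 1)) != 0) {
        *p++ = 0;
        len--;
    }

    /* Clear one aligned 32-bit word per iteration.
     * Assumes the buffer may be accessed as words. */
    uint32_t *w = (uint32_t *)p;
    while (len >= sizeof(uint32_t)) {
        *w++ = 0;
        len -= sizeof(uint32_t);
    }

    /* Clear whatever is left, byte by byte. */
    p = (unsigned char *)w;
    while (len-- > 0)
        *p++ = 0;
}

/* Small regression check in the spirit of the "ok algorithm N"
 * lines: every offset 0..3 and length 0..64, with guard bytes on
 * both sides. Returns 0 on success, 1 on mismatch, -1 if malloc
 * fails. */
int word_bzero_selftest(void)
{
    enum { PAD = 8, MAX = 64 };
    unsigned char *buf = malloc(PAD + MAX + PAD + 4);

    if (buf == NULL)
        return -1;

    for (size_t off = 0; off < 4; off++) {
        for (size_t len = 0; len <= MAX; len++) {
            size_t total = PAD + off + len + PAD;

            memset(buf, 0xAA, total);
            word_bzero(buf + PAD + off, len);

            for (size_t i = 0; i < total; i++) {
                int inside = i >= PAD + off && i < PAD + off + len;
                if (buf[i] != (inside ? 0x00 : 0xAA)) {
                    free(buf);
                    return 1;
                }
            }
        }
    }
    free(buf);
    return 0;
}
```

  Calling word_bzero_selftest() from a small main() and checking
for a zero return is enough to exercise it. The "Cache block"
variants go further than this sketch and are not reproduced here.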

  Also, PPC-savvy people are invited to throw bricks at me
over the assembler style; feel free.

	mjl