Subject: Request for testers.
To: Risc BSD mailing list <port-arm32@NetBSD.ORG>
From: Chris Gilbert <cg110@york.ac.uk>
List: port-arm32
Date: 04/17/1997 18:36:45
Hi,

After some recent work I've replaced some of the libkern routines with asm
versions.  Thanks to Olly Betts for providing some of these.  There are
still some to go in, but what I really need is for people to tell me whether
the kernel is stable with these changes.  These functions include new
string functions, which I'm planning to use for the libc string routines.

I've compiled a kernel using the source from last Saturday's sys tar
file and the VOYAGER config file (the only thing missing is the Cumana
SCSI support; I forgot to get the .o file).

If anyone's got the time to check it, the kernel is available at:
http://www.plan9.cs.york.ac.uk/~cg110/voyager-kern.gz

I'll send the files on to Mark sometime soon to add to the source tree.

If you have any problems or comments, let me know.

I hope there will be some speed-up in booting (although I can't be
sure).

The alterations and new files are:-

Files tested, and known to work:-
div.S		Mark did this file originally, not altered in any way.

New files that are working and correct:-
__main.S	An empty procedure, but it came to about 5 lines when
		compiled.
ffs.S		Find first set bit; a fixed-time routine.
		This routine does not loop, and uses conditional
		execution.  Faster in cases where the first set bit
		is in the 5th place or higher.
htonl.S		Should be faster and smaller than the compiled version.
htons.S		Should be faster than the compiled version.
imax.S		All min/max routines now take 3 lines of asm, rather than
		the 6 lines of the compiled versions.
imin.S
lmax.S
lmin.S
max.S
min.S
ntohl.S		as htonl.S
ntohs.S		as htons.S
ulmax.S
ulmin.S
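
To illustrate what "fixed time" means for ffs, here is a rough sketch in C.
This is not the code in ffs.S (that is ARM asm using conditional
execution); it is a common loop-free technique written in C for
comparison, and the names ffs_loop, ffs_fixed and ffs_selfcheck are made
up for this example.  It assumes a 32-bit word.  The loop version finishes
quickly when the set bit is near the bottom of the word, which fits the
note above that the fixed-time routine only wins from about the 5th place
upwards.

```c
#include <assert.h>
#include <stdint.h>

/* Loop version for comparison: time grows with the position of the bit. */
int ffs_loop(uint32_t v)
{
    int pos;

    if (v == 0)
        return 0;
    for (pos = 1; (v & 1u) == 0; pos++)
        v >>= 1;
    return pos;
}

/* Loop-free version: the same number of steps whatever the input. */
int ffs_fixed(uint32_t v)
{
    uint32_t lsb = v & (0u - v);    /* keep only the lowest set bit */
    int pos = 1;                    /* result is 1-based, as in ffs(3) */

    /* Each mask test contributes one binary digit of the bit index. */
    pos += (lsb & 0xFFFF0000u) ? 16 : 0;
    pos += (lsb & 0xFF00FF00u) ? 8 : 0;
    pos += (lsb & 0xF0F0F0F0u) ? 4 : 0;
    pos += (lsb & 0xCCCCCCCCu) ? 2 : 0;
    pos += (lsb & 0xAAAAAAAAu) ? 1 : 0;

    return v ? pos : 0;             /* ffs(0) is defined as 0 */
}

/* Small self-check: both versions must agree on every bit position. */
void ffs_selfcheck(void)
{
    uint32_t bit = 1;
    int i;

    assert(ffs_fixed(0) == 0);
    for (i = 0; i < 32; i++, bit <<= 1) {
        assert(ffs_fixed(bit) == i + 1);
        assert(ffs_fixed(bit | 0x80000000u) == ffs_loop(bit | 0x80000000u));
    }
}
```

On ARM each of the five mask tests could in principle become a test plus
a conditionally executed add, with no branches, but whether ffs.S is
organised that way is an assumption on my part.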

Files that are in and seem to work ok:-

bzero.S		Faster routine, fills in blocks of 64, 16, 4, or 1 byte at
		a time, may be expanded to do 256 bytes, or bigger...
skpc.S		Took the compiled version and altered it for speed; fewer
		lines.
strcmp.S
strlen.S
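
The block sizes used by bzero can be sketched in C roughly as follows.
This only shows the 64/16/4/1 tiering described above; it is not the
code in bzero.S.  The names bzero_sketch, zero_chunk and bzero_selfcheck
are made up for this example, and the sketch uses plain byte stores to
stay portable, whereas an asm version would presumably store whole words
(an assumption, not something taken from the file).

```c
#include <assert.h>
#include <stddef.h>

/* Zero n bytes at p and return the address just past them.
 * n is a constant at each call site, so a compiler can unroll it. */
static unsigned char *zero_chunk(unsigned char *p, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        p[i] = 0;
    return p + n;
}

/* Clear len bytes at dst, largest block size first. */
void bzero_sketch(void *dst, size_t len)
{
    unsigned char *p = dst;

    while (len >= 64) { p = zero_chunk(p, 64); len -= 64; }
    while (len >= 16) { p = zero_chunk(p, 16); len -= 16; }
    while (len >= 4)  { p = zero_chunk(p, 4);  len -= 4;  }
    while (len >= 1)  { *p++ = 0;              len -= 1;  }
}

/* Self-check: 150 = 2*64 + 16 + 4 + 2, so every tier is exercised. */
void bzero_selfcheck(void)
{
    unsigned char buf[200];
    size_t i;

    for (i = 0; i < sizeof buf; i++)
        buf[i] = 0xAA;
    bzero_sketch(buf + 3, 150);

    assert(buf[2] == 0xAA);             /* byte before the range untouched */
    for (i = 3; i < 153; i++)
        assert(buf[i] == 0);
    assert(buf[153] == 0xAA);           /* byte after the range untouched */
}
```

Adding a 256-byte tier, as mentioned above, would just mean one more
loop in front of the 64-byte one.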

I have more string routines (thanks Olly) but haven't had time to thoroughly
test them.  All the above routines have been tested and appear to be ok.

I aim to eventually add these routines into libc.  However there should be
some speed up already.  I've done my best to test all these routines. 

Any feedback on whether the routines are faster would be gratefully received.
I'm only able to test them on my SA and it's hard to tell.

I will eventually (next week some time) release these files for people to
add to their own kernels, and for Mark to add to the source tree.

If there are any routines people particularly want a faster version of,
let me know.  When I've finished with the string functions and libkern,
I'm going to start on the in_cksum routine.

Chris Gilbert