Subject: Re: Request for testers.
To: None <cg110@york.ac.uk>
From: None <olly@muscat.co.uk>
List: port-arm32
Date: 04/21/1997 14:10:48
In article <Pine.SGI.3.95L.970417182500.25801B-100000@tower.york.ac.uk>,
	Chris Gilbert wrote:
>ffs.S		find first set bit, fixed time routine used.
>		This routine does not loop, and uses conditional
>		execution.  Faster in cases where the first set bit
>		is the 5th or higher place.

Here's a version of ffs David Seal posted to comp.sys.acorn.tech (csa.tech
and armltd.uucp eh?  Ah, those were the days) sometime in Feb 1994:

In article <32975@armltd.uucp> dseal@armltd.co.uk (David Seal) writes:
>I produced a better variant of the same theme last night:
>
>  ; Operand in R0, register R1 is free, R2 addresses a byte table 'Table'
>
>    RSB     R1,R0,#0            ;Standard trick to isolate bottom bit in R1,
>    AND     R1,R1,R0            ; or produce zero in R1 if R0 = 0.
>
>    ORR     R1,R1,R1,LSL #4     ;If R1=X with 0 or 1 bits set, R0 = X * &11
>    ORR     R1,R1,R1,LSL #6     ;R0 = X * &451
>    RSB     R1,R1,R1,LSL #16    ;R1 = X * &0450FBAF
>
>    LDRB    R1,[R2,R1,LSR #26]  ;Look up table entry indexed by top 6 bits
>                                ; of R1
>
>  ; Result in R1

This needs a 64 byte table, which I don't seem to have a copy of.  It
shouldn't be very hard to generate though (just run enough values through).
IIRC there may be an unused entry in the table, so simply trying values
until it's full might take quite a while.

Including setting up R2, this is 6 data ops + one load and doesn't need to
save any registers.

Let me know if you want me to tidy this up into a working source file.

>If there are any routines people have a particular desire to have faster
>version of let me know.  When finished with the string funs, and libkern,
>I'm going to start on the in_cksum routine.

The obvious candidate is memcpy/memove, but I wanted to get some stats on
what usage it gets before putting lots of work into it.  However, I've yet
to get a kernel source tree on my Risc PC.  Hopefully sometime this week...
If someone feels keen and wants to generate me some, what I wanted was the
relative frequencies of each ( src & 3, dest & 3, length ) triplet.

Cheers,
Olly
-- 
Usenet is not for answering questions but for questioning answers.