Subject: Re: irq handling patch
To: Chris Gilbert <chris@paradox.demon.co.uk>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm32
Date: 04/03/2001 10:19:09
> > Is the algorithm that David Seal devised?  If so, I think it would be
> > right to credit him with it in a comment.
> 
> Yes i think it might be, but I've no idea how to credit him, if I put 
> copyright as him then I certainly can't submit it.  My other problem was I 
> wasn't too sure it was his, a version of the ffs stuff I've got from way back 
> thinks it's his (or finding the number that can be done in 3 arm instr is 
> probably his, the algorithm is probably fairly generic)  Does anyone have an 
> email or contact details for him?

He works here.  I'll ask him.  I think fair credit in this case would be 
to add a comment before the code in question stating its origin -- 
something along the lines of

  /* This algorithm for FFS was devised by D. Seal and posted to
   * comp.sys.arm on 16 Feb 1994. 
   */
 
I've just found the original posting in my copious archives ;-)  I've 
appended the relevant part of it below for the curious.

R.

From: dseal@armltd.co.uk (David Seal)
Subject: Re: The next ARM-coders Challange
Date: 16 Feb 94 12:13:12 GMT
Organization: A.R.M. Ltd, Swaffham Bulbeck, Cambs, UK

In article <32918@armltd.uucp> I wrote:

>  ; Operand in R0, register R1 is free, R2 addresses a byte table 'Table'
>  ; (defined below)
>
>    RSB     R1,R0,#0            ;Standard trick to isolate bottom bit in R1,
>    AND     R1,R1,R0            ; or produce zero in R1 if R0 = 0.
>
>    ORR     R0,R1,R1,LSL #7     ;If R1=X with 0 or 1 bits set, R0 = X * &81
>    ORR     R0,R0,R0,LSL #14    ;R0 = X * &204081
>    ORR     R1,R1,R1,LSL #8     ;R1 = X * &101
>    ORR     R1,R1,R1,LSL #16    ;R1 = X * &1010101
>    ORR     R1,R1,R0,LSL #7     ;So R1 is now original R1 * &11214181
>
>    LDRB    R1,[R2,R1,LSR #24]  ;Look up table entry indexed by top 8 bits
>                                ; of R1
>
>  ; Result in R1
>
>'Table' is a 193-byte table with some entries defined as follows; all 
other
>entries don't matter. ...

I produced a better variant of the same theme last night:

  ; Operand in R0, register R1 is free, R2 addresses a byte table 'Table'

    RSB     R1,R0,#0            ;Standard trick to isolate bottom bit in 
R1,
    AND     R1,R1,R0            ; or produce zero in R1 if R0 = 0.

    ORR     R1,R1,R1,LSL #4     ;If R1=X with 0 or 1 bits set, R0 = X * &11
    ORR     R1,R1,R1,LSL #6     ;R0 = X * &451
    RSB     R1,R1,R1,LSL #16    ;R1 = X * &0450FBAF

    LDRB    R1,[R2,R1,LSR #26]  ;Look up table entry indexed by top 6 bits
                                ; of R1

  ; Result in R1

Timing: 6 S-cycles, 1 N-cycle, 1 I-cycle. The overhead of addressing the
byte table should be put outside the critical loop. Alternatively, put the
table close enough to allow it to be addressed with a single ADR
instruction, increasing the cycle count by 1 S-cycle.

Again, the key to this is the fact that the top 6 bits of X * &0450FBAF are
different for each of X = 0, 1, 2, 4, 8, ..., 2^31. So all you need to do 
is
fill in the corresponding entries of the 64 byte table 'Table' with the
correct result values. In fact, you can get away with a 63 byte table, 
since
the index &3F is never used.