Subject: Re: New hash algorithm - FNV - for use in the kernel
To: Darren Reed <darrenr@reed.wattle.id.au>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 11/27/2001 20:20:42
On Wed, Nov 28, 2001 at 02:50:32PM +1100, Darren Reed wrote:

 > Luke, to take the devil's advocate appproach, it would seem that only
 > old cpu's don't have hardware multiply (I've no idea about new cut down
 > CPU's like ARM/MIPS and I'm sure someone will correct me if I've over-
 > generalized) and while we shouldn't penalise old architectures, atj

Multiply is fairly expensive on MIPS processors (several insns to set up,
and several cycles for it to complete), even modern ones.  There are other
modern processors, like the SuperH, which also do not have any hardware
multiply.

Here's assembler output for both the fnv_32_buf hash and the Perl string
hash, generated on a MIPS (-mips1 vs -mips3 didn't really make any
difference in the code):

        .text
        .align  2
        .globl  fnv32
        .type    fnv32,@function
        .ent    fnv32
fnv32:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, extra= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .cpload $25
        .set    reorder
        li      $6,33554432                     # 0x2000000
        ori     $6,$6,0x23
        .set    noreorder
        .set    nomacro
        beq     $5,$0,$L25
        addu    $7,$5,-1
        .set    macro
        .set    reorder

        li      $5,-1                   # 0xffffffff
$L24:   
        sll     $2,$6,15
        addu    $2,$2,$6
        sll     $2,$2,2
        subu    $2,$2,$6
        sll     $2,$2,3
        addu    $2,$2,$6
        sll     $2,$2,2  
        addu    $2,$2,$6
        sll     $2,$2,2
        subu    $6,$2,$6
        lbu     $3,0($4)
        addu    $4,$4,1
        addu    $7,$7,-1
        .set    noreorder
        .set    nomacro
        bne     $7,$5,$L24
        xor     $6,$6,$3
        .set    macro
        .set    reorder

$L25:
        .set    noreorder
        .set    nomacro
        j       $31
        move    $2,$6
        .set    macro
        .set    reorder

        .end    fnv32



        .align  2
        .globl  perlhash
        .type    perlhash,@function
        .ent    perlhash
perlhash:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, extra= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .cpload $25
        .set    reorder
        move    $6,$0
        .set    noreorder
        .set    nomacro
        beq     $5,$0,$L29
        addu    $7,$5,-1 
        .set    macro  
        .set    reorder

        li      $5,-1                   # 0xffffffff
$L30:   
        lbu     $3,0($4)
        addu    $4,$4,1
        addu    $7,$7,-1 
        sll     $2,$6,5
        addu    $2,$2,$6
        .set    noreorder
        .set    nomacro
        bne     $7,$5,$L30
        addu    $6,$2,$3
        .set    macro
        .set    reorder

$L29:
        srl     $2,$6,5
        .set    noreorder
        .set    nomacro
        j       $31
        addu    $2,$6,$2
        .set    macro
        .set    reorder

        .end    perlhash

So, as we can see, the multiply in each function was optimized to
shift/add (since they're both muliply-by-constant, it was pretty
easy for the compiler to do this).  However, the inner loop of
the Perl string hash routine is much smaller (the multiply is
by a much smaller number).  If the Perl hash function gives
distribution that is as good (or maybe even better?) as FNV for
a given application, then obviously the Perl hash routine is the
better choice.

...and who's to say it might not be a good idea to have multiple
hash functions around?  If one does better on certain types of
data (strings vs. opaque blobs of binary data, for example), then
we should use the best one for the application.

...but let's please try to NOT include hash functions that do
table lookups (especially when the table is too large or used
too infrequently to fit nicely in the L1) .. it'd be a real
shame to lose the benefit of a better hash function by polluting
your data cache.

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>