Subject: Re: more Q&D results [hash for use in the kernel]
To: ozan s. yigit <oz@zonzorp.canada.sun.com>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: tech-perform
Date: 11/28/2001 11:26:01
On Wed, Nov 28, 2001 at 01:55:30PM -0500, ozan s. yigit wrote:

 > i added the "perl" hash function that was posted here; it is a variant
 > of the bernstein's k=33 hash, and the multiplication can be eliminated.

Indeed, GCC does eliminate the multiplication in the targets I checked.

 >                             unused  coll          avg     st     lookup
 > function 	              buckets  free    Rn   chain   dev    time
 > corbett/byacc               13.83%  14.02%  1.334  2.3   1.396   0.097
 > perl                        13.22%  14.16%  1.322  2.3   1.372   0.132
 > fowler/noll/vo              14.20%  14.21%  1.347  2.3   1.424   0.159

So, given that the perl hash function seem at least as good as fnv,
and is compiled into much shorter code, I would say that it's an
obvious candidate for a string_hash() and binary_hash() in the kernel.

While the byacc hash is very fast, it comes with a caveat; it needs to
know the hash table size in the inner loop, so if you have a table that
is dynamically-sized, it's not a good candidate (it uses the table size
as a modulo in the inner loop .. although, assuming a power-of-two table,
this can be reduced to a bitmask ... and is in our byacc sources).

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>