tech-perform: Re: New hash algorithm - FNV

Subject: Re: New hash algorithm - FNV - for use in the kernel
To: None <thorpej@wasabisystems.com>
From: Andrea Cocito <acocito@lar.ieo.it>
List: tech-perform
Date: 11/28/2001 12:10:15

Jason wrote:
>by a much smaller number).  If the Perl hash function gives
>distribution that is as good (or maybe even better?) as FNV for
>a given application, then obviously the Perl hash routine is the
>better choice.

The following I wrote a life ago for the undernet's irc daemon
has been extensively tested and gives *very* good distribution of
hashes, it is not disturbed by the fact that the hashtable size
is a power of two (as long as is not multiple of HASHSHIFT).

static HASHREGS hash_nem(register char *n)
{
  register HASHREGS hash = 0;
  register unsigned char ch;
  while(ch = *n++)
        {
                hash+=hash_weight(ch);
                hash*=(long)HASHSHIFT;
                hash%=(long)HASHSIZE;
        }
  return hash;
}

Note that HASHSHIFT must be a prime bigger than a byte and the
only requirement about HASHSIZE is that it must not be multiple
of HASHSHIFT. hash_weight() was used only because the hash of
strings differing only for letter case has to remain the same
here.

Thus once set HASHSHIFT to 257 and HASHSIZE to the size of a word
I would give a try to something like:

unsigned long strhash(register unsigned char *n)
{
  register unsigned char ch;
  register unsigned long hash = 0;

  while (ch = *n++)
  {
    hash += ch;
    hash += (hash << 8);
  }

  return hash;
}

Two additions and a bit shift...

Regards,

Andrea

-- 
Andrea Cocito
Biocomputing research unit director
Experimental oncology department
Europen Institute of Oncology
Via Ripamonti 435, Milano - Italy
Tel. +39-02-57489857
Fax. +39-02-57489851