Subject: Re: New hash algorithm - FNV - for use in the kernel
To: None <email@example.com, firstname.lastname@example.org>
From: Simon Burge <email@example.com>
Date: 11/28/2001 15:44:26
Jason R Thorpe wrote:
> [ ... ] If the Perl hash function gives
> distribution that is as good (or maybe even better?) as FNV for
> a given application, then obviously the Perl hash routine is the
> better choice.
Running a couple of the hashes mentioned here of two different sorts
of data revail some interesting results. The tests use 65536 buckets
(hashfunction % 64k) and just count the number of used buckets and the
maximum count of each bucket.
The hash functions are:
the current "add all the bytes together" function.
the hash function Lennart Augustsson posted to these lists.
the crc hash function from
the perl hash function Tom Spindler posted to these lists.
First is 92378 file names, all starting with "/home/simonb/src":
Second is 32MB of output from /dev/urandom, hashed as 32byte
From these two test cases, both FNV and perl look to be best. Given
6 "1"s in the FNV multiplacation constant and just 2 "1"s in the perl
multiplacation constant, I'd say the perl hash function is looking good.
It was interesting to note the "lennart" hash function didn't fare too
well for binary data, and the "crc" hash function didn't do too well
over string data.
Obviously more testing should be done first though...
Simon Burge <firstname.lastname@example.org>
NetBSD CDs, Support and Service: http://www.wasabisystems.com/