Subject: Re: New hash algorithm - FNV - for use in the kernel
To: David Laight <David.Laight@btinternet.com>
From: Luke Mewburn <lukem@wasabisystems.com>
List: tech-kern
Date: 11/27/2001 22:17:13
On Tue, Nov 27, 2001 at 09:21:36AM -0000, David Laight wrote:
> > 1) vmstat -h nfsnodehash:
> > total used util num average maximum
> > hash table buckets buckets % items chain chain
> > ----------------------------------------------------------------------
> >
> > with existing nfs_hash():
> > nfsnodehash 65536 1000 1.53 34688 34 105
>
> The 1000 used buckets looks too much like a non-random number!
> Anyone know whether this is true?
I've run a bunch of other tests. It's just a random sample. Here's
the non FNV results from another machine on the same data set:
nfsnodehash 8192 770 9.40 4077 5 17
Take a look at the existing implementation of nfs_hash(); it does not
appear to me that there's a deliberate part of the code that's trying
to specifically limit the hash distribution to around 1000. The code
currently just sums up all the bytes in the filehandle and returns
that as the hash value.
Also, I'm interested in some real-world tests of proposed improved
implementations. It's all well and good for people to fall into
the trap of making statements like ``oh, if you make this [non
trivial] change to the algorithm it will have this benefit'', but
unless an implementation is provided to be actually tested and
proven, the statements are generally all academic.
(Note; I'm not singling you out here; this is a general personal
observation on many such discussions that occur about proposed
enhancements to NetBSD).