Subject: Re: more Q&D results [hash for use in the kernel]
To: Simon Burge <simonb@wasabisystems.com>
From: Andrea Cocito <acocito@lar.ieo.it>
List: tech-kern
Date: 11/29/2001 18:24:44
Simon Burge said:
>I've done some more playing, and generated some filehandles (look at
>getfh(2) for a handy syscall). Here's the outputs, sorted by time per
>hash:
Would you post a link to the source for this test and the data for the test
suite you used too ?
This is what I get with the previous code adding my function
(A PowerPC G4 @466 Mhz, compiled with -O2):
[nemesi:~/Desktop/hashtest] blackye% cat filenames /usr/share/dict/words | ./hash s 32769
total items: 353407
hash dumb used 16990 max 625 time 0.291sec
hash fnv used 32769 max 28 time 0.409sec
hash lennart used 32768 max 28 time 0.385sec
hash crc used 32769 max 35 time 0.394sec
hash perl used 32084 max 27 time 0.346sec
hash mouse used 32720 max 79 time 0.423sec
hash nemesi used 32768 max 27 time 0.343sec
("filenames" the output of "find /" on my machine)
Function "nemesi" is what I posted previously:
unsigned int
nemhash(unsigned char *str, size_t len)
{
        unsigned int hash;

        hash = 0;
        while (len-- > 0) {
                hash += *str++;
                hash += (hash << 8);
        }
        return (hash);
}
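In case it helps to compare notes, here is a minimal sketch of the kind
of driver I have in mind around it (purely illustrative: the reading
loop, buffer size and bucket counting here are mine, not necessarily
what your harness does):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative driver: read one name per line from stdin, hash it with
 * nemhash() above, and count how many names land in each bucket.
 */
int
main(int argc, char **argv)
{
        unsigned int hashsize, i, max;
        unsigned long items = 0;
        unsigned int *buckets;
        char line[1024];

        hashsize = (argc > 1) ? (unsigned int)atoi(argv[1]) : 32769;
        buckets = calloc(hashsize, sizeof(*buckets));
        if (buckets == NULL)
                return (1);

        while (fgets(line, sizeof(line), stdin) != NULL) {
                size_t len = strlen(line);

                if (len > 0 && line[len - 1] == '\n')
                        line[--len] = '\0';
                buckets[nemhash((unsigned char *)line, len) % hashsize]++;
                items++;
        }

        max = 0;
        for (i = 0; i < hashsize; i++)
                if (buckets[i] > max)
                        max = buckets[i];
        printf("total items: %lu  max chain: %u\n", items, max);
        free(buckets);
        return (0);
}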
I would also note that what matters is not the maximum length of the
worst chain, but the sum of the squares of the chain lengths.
That is, I think it is more correct to evaluate how well a hash
function spreads its input with something like:
        long i, loops, perfect, tmp;
        double score;

        /* actual cost: sum of the squares of the chain lengths */
        loops = 0;
        for (i = 0; i < hashsize; i++)
                loops += (buckets[i] * buckets[i]);

        /* ideal cost: every bucket holds either n or n + 1 items */
        tmp = items % hashsize;
        perfect = (items / hashsize) * (items / hashsize) * (hashsize - tmp)
            + (1 + (items / hashsize)) * (1 + (items / hashsize)) * tmp;

        score = ((double)perfect) / ((double)loops);
        printf("hash %-10s score %e time %ld.%03ldsec\n",
            name, score, (long)t3.tv_sec, (long)t3.tv_usec / 1000);
Here the score is the ratio between the lookup cost you would get with
a "perfect" hash function, i.e. one that stores exactly either "n" or
"n+1" elements in each bucket, and the cost of looking up once each
element you actually hashed with the function under test.
(You won't get a score of 1 even with an ideal random generator;
the bigger the score, the better the function.)
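To make the formula concrete with some made-up numbers: with
items = 10 and hashsize = 4 you get items / hashsize = 2 and tmp = 2,
so perfect = 2*2*(4-2) + 3*3*2 = 8 + 18 = 26. If the real buckets end
up holding 4, 3, 2 and 1 items, loops = 16 + 9 + 4 + 1 = 30 and the
score is 26 / 30 = 0.87 or so.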
With this scoring, this is what I get (this time compiled with -O3
and run as root to have more data...):
[nemesi:~/Desktop/hashtest] blackye% cat filenames /usr/share/dict/words | ./hash s 32769
total items: 353407
hash dumb score 5.318160e-02 time 0.289sec
hash fnv score 9.174211e-01 time 0.408sec
hash lennart score 9.160142e-01 time 0.396sec
hash crc score 9.101253e-01 time 0.377sec
hash perl score 8.923864e-01 time 0.355sec
hash mouse score 6.621882e-01 time 0.422sec
hash nemesi score 9.162064e-01 time 0.357sec
[nemesi:~/Desktop/hashtest] blackye% cat filenames /usr/share/dict/words | ./hash s 65536
total items: 353407
hash dumb score 2.677079e-02 time 0.289sec
hash fnv score 8.507784e-01 time 0.420sec
hash lennart score 8.496121e-01 time 0.404sec
hash crc score 4.150457e-01 time 0.382sec
hash perl score 8.514676e-01 time 0.367sec
hash mouse score 2.490107e-01 time 0.426sec
hash nemesi score 7.981116e-01 time 0.365sec
[nemesi:~/Desktop/hashtest] blackye%
Looking at the asm, one can say that the perl hash tends to be slightly
faster on long sequences and on modern machines (it does one
multiplication by a constant inside the loop and one more operation
outside the loop), while mine tends to be slightly faster on short
strings and on older architectures (it causes one more pipeline stall
in the inner loop, but does no multiplication).
The distribution is more or less equally good for both.
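For reference, the perl hash I am talking about is, I believe, the
classic "times 33" string hash with a final shift-add, which is what
matches the one-multiply-in-the-loop, one-extra-op-outside pattern
described above; roughly:

#include <stddef.h>

/* assumed perl-style "times 33" hash, shown only for comparison */
unsigned int
perlhash(unsigned char *str, size_t len)
{
        unsigned int hash;

        hash = 0;
        while (len-- > 0)
                hash = hash * 33 + *str++;  /* one multiply by a constant per byte */
        return (hash + (hash >> 5));        /* one more operation outside the loop */
}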
Would you send me a link to the data suite you used for testing ?
I can test with several different platforms here (PPC G3 and G4,
Alpha 21164, alpha ev7, SGI, Intel, mc68k and others).
Ciao,
Andrea aka Nemesi