[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: radix tree implementation for quota ?
On Mon, 29 Nov 2010 09:57:18 +0000
David Laight <david%l8s.co.uk@localhost> wrote:
> What about the sequence prime, prime*2, prime*3 etc.
> Unless you are going to search for a factor that works, no particular
> value is any better than any other if you allow pathaogical data
> to be selected.
Only that keys which are multiples of some even number occur much more
frequently. Yes you need to have a knowledge of what type of data you are
hashing, but in most cases hash tables work very well.
> Even searching for a factor may not help.
> I looked at the distribution for the elf symbol table in libc.
> The current 'first prime below the power of 2' was no different
> from the '2^n-1' value - except that it is a lot slower to compute.
Yes when hashing strings, power of 2 hash tables work just as well. With
integers you have to be more careful.
> I would always look for a deterministic algorythm. Hashing to linked
> lists is still o(n), with a hash table with 'h' entries it is just
> 'h' times faster, and then only if the linked lists are equal length.
> Trying to get hash chains of length 1 or 2 requires a lot of knowledge
> of the data in order to generate a suitable hash function.
> And then calculating the hash might be more expensive than a few
If you are hashing integer values, you can tune the hash table so that every
bucket will have no more than 1 entry. Usually you simply make hash table size
a prime number.
As for calculating the hash, modern hardware has fast modulo instructions. A
very simple test on Intel Atom D510 processor, seems to indicate a single
'interger % prime' is roughly equal to two integer operations.
Main Index |
Thread Index |