tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: radix tree implementation for quota ?



On Sun, Nov 28, 2010 at 08:12:34PM -0500, der Mouse wrote:
> I think the point of a prime table size is that with a hash key that's
> an integral type, as here, then you can often use a prime table size
> and get decent reults from the identity hash function - or, to think of
> it another way, use value%size as your hash function and use the
> identity mapping from hash values to table buckets.  A prime hash table
> size gives about as good hashing under this sort of scheme as anything,
> for most key distributions.

No. The point of a prime module is a simple argument from group theory.
If you have keys {k_1, ..., k_i} that are of the form i * c with gcd(c,n)=1,
you ensure that you get an even distribution and no collision if i < n.
The argument falls apart as soon as you move to a non-linear hash
function. Those tend to be more useful for most applications because linear
hash functions tend to throw away valuable bits and are very susceptible
to funnel issues that way.

> Yes, reduction modulo a prime is generally a substantially slower
> operation than masking off all but the last N bits.  However, if it's
> faster than the hash function you were going to use before doing the
> masking, which is not infrequently the case, it's an overall win.

This isn't necessarily true either. E.g. if it is a compile time
constant and the hash function returns unsigned and the architecture
supports multiplication with large enough output, the modulo operation
will be replaced by two multiplications and a shift. Give that price, it
is often a better idea to use a smarter hash function.

Joerg


Home | Main Index | Thread Index | Old Index