tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: radix tree implementation for quota ?



On Mon, Nov 29, 2010 at 10:44:11AM +0000, Sad Clouds wrote:
> As for calculating the hash, modern hardware has fast modulo
> instructions. A very simple test on Intel Atom D510 processor, seems
> to indicate a single 'interger % prime' is roughly equal to two
> integer operations.

No, divisions and modulo are still very slow. According the Intel
Architecture Guide, div has a latency of at least 56 cycles compared to
less than 20 cycles for all integer multiplication instructions.
Throughput is at best 1/4 as well. Please check your test case (and the
assembler output) carefully for not hitting the unsigned division
optimisation I mentioned earlier...

Joerg


Home | Main Index | Thread Index | Old Index