Subject: Re: diff to speed up fdalloc using two-level bitmaps
To: Lennart Augustsson <lennart@augustsson.net>
From: Niels Provos <provos@citi.umich.edu>
List: tech-perform
Date: 10/29/2003 01:59:13
On Tue, Oct 28, 2003 at 11:25:59AM -0500, Niels Provos wrote:
> It could need some fine tuning
> 
>  http://www.citi.umich.edu/u/provos/benchmark/netbsd-fdalloc-zoom.jpg
> 
> An inline assembly function for finding the first zero bit in a
> word would be a great thing to have across architectures.

The same benchmark with the following inline assembly

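/*
 * Find the first zero bit in word (bit 0 is the least significant).
 * bsfl leaves its destination undefined when the source is zero, so
 * the caller must make sure that word != 0xffffffff.
 */
static __inline__ uint32_t ffz(uint32_t word)
{
        __asm__("bsfl %1,%0"
                :"=r" (word)
                :"r" (~word));
        return word;
}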

looks much better:

  http://www.citi.umich.edu/u/provos/benchmark/netbsd-fdalloc-zoom.jpg

Now, it is faster than before in all cases.  It would be great if we
could have a

 <machine/bitops.h>

include file that would define similar inline assembly for all
architectures or the equivalent C code if there is no instruction
support.
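
A minimal sketch of what the "equivalent C code" fallback in such a
header might look like. The name ffz_generic and the choice to return
32 for an all-ones word are assumptions made for illustration only;
they are not part of the diff or of any existing header.

```c
#include <stdint.h>

/*
 * Hypothetical portable fallback for ffz(): returns the index of the
 * lowest zero bit in word (bit 0 is the least significant), or 32 if
 * every bit is set.  Uses a binary search on the inverted word.
 */
static inline uint32_t ffz_generic(uint32_t word)
{
        uint32_t inv = ~word;   /* zero bits in word become one bits */
        uint32_t bit = 0;

        if (inv == 0)
                return 32;      /* assumed convention: no zero bit */
        if ((inv & 0xffff) == 0) { bit += 16; inv >>= 16; }
        if ((inv & 0xff) == 0)   { bit += 8;  inv >>= 8; }
        if ((inv & 0xf) == 0)    { bit += 4;  inv >>= 4; }
        if ((inv & 0x3) == 0)    { bit += 2;  inv >>= 2; }
        if ((inv & 0x1) == 0)    { bit += 1; }
        return bit;
}
```

Unlike the bsfl version, this sketch gives a defined result for a full
word, at the cost of a few compares and shifts per call.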

Niels.