Subject: Re: diff to speed up fdalloc using two-level bitmaps
To: Jason Thorpe <thorpej@wasabisystems.com>
From: Bang Jun-Young <junyoung@netbsd.org>
List: tech-perform
Date: 10/30/2003 01:04:49
On Wed, Oct 29, 2003 at 06:37:10AM -0800, Jason Thorpe wrote:
>
> On Wednesday, October 29, 2003, at 03:48 AM, Bang Jun-Young wrote:
>
> >BTW, how about making ffs(9) inline as well? Calling overhead seems
> >to be quite high on i386...
>
> GCC will already inline ffs() if the CPU back-end provides the
> appropriate pattern. The right answer, if GCC is not doing it on i386,
> would be to add that pattern to i386.md.
I looked further and found that ffs() was properly inlined in the
kernel:
c01c02a4 <fdalloc>:
[snip]
c01c0348:  83 fa ff            cmp    $0xffffffff,%edx
c01c034b:  0f 84 5e 01 00 00   je     c01c04af <fdalloc+0x20b>
c01c0351:  f7 d2               not    %edx
c01c0353:  c1 e3 05            shl    $0x5,%ebx
c01c0356:  31 c0               xor    %eax,%eax
c01c0358:  0f bc d2            bsf    %edx,%edx
c01c035b:  0f 94 c0            sete   %al
c01c035e:  f7 d8               neg    %eax
So it seems clear that the small speedup from the inlined ffz() shown in
provos' graph was due to an incomplete implementation, isn't it?
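
For reference, the inlined sequence above corresponds roughly to the
C below. This is only a sketch under stated assumptions, not the actual
fdalloc() code: first_zero_bit() and fd_index() are hypothetical names,
and the shl $0x5 is assumed to scale a bitmap word index by 32 bits per
word.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>    /* ffs() */

/*
 * Hypothetical helper: 0-based position of the lowest clear bit in
 * `word`, or -1 if every bit is set.  The all-ones check mirrors the
 * "cmp $0xffffffff / je" pair; the complement mirrors "not %edx";
 * ffs() is what GCC expands into the bsf/sete/neg sequence.
 */
static int
first_zero_bit(uint32_t word)
{
	if (word == 0xffffffffu)
		return -1;		/* no free bit in this word */
	/* ffs() is 1-based, so subtract 1 to get a bit index. */
	return ffs((int)~word) - 1;
}

/*
 * Hypothetical helper: combine a bitmap word index with the free bit
 * found in that word.  Assumes 32 bits per word, which is what the
 * "shl $0x5" appears to compute.  Returns -1 if the word is full.
 */
static int
fd_index(int word_index, uint32_t word)
{
	int bit = first_zero_bit(word);

	if (bit < 0)
		return -1;
	return (word_index << 5) + bit;
}
```

With these definitions, fd_index(2, 0x7) yields 67: bits 0 to 2 of the
third word are taken, so the first free slot is bit 3 of word 2.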
Jun-Young
--
Bang Jun-Young <junyoung@NetBSD.org>