tech-userlevel archive


RE: aligned_alloc c11 function

> Then talking about portability here I guess you mean machine portability?
> I worry about another portability as well, compiler portability. My
> problem with ilog2 in bitops.h is that it uses __builtin_constant_p which
> is a GNU extension and thus not compiler portable. From a performance
> perspective there is also a problem: if the compiler doesn't think the
> argument is a built-in constant, it will use fls32 in the if every time
> aligned_alloc runs. This is a higher performance penalty than the while
> loop that Robert suggested.
> (also see Robert's mail)
> Due to the above considerations I think I will commit aligned_alloc.c with
> the while (without the preprocessor code).

That's fine.

Since you enquired about my portability assumptions, I'll continue with one
more email.

I think that a file that is going to be compiled for the NetBSD distribution
will work with a compiler that has no portability issues with sys/bitops.h.
(If sys/bitops.h doesn't work, then the chosen compiler is broken as a
NetBSD compiler. ilog2() already carries the implicit guarantee that it
yields a compile-time constant when its argument is a compile-time constant
expression; that's part of the source level "API".)

If I want code to be portable across operating systems, I'm much more
radical. I don't use any of the <*.h> header files in the source file
itself; everything has to have wrappers. This includes all system calls, and
even the supposedly portable C89 (etc.) header files. This is because I never
use #if in ".c" files unless it's utterly unavoidable (sort of the same as
my attitude towards goto, but from a different set of considerations), and
in order to control namespace issues. (I liked Mouse's recent wisecrack
about that -- was that on this thread? Or am I off by one again?)

If I had been thinking about writing aligned_alloc() as a portable
cross-system, compiler-independent function, I would have defined something
like the following macros (possibly in this file, possibly in a
subsystem-specific header file):

/* LG2_2() could also be ((x) & 2) >> 1, but we write it consistently
   with the other terms. */
#define LG2_2(x)	((x) &                  2ull ?  1 : 0)
#define LG2_4(x)	((x) &                0xCull ?  2 + LG2_2((x) >> 2) : LG2_2(x))
#define LG2_8(x)	((x) &               0xF0ull ?  4 + LG2_4((x) >> 4) : LG2_4(x))
#define LG2_16(x)	((x) &             0xFF00ull ?  8 + LG2_8((x) >> 8) : LG2_8(x))
#define LG2_32(x)	((x) &         0xFFFF0000ull ? 16 + LG2_16((x) >> 16) : LG2_16(x))
#define LG2_64(x)	((x) & 0xFFFFFFFF00000000ull ? 32 + LG2_32((x) >> 32) : LG2_32(x))
#define ilog2(x)	((x) ? LG2_64(x) : -1)

(Actually, I might have used the bit complements of the above constants
especially for the largest masks, but this way makes it clear, since I think
nobody is actually going to be using this.) No gcc hints, no NetBSD include
files. The results are compile-time constants as before.
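
As a minimal sketch of how the compile-time claim could be checked, assuming
a C11 compiler: the macros are repeated so the fragment stands alone, with
one extra pair of parentheses around each mask test purely to avoid
precedence warnings from some compilers. The names EXAMPLE_PAGE_SHIFT and
ilog2_loop() are hypothetical and exist only for this illustration;
ilog2_loop() is one possible form of the run-time shift loop, not
necessarily the code that was committed. Arguments carry a ull suffix so
that every shift happens in a 64-bit type.

```c
#include <assert.h>	/* C11: provides the static_assert macro */

/* Same macros as above; the mask tests are parenthesized only to keep
   precedence warnings quiet. */
#define LG2_2(x)	(((x) &                  2ull) ?  1 : 0)
#define LG2_4(x)	(((x) &                0xCull) ?  2 + LG2_2((x) >> 2) : LG2_2(x))
#define LG2_8(x)	(((x) &               0xF0ull) ?  4 + LG2_4((x) >> 4) : LG2_4(x))
#define LG2_16(x)	(((x) &             0xFF00ull) ?  8 + LG2_8((x) >> 8) : LG2_8(x))
#define LG2_32(x)	(((x) &         0xFFFF0000ull) ? 16 + LG2_16((x) >> 16) : LG2_16(x))
#define LG2_64(x)	(((x) & 0xFFFFFFFF00000000ull) ? 32 + LG2_32((x) >> 32) : LG2_32(x))
#define ilog2(x)	((x) ? LG2_64(x) : -1)

/* static_assert and enumerator values both require integer constant
   expressions, so these lines compile only if ilog2() folds at compile
   time. */
static_assert(ilog2(0ull) == -1, "ilog2(0) is -1");
static_assert(ilog2(1ull) == 0, "ilog2(1) is 0");
static_assert(ilog2(1000ull) == 9, "ilog2 rounds down");
static_assert(ilog2(0x8000000000000000ull) == 63, "top bit");

/* Hypothetical constant, for illustration only. */
enum { EXAMPLE_PAGE_SHIFT = ilog2(4096ull) };	/* 12 */

/* The bit-complement spelling of the largest mask is the same value. */
static_assert(~0xFFFFFFFFull == 0xFFFFFFFF00000000ull, "complement mask");

/* Hypothetical run-time equivalent for comparison: count how many times
   the value can be shifted right before it reaches zero. */
int
ilog2_loop(unsigned long long v)
{
	int n = -1;

	while (v != 0) {
		v >>= 1;
		n++;
	}
	return n;
}
```

The loop and the macro agree on every input; the difference is only that
the macro costs nothing at run time when its argument is a constant.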

I always avoid wasted motion at run time if I possibly can, because I think
wasted motion is a very bad habit to get into. A wasted clock cycle is lost
forever. I don't think that's a silly point at all.

Best regards,
