tech-kern archive


Re: KNF and the C preprocessor



On Mon, Dec 10, 2012 at 03:50:00PM -0500, Thor Lancelot Simon wrote:
> On Mon, Dec 10, 2012 at 02:28:28PM -0600, David Young wrote:
> > On Mon, Dec 10, 2012 at 07:37:14PM +0000, David Laight wrote:
> > 
> > > a) #define macros tend to get optimised better.
> > 
> > Better even than an __attribute__((always_inline)) function?

Consider the following code:

int ring[100];
#define ring_end (ring + 100)
int *ring_ptr;
int ring_wrap_count;

/* Append n to the ring, wrapping back to the start when the end is reached. */
#define cmd(n) \
        do { \
                if (__predict_true(ring_ptr < ring_end)) \
                        *ring_ptr++ = (n); \
                else { \
                        ring_ptr = ring; \
                        *ring_ptr++ = (n); \
                        ring_wrap_count++; \
                } \
        } while (0)

Used in a loop like this:

        for (;;) {
                if (__predict_false(...)) {
                        if (...) {
                                ....
                                cmd(1);
                                continue;
                        }
                        ...
                        cmd(2);
                        continue;
                }
                ...
        }

I want cmd() expanded twice so that each path contains only the two
outer conditional branches plus, in the usual case, cmd()'s conditional
branch falling straight through to the jump back to the loop top.
So I need to stop the compiler tail-merging the two arms of the
inner 'if'.
There is nothing I can put inside an inline function version of cmd()
that will stop this happening.
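
For contrast, the obvious inline function version (cmd_fn is just my
name for this sketch, not something from the real code):

        static inline void
        cmd_fn(int n)
        {
                if (__predict_true(ring_ptr < ring_end))
                        *ring_ptr++ = n;
                else {
                        ring_ptr = ring;
                        *ring_ptr++ = n;
                        ring_wrap_count++;
                }
        }

The function body is preprocessed once, at its definition, so every
inlined copy is identical apart from the value of n, which the compiler
can keep in a register.  Even an asm statement placed in the body
appears identically in each copy, so the cross-jumping pass is still
free to fold the two copies into one reached via an extra branch.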

In the #define version I can add things that stop the compiler
merging the code.  Prizes for working out what!
(Yes, I could do the same in the outer code, but that pattern occurs
quite often and I'd much rather hide the hackery in one place.)
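
One trick that works (a sketch only: STR_(), STR() and no_merge() are
names invented for illustration, and the '#' comment character assumes
x86 gas syntax) is to make each expansion textually unique.  __LINE__
inside a macro expands at the call site, so the asm string differs
between the cmd(1) and cmd(2) expansions and the tails are no longer
identical:

        #define STR_(x)         #x
        #define STR(x)          STR_(x)
        /* Expands to a different asm comment at every call site. */
        #define no_merge()      __asm volatile("# " __FILE__ ":" STR(__LINE__))

        #define cmd(n) \
                do { \
                        if (__predict_true(ring_ptr < ring_end)) \
                                *ring_ptr++ = (n); \
                        else { \
                                ring_ptr = ring; \
                                *ring_ptr++ = (n); \
                                ring_wrap_count++; \
                        } \
                        /* In the tail that would otherwise be merged. */ \
                        no_merge(); \
                } while (0)

An inline function gets no such escape: __LINE__ in its body is fixed
when the function is defined, so every inlined copy still matches.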

And yes, this is a real case from some code where I needed to minimise
the worst-case path enough that the extra branch mattered!
The 'unusual' worst case of a ring wrap doesn't matter.

I've seen other cases where the code generated for a #define is better
than that for an inline function, possibly because an extra early
optimisation happens.

I know I've also had issues getting compilers to actually inline stuff.
gcc's __attribute__((always_inline)) helps - I've had to use it to
get static functions that are only called once reliably inlined.
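
The shape that has worked for me (just a sketch; the attribute is
normally paired with the inline keyword, and init_ring is an arbitrary
example name, not from the code above):

        /* Inlined even when gcc's heuristics would otherwise refuse. */
        static inline void __attribute__((__always_inline__))
        init_ring(void)
        {
                ring_ptr = ring;
                ring_wrap_count = 0;
        }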

> I'd like to submit that neither are a good thing, because human
> beings are demonstrably quite bad at deciding when things should
> be inlined, particularly in terms of the cache effects of excessive
> inline use.

Indeed - there are some horrid large #define macros lurking.
For some of them I can't imagine when they were ever beneficial.

There have been some places where apparently innocuous #defines
have exploded out of all proportion.
The worst I remember was the SYS vn_rele(): by the time the
original spl() calls had been replaced with lock functions,
and the locks had themselves become inlined, the whole thing exploded.
 
> One reason why macros should die is that in the process, inappropriate
> and harmful excessive inlining of code that would perform better if
> it were called as subroutines would die.

That is true whether inline functions or #defines are used.

Are some computer science courses still teaching optimisations that
haven't really been valid since the days of the m68k?

        David

-- 
David Laight: david%l8s.co.uk@localhost

