tech-kern archive
Re: KNF and the C preprocessor
On Mon, Dec 10, 2012 at 03:50:00PM -0500, Thor Lancelot Simon wrote:
> On Mon, Dec 10, 2012 at 02:28:28PM -0600, David Young wrote:
> > On Mon, Dec 10, 2012 at 07:37:14PM +0000, David Laight wrote:
> >
> > > a) #define macros tend to get optimised better.
> >
> > Better even than an __attribute__((always_inline)) function?
Consider the following code:
    int ring[100];
    #define ring_end (ring + 100)
    int *ring_ptr;
    int ring_wrap_count;
    #define cmd(n) \
        if (__predict_true(ring_ptr < ring_end)) \
            *ring_ptr++ = n; \
        else { \
            ring_ptr = ring; \
            *ring_ptr++ = n; \
            ring_wrap_count++; \
        }

    for (;;) {
        if (__predict_false(...)) {
            if (...) {
                ....
                cmd(1);
                continue;
            }
            ...
            cmd(2);
            continue;
        }
        ...
    }
I want cmd() expanded twice so that each path contains only the 2
conditional branches plus the conditional branch in cmd(), which
usually goes straight back to the loop top.
So I need to stop the compiler tail-merging the two parts of the
inner 'if'.
There is nothing I can put inside an inline function version of cmd()
that will stop this happening.
In the #define version I can add things that stop the compiler
merging the code. Prizes for thinking what!
(Yes I could do the same in the outer code, but that happens quite
often and I'd much rather hide the hackery in one place.)
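As an illustration only (not necessarily the trick meant above), one
candidate is an empty asm statement whose operand differs at every
expansion site, so the two copies of cmd() are no longer identical. The
sketch below assumes GCC or Clang; NO_MERGE(), PREDICT_TRUE(), RING_SIZE,
push_pair() and demo() are hypothetical names, and whether a given compiler
version really leaves the copies unmerged has to be checked in the
generated assembly.

```c
#include <assert.h>

#if defined(__GNUC__)
#define PREDICT_TRUE(x) __builtin_expect(!!(x), 1)
/* Assumption: an empty asm whose immediate operand is the line number of
 * the expansion site makes each expansion look different to the optimiser. */
#define NO_MERGE() __asm__ __volatile__("" : : "i" (__LINE__))
#else
#define PREDICT_TRUE(x) (x)
#define NO_MERGE() ((void)0)
#endif

#define RING_SIZE 4                 /* small ring so the wrap is easy to see */
static int ring[RING_SIZE];
#define ring_end (ring + RING_SIZE)
static int *ring_ptr = ring;
static int ring_wrap_count;

/* Same logic as the cmd() macro in the mail, wrapped in do/while (0) and
 * ending with the per-site marker so the tails of two expansions differ. */
#define cmd(n) do { \
    if (PREDICT_TRUE(ring_ptr < ring_end)) \
        *ring_ptr++ = (n); \
    else { \
        ring_ptr = ring; \
        *ring_ptr++ = (n); \
        ring_wrap_count++; \
    } \
    NO_MERGE(); \
} while (0)

/* Two expansion sites on different lines, hence two different markers. */
static void push_pair(int a, int b)
{
    cmd(a);
    cmd(b);
}

/* Six stores into a four-entry ring: exactly one wrap. */
static void demo(void)
{
    push_pair(1, 2);
    push_pair(3, 4);
    push_pair(5, 6);
    assert(ring_wrap_count == 1);
}
```

The marker emits no instructions; it only gives the optimiser a statement
that differs between the two sites. Hiding it inside the macro keeps the
hackery in one place, which an inline function cannot do because its body
exists only once in the source.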
And yes, this is a real case from some code where I needed to minimise
the worst case path enough that the extra branch mattered!
The 'unusual' worst case of 'ring wrap' doesn't matter.
I've seen other cases where the code for #define is better than that
for an inline function. Possibly because an extra early optimisation
happens.
I know I've also had issues getting compilers to actually inline stuff.
gcc's __attribute__((always_inline)) helps - I've had to use it to
get static functions that are only called once reliably inlined.
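For comparison, a minimal sketch of what an always_inline function version
of a ring store might look like. The names ALWAYS_INLINE, struct ring,
ring_put() and fill_once() are made up for this example, and the attribute
is only applied when the compiler claims GCC compatibility.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline   /* fallback: only a hint */
#endif

#define RING_SIZE 4

struct ring {
    int buf[RING_SIZE];
    int *ptr;          /* next free slot */
    int wrap_count;    /* number of times ptr was reset to buf */
};

/* Store n, wrapping to the start of the buffer when the end is reached. */
ALWAYS_INLINE void ring_put(struct ring *r, int n)
{
    if (r->ptr >= r->buf + RING_SIZE) {
        r->ptr = r->buf;
        r->wrap_count++;
    }
    *r->ptr++ = n;
}

/* Single caller: without the attribute the compiler is free to leave
 * ring_put() out of line. Returns the number of wraps seen. */
static int fill_once(struct ring *r)
{
    assert(r != NULL);
    r->ptr = r->buf;
    r->wrap_count = 0;
    for (int i = 0; i < RING_SIZE + 1; i++)
        ring_put(r, i);
    return r->wrap_count;
}
```

The attribute forces the call to be expanded, but it gives no control over
what the optimiser does with the expanded copies afterwards.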
> I'd like to submit that neither are a good thing, because human
> beings are demonstrably quite bad at deciding when things should
> be inlined, particularly in terms of the cache effects of excessive
> inline use.
Indeed - there are some horrid large #define macros lurking.
For some of them I can't imagine when they were beneficial.
There have been some places where apparently innocuous #defines
have exploded out of all proportion.
The worst I remember was the SYS vn_rele(): by the time the
original spl() functions had been replaced with lock functions,
and the locks had also become inlined, the whole thing exploded.
> One reason why macros should die is that in the process, inappropriate
> and harmful excessive inlining of code that would perform better if
> it were called as subroutines would die.
That is true whether inline functions or #defines are used.
Are some computer science courses teaching about optimisations that
really haven't been true since the days of the m68k?
David
--
David Laight: david%l8s.co.uk@localhost