Subject: Re: gcc4 misoptimization
To: Alan Barrett <apb@cequrux.com>
From: Richard Earnshaw <rearnsha@arm.com>
List: tech-toolchain
Date: 07/31/2006 11:15:44

On Sat, 2006-07-29 at 19:05, Alan Barrett wrote:
> [ Re compiler evaluating this using an extended precision:
>                 /* all variables are float */
>                 x += TWO52[s];
>                 x -= TWO52[s];
> ]
> On Thu, 27 Jul 2006, Martin Husemann wrote:
> > I think the compiler is allowed to do this optimization, and the volatile
> > temporary you added is the right way to instruct it not to do so.
> 
> What part of the C99 standard gives the compiler permission to do
> that?  Section 6.5 paragraph 8 seems to apply only to expressions, so
> operations in different statements (separated by a sequence point) would
> not appear to be covered there.  Section 5.2.4.2.2 paragraph 8 seems
> unclear to me, but perhaps it can be read as allowing the compiler to
> keep extended precision results across sequence points.
> 
> My experiments with gcc-4.1.2 in NetBSD/i386 seem to show that simply
> casting the intermediate result to (float) is sufficient to prevent the
> unwanted behaviour:
> 
>                 x = (float)(x + TWO52[s]) - TWO52[s];

It's not quite as simple as that (see my previous reply).  Essentially,
the problem here is that the compiler can't meet the user's
expectations without making the generated code unacceptably bad.

If you look at the most common 'bugs' reported on GCC
(http://gcc.gnu.org/bugzilla/duplicates.cgi), you'll see that this one
comes out at number 4 (bug 323).  What's more, it's the only one in the
top ten that isn't closed.

Your attempt to work around the problem using a cast isn't guaranteed to
work in future releases of the compiler (the compiler treats the cast as
a NOP); the volatile suggestion is slightly better, but it forces a
store and a reload, so it will slow things down on every CPU, including
those that never use extended precision.
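
As a rough sketch of what the volatile workaround looks like (this is
not the actual libm source): the table name TWO23, the function name and
the range check are made up for illustration, and the constant is 2^23
because that is the point at which a float has no fraction bits left.

```c
#include <assert.h>

/* Hypothetical stand-in for the table in the quoted code: +/- 2^23. */
static const float TWO23[2] = { 8388608.0f, -8388608.0f };

/*
 * Round x to an integral value by adding and then subtracting 2^23.
 * Assumes |x| < 2^23 and the default round-to-nearest mode.
 */
float round_with_volatile(float x)
{
    int s = x < 0.0f;           /* pick the constant with the sign of x */
    volatile float t;           /* volatile: the sum must go to memory */

    assert(x > -8388608.0f && x < 8388608.0f);

    t = x + TWO23[s];           /* the store rounds the sum to float */
    return t - TWO23[s];        /* the fraction bits are gone by now */
}
```

The store to t is what discards any extra precision; the price is a
store and a reload even on machines that would have got the right
answer without it.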

The safest solution is probably to compile this one file with
-ffloat-store, which is the workaround GCC recommends for this
problem.
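
Something along these lines, as a sketch only: the file name, the
function name and the TWO23 table are invented for the example, and the
command line in the comment assumes the file is built by hand rather
than through the normal makefiles.  The arithmetic is kept as two
separate assignments because, as I understand the option, it acts on
floating-point variables rather than on intermediate results inside a
single expression.

```c
/*
 * round_store.c (hypothetical file name): keep only the
 * precision-sensitive routine in this file and build it with
 *
 *     gcc -O2 -ffloat-store -c round_store.c
 *
 * so the rest of the library is compiled without the option.
 */
#include <assert.h>

/* Hypothetical stand-in for the table in the quoted code: +/- 2^23. */
static const float TWO23[2] = { 8388608.0f, -8388608.0f };

/* Assumes |x| < 2^23 and the default round-to-nearest mode. */
float round_with_float_store(float x)
{
    int s = x < 0.0f;           /* pick the constant with the sign of x */

    assert(x > -8388608.0f && x < 8388608.0f);

    x += TWO23[s];              /* with -ffloat-store, x is stored as a float here */
    x -= TWO23[s];
    return x;
}
```

The source stays free of volatile, so targets that never use extended
precision pay nothing when the file is built without the option.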

R.