Subject: Re: gcc4 misoptimization
To: Alan Barrett <apb@cequrux.com>
From: Richard Earnshaw <rearnsha@arm.com>
List: tech-toolchain
Date: 08/01/2006 16:02:36
On Tue, 2006-08-01 at 13:45, Alan Barrett wrote:
> On Mon, 31 Jul 2006, Richard Earnshaw wrote:
> > If you look at the most common 'bugs' reported on GCC
> > (http://gcc.gnu.org/bugzilla/duplicates.cgi), you'll see that this one
> > comes out at number 4 (bug 323).  What's more, it's the only one in the
> > top ten that isn't closed.
> 
> It's not clear to me that that's exactly the same issue, but it does
> seem to be at least closely related.  Comment number 70 on that bug
> report is revealing:  It seems to say that it is a real gcc bug that
> will probably never be fixed.
> 
Looking at the original code that was posted, which showed the add and
the subtract operations still being present, makes me fairly certain
that it is this issue.

> > Your attempt to work around the problem using a cast isn't guaranteed
> > to work in future releases of the compiler (the cast is really a NOP);
> > the volatile suggestion is slightly better, but it will slow things
> > down on every CPU.
> 
> I don't think the cast is a NOP (per the C99 standard, regardless of
> what gcc does).  Section 5.2.4.2.2 paragraph 8 of the C99 standard gives
> the implementation permission to evaluate floating point expressions in
> a format that's more accurate than one might expect, and with just "x =
> x + FOO - FOO" that permission would apply.  However, section 5.1.2.3
> paragraph 12 (example 4) says that casts and assignments must do the
> right thing even if the implementation is using what they call "wide
> registers", so "x = (float)(x + FOO) - FOO" is different.
> 

Sorry, I wasn't precise about which forms I meant were the same.  It's
correct that

	x = (x + FOO) - FOO;

is not the same as the case you show with the cast.  However, if x is of
type 'float', then the version with the cast is the same as

	x = x + FOO;
	x = x - FOO;

since there is an implicit conversion to float at the end of the first
statement.  This is simply the floating point dual of

	int a; unsigned char c = 0;

	c = c - 1;
	a = c;

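The correct value for a here is 255, not -1, because the result of the
subtraction is converted back to unsigned char when it is assigned to c;
while

	a = c - 1;

sets a to -1 (since the value of c is promoted to int first, and the
result is never narrowed again).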
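
To make the integer case concrete, here is a minimal sketch of the two
forms as functions.  The function names are made up for illustration,
and the value 255 assumes an 8-bit unsigned char.

```c
/* Hypothetical helper names; they only illustrate the two forms above. */

/* c - 1 is computed as an int, then converted back to unsigned char
 * by the assignment, and only then widened to int for the return. */
int narrow_then_widen(unsigned char c)
{
	c = c - 1;
	return c;
}

/* c is promoted to int and the result stays an int throughout;
 * nothing ever converts it back to unsigned char. */
int widen_only(unsigned char c)
{
	return c - 1;
}
```

Called with 0, the first should give 255 and the second -1.  The
assignment in the first function plays the same role as the store to a
float variable in the floating point case.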


> > The safest solution is probably to compile this one file with
> > -ffloat-store, which is the way gcc recommends working around this
> > problem.
> 
> But that will slow down all operations in that file, not just the one
> operation that needs the extra store/load cycle.

Correct.  Unfortunately, that's the only way that GCC documents for
working around this 'bug'; and therefore the only way that isn't really
trying to play 'confuse the optimizer'.
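
For comparison, the volatile suggestion from earlier in the thread might
look something like the sketch below.  This is not a documented GCC
workaround.  The function name is made up, and the value of FOO is an
assumption (1.5 * 2^23, a common choice for this kind of rounding
trick), since the actual constant isn't shown in this message.

```c
/* Assumed value: the real FOO isn't given in this message. */
#define FOO 12582912.0f		/* 1.5 * 2^23 */

/* Hypothetical helper: rounds x to an integral value by adding and
 * subtracting FOO.  Only meaningful for |x| well below 2^22. */
float round_via_volatile(float x)
{
	/* The volatile store has to happen, so the sum is rounded to
	 * float here even if the expression itself was evaluated in a
	 * wider register. */
	volatile float t = x + FOO;

	return t - FOO;
}
```

Only the one intermediate pays for the extra store/load, rather than
every floating point operation in the file as with -ffloat-store; but it
pays for it on every CPU, including those that don't need it.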

R.