tech-toolchain: Re: gcc4 misoptimization

Subject: Re: gcc4 misoptimization
To: None <tech-toolchain@NetBSD.org>
From: Alan Barrett <apb@cequrux.com>
List: tech-toolchain
Date: 08/01/2006 14:45:11

On Mon, 31 Jul 2006, Richard Earnshaw wrote:
> If you look at the most common 'bugs' reported on GCC
> (http://gcc.gnu.org/bugzilla/duplicates.cgi), you'll see that this one
> comes out at number 4 (bug 323).  What's more, it's the only one in the
> top ten that isn't closed.

It's not clear to me that that's exactly the same issue, but it does
seem to be at least closely related.  Comment number 70 on that bug
report is revealing:  It seems to say that it is a real gcc bug that
will probably never be fixed.

> Your attempt to work around the problem using a cast isn't guaranteed
> to work in future releases of the compiler (the cast is really a NOP);
> the volatile suggestion is slightly better, but it will slow things
> down on every CPU.

I don't think the cast is a NOP (per the C99 standard, regardless of
what gcc does).  Section 5.2.4.2.2 paragraph 8 of the C99 standard gives
the implementation permission to evaluate floating point expressions in
a format with greater range and precision than the type requires, and
with just "x = x + FOO - FOO" that permission would apply to the
intermediate result of "x + FOO".  However, section 5.1.2.3 paragraph 12
(example 4) says that casts and assignments must perform their specified
conversions even if the implementation is using what they call "wide
registers", so "x = (float)(x + FOO) - FOO" is different: the cast
forces the sum to be rounded to float before the subtraction.
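To make the distinction concrete, here is a minimal sketch of the two
forms.  The value of FOO (2^23) and the function names are my own
assumptions for illustration, not taken from the code under discussion;
with that FOO, rounding the sum to float rounds a small non-negative x
to the nearest integer.

```c
#include <float.h>

/* Assumed value for illustration: 2^23.  A float in [2^23, 2^24) has no
 * fraction bits, so rounding (x + FOO) to float rounds x to an integer
 * when 0 <= x < 2^23. */
#define FOO 8388608.0f

/* No cast: C99 5.2.4.2.2 lets the implementation keep (x + FOO) in a
 * wider format, in which case the subtraction may simply give x back. */
float round_nocast(float x)
{
    x = x + FOO - FOO;
    return x;
}

/* With the cast: per C99 5.1.2.3 example 4, the sum must be converted
 * to float before FOO is subtracted. */
float round_cast(float x)
{
    x = (float)(x + FOO) - FOO;
    return x;
}

/* FLT_EVAL_METHOD (from <float.h>) reports how the implementation
 * evaluates float expressions: 0 means to the type's own precision,
 * 1 or 2 mean a wider format is used. */
int eval_method(void)
{
    return FLT_EVAL_METHOD;
}
```

On a conforming implementation in the default rounding mode,
round_cast(1.3f) should give 1.0f whatever eval_method() returns;
round_nocast(1.3f) is only guaranteed to do so when floats are
evaluated in their own precision.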

> The safest solution is probably to compile this one file with
> -ffloat-store, which is the way gcc recommends working around this
> problem.

But that will slow down all floating point operations in that file, not
just the one operation that needs the extra store/load cycle.
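For comparison, a sketch of what confining the store/load to a single
operation could look like, using a volatile temporary along the lines
of the volatile suggestion quoted above.  The function name and the
value of FOO (2^23) are assumptions of mine for illustration, and this
is not necessarily how the original code would be changed.

```c
/* Assumed value for illustration: 2^23, so that rounding (x + FOO) to
 * float rounds a small non-negative x to the nearest integer. */
#define FOO 8388608.0f

float round_via_volatile(float x)
{
    /* The volatile object has to be written to and read back from
     * memory as a float, so only this one intermediate result pays for
     * the store/load; other expressions in the file are unaffected. */
    volatile float sum = x + FOO;

    return sum - FOO;
}
```

The cost noted in the quoted text still applies: the store/load happens
on every CPU, including those that never use wide registers.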

--apb (Alan Barrett)