tech-toolchain: gcc4 misoptimization

Subject: gcc4 misoptimization
To: None <tech-toolchain@netbsd.org>
From: Matthias Drochner <M.Drochner@fz-juelich.de>
List: tech-toolchain
Date: 07/27/2006 21:10:41

Hi -
I just found that lrintf() in libm does not work correctly
when compiled with gcc4 on i386. Successive additions/
subtractions of floats are apparently done in double
precision: the result of the fadd stays on the FPU stack
and goes straight into the fsubp, and only the final
result is stored as a float. This must not happen, because
the code deliberately relies on the loss of precision in
the single-precision intermediate result to accomplish
the rounding.
See the disassembly:
  36:   83 f8 16                cmp    $0x16,%eax
  39:   7f 17                   jg     52 <lrintf+0x52>
  3b:   d9 04 95 00 00 00 00    flds   0x0(,%edx,4)
  42:   89 4d f8                mov    %ecx,0xfffffff8(%ebp)
  45:   d9 45 f8                flds   0xfffffff8(%ebp)
  48:   d8 c1                   fadd   %st(1),%st
  4a:   de e1                   fsubp  %st,%st(1)
  4c:   d9 5d f8                fstps  0xfffffff8(%ebp)

Compiling with -O0 helps, as does the appended patch.
It does not happen with lrint(double), and it also
does not happen on alpha.
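
For illustration, here is a minimal sketch of the rounding trick
and of one possible way to keep the intermediate result in single
precision. It is only a sketch: it is neither the libm source nor
the appended patch, the names two23 and round_via_add are made up,
and it assumes the default round-to-nearest mode and |x| < 2^22.

```c
/* 2^23 with either sign: floats of this magnitude have no fraction bits. */
static const float two23[2] = { 8388608.0f, -8388608.0f };

/* Hypothetical helper, not the libm implementation. */
static long round_via_add(float x)
{
    /*
     * volatile makes the compiler store every intermediate result to
     * memory as a 32-bit float, so the sum is rounded to single
     * precision before the subtraction sees it.
     */
    volatile float t;
    float w = two23[x < 0.0f];  /* pick the constant with the sign of x */

    t = w + x;  /* fraction bits of x are rounded away here */
    t = t - w;  /* exact: recovers x rounded to an integer */
    return (long)t;
}
```

With this, round_via_add(2.5f) gives 2 and round_via_add(3.5f)
gives 4 (ties go to even). If the sum is kept in a wider register
instead, the fraction bits survive the add and the subtraction
just returns x again.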

Is this a known gcc4 bug? Does anyone have a better idea
how to work around this?

best regards
Matthias