Subject: gcc4 misoptimization
To: None <email@example.com>
From: Matthias Drochner <M.Drochner@fz-juelich.de>
Date: 07/27/2006 21:10:41
Just found that lrintf() in libm doesn't work correctly
when compiled with gcc4 on i386. Successive additions and
subtractions of floats are apparently done in double
precision. That is wrong here: in single precision the
addition loses the low-order bits, and the code deliberately
relies on that loss of precision to accomplish rounding.
See the disassembly:
  36:   83 f8 16                cmp    $0x16,%eax
  39:   7f 17                   jg     52 <lrintf+0x52>
  3b:   d9 04 95 00 00 00 00    flds   0x0(,%edx,4)
  42:   89 4d f8                mov    %ecx,0xfffffff8(%ebp)
  45:   d9 45 f8                flds   0xfffffff8(%ebp)
  48:   d8 c1                   fadd   %st(1),%st
  4a:   de e1                   fsubp  %st,%st(1)
  4c:   d9 5d f8                fstps  0xfffffff8(%ebp)
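Note that fadd and fsubp both operate on x87 registers, and the only conversion to single precision is the final fstps. To make the effect concrete, here is a minimal C sketch. It is not the libm source and not the appended patch; the helper names round_single and round_wide and the single constant TWO23 are assumptions made for illustration, and the sketch only covers non-negative inputs below 2^23. The first helper forces each intermediate into a 32-bit float with volatile, the second emulates intermediates kept in wider precision.

```c
/* 2^23: floats in [2^23, 2^24) are spaced exactly 1.0 apart. */
static const float TWO23 = 8388608.0f;

/*
 * Hypothetical helper (not the libm code): round x, with 0 <= x < 2^23,
 * to an integral value using the current rounding mode.
 * The volatile temporaries force every intermediate result to be
 * stored as a single-precision float.
 */
static float round_single(float x)
{
    volatile float w = TWO23 + x;  /* fraction bits of x are rounded away */
    volatile float t = w - TWO23;  /* exact; leaves the rounded value */
    return t;
}

/*
 * Hypothetical helper: emulates what happens when the add and the
 * subtract are both carried out in wider precision and only the
 * final result is converted back to float.
 */
static float round_wide(float x)
{
    double w = (double)TWO23 + (double)x;  /* exact in double, nothing lost */
    double t = w - (double)TWO23;          /* gives back x unchanged */
    return (float)t;
}
```

With the default round-to-nearest mode, round_single(2.5f) yields 2.0f, while round_wide(2.5f) returns 2.5f unchanged, which is the behaviour the disassembly above suggests.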
Compiling with -O0 helps, as does the appended patch.
It does not happen with lrint(double), and it also
does not happen on alpha.
Is this a known gcc4 bug? Does anyone have a better idea
of how to work around this?