NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: lib/58742: Sign of Zero Issue with libc fma(): Positive Zero Returned Instead of Negative



I started adding FreeBSD's fma(3) tests, and adapted this one to use
slightly less magic constants (so we can do the same for float and
long double):

	volatile double a = ldexp(1 - DBL_EPSILON, (DBL_MIN_EXP - 1)/2);
	volatile double b = ldexp(1 + DBL_EPSILON, (DBL_MIN_EXP - 1)/2);
	volatile double c = -DBL_MIN;

That is,

	a = 0x1.ffffffffffffep-512
	b = 0x1.0000000000001p-511
	c = -0x1p-1022

for which a*b + c is a very small number below zero, -0x1p-1126, much
closer to zero than to the smallest negative subnormal, -0x1p-1074.

Stepping through fma(3) with gdb reveals that for these inputs, we end
up at:

    268 	if (r.hi == 0.0) {
    269 		/*
    270 		 * When the addends cancel to 0, ensure that the result has
    271 		 * the correct sign.
    272 		 */
    273 		fesetround(oround);
    274 		{
    275 		volatile double vzs = zs; /* XXX gcc CSE bug workaround */
    276 		return (xy.hi + vzs + ldexp(xy.lo, spread));
    277 		}
    278 	}

https://nxr.netbsd.org/xref/src/lib/libm/src/s_fma.c?r=1.7#268

The local variables at this point are [hexadecimal notation added]:

(gdb) info locals
xs = 0.99999999999999978 [0x1.ffffffffffffep-1]
ys = 0.50000000000000011 [0x1.0000000000001p-1]
zs = -0.5
adj = <optimized out>
xy = {hi = 0.5, lo = -2.4651903288156619e-32 [-0x1p-105]}
r = {hi = 0, lo = 0}
oround = 0
ex = -511
ey = -510
ez = -1021
spread = -1021

Summing xy.hi + zs yields 0 because xy.hi = zs, and scaling
ldexp(xy.lo, spread) is rounded to zero, so it's zero plus zero plus
zero.  The trouble is that the true sum (without rounding in ldexp)
isn't actually zero, so we're not rounding correctly to plus or minus
zero -- or, in non-default rounding modes like FE_DOWNWARD, away from
zero.

It's tempting, in this case, to try something like copysign(0, xy.lo)
or, to handle rounding modes other than FE_TONEAREST, something like
copysign(DBL_DENORM_MIN, xy.lo)/2, but that breaks some other cases.


Home | Main Index | Thread Index | Old Index