Subject: Re: -mieee option
To: None <current-users@netbsd.org, fair@clock.org>
From: Ross Harvey <ross@ghs.com>
List: current-users
Date: 01/14/1999 20:37:47
> From: "Erik E. Fair" <fair@clock.org>
> Subject: Re: -mieee option
>
> Perhaps we could encourage a college or university to use NetBSD as a case
> study of floating point implementation for a Computer Science course in
> Numerical Analysis. If they audit and verify our implementations, we'd get
> a serious checkout, and they'd have a target to learn from...

Ack! No, please...

It's the consumers of floating point cycles that matter, here.

Also, the rationale for the compromises made on the RISC processors
is very much rooted in the difference between engineering and theory.

To play devil's advocate for a second: the argument _for_ the most costly
ieee fp "feature", ops on denorms, runs like this:

	it would be nice if certain mathematical identities held down to
	the 10^-38 format limit, which is another way of saying that it
	would be nice if the results of calculations on arbitrary
	quantities were off by no more than the unavoidable gap between
	two adjacent representable quantities ... i.e., roundoff error

		so, ideally, within roundoff error:
			[1] (x-y) + y ~= x
			[2] x - y != 0 iff x != y
			[3] 1/(1/x) ~= x

	If you underflow to zero instead of to a denorm, some of these
	guarantees start to break down, for certain test cases, at around
	10^-31, whereas with gradual underflow (doing ops on denorms) you
	can preserve them down to 10^-38 (but not below: contrary to popular
	belief, ieee FP doesn't get you nice arithmetic properties on data
	below the format limit)
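
For anyone who wants to poke at this, here is a minimal sketch of the
subtraction case in single precision. It is only an illustration under
stated assumptions: to_f32() and flush() are made-up helper names, flush()
is a crude software model of flush-to-zero rather than any particular
processor's behaviour, and single precision is emulated by rounding Python
doubles through the struct module.

```python
import struct

FLT_MIN = 2.0 ** -126   # smallest normal single, about 1.18e-38

def to_f32(v):
    """Round to the nearest IEEE single, keeping denorms (gradual underflow)."""
    return struct.unpack("f", struct.pack("f", v))[0]

def flush(v):
    """Hypothetical flush-to-zero model: any denormal result becomes 0."""
    v = to_f32(v)
    return 0.0 if abs(v) < FLT_MIN else v

x = to_f32(1.5 * FLT_MIN)
y = to_f32(FLT_MIN)

# Gradual underflow: x - y is the denorm 2^-127, and identity [1] holds exactly.
d_gradual = to_f32(x - y)
back_gradual = to_f32(d_gradual + y)

# Flush to zero: x - y collapses to 0 although x != y, so [1] and [2] both fail.
d_flush = flush(x - y)
back_flush = flush(d_flush + y)

# Where 10^-31 comes from: a single has 24 significant bits, so adjacent
# singles just above 2^-103 are exactly FLT_MIN apart; below that point the
# difference between neighbours is itself a denorm.
threshold = FLT_MIN * 2.0 ** 23   # 2^-103, roughly 1e-31

print(d_gradual != 0.0, back_gradual == x)   # True True
print(d_flush == 0.0, back_flush == x)       # True False
print(threshold)
```

With flushing, (x-y) + y comes back as y, which is off from x by a third,
nowhere near roundoff.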

OK, that's the argument _for_. It has a certain theoretical appeal, but an
engineer might say any or all of the following:

	* in the real world, the data is rarely valid beyond two or three
	  significant figures anyway, and _never_ to the last bit; besides,
	  other things are hurting the last bit anyway

	* you are just moving the limit from 10^-31 to 10^-38; if 10^-31 isn't
	  good enough, 10^-38 probably isn't either, and if it is, just go
	  to double precision and call it a day

	* if underflow is such a problem, you need another bit in the exponent

	* if precision or exponent range is such a problem, the calculation
	  should be done in (duhhh) double precision, giving you an exponent
	  range down to 10^-308 (!)
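
A rough illustration of the "just go to double" point, assuming Python
floats are IEEE doubles (true on the usual platforms, but an assumption
all the same). The values are picked so that their difference would be a
denorm in single precision; in double it is an ordinary normal number
with a huge amount of exponent headroom left.

```python
import sys

FLT_MIN = 2.0 ** -126          # smallest normal single, about 1.18e-38
DBL_MIN = sys.float_info.min   # smallest normal double, about 2.2e-308

x = 1.5 * FLT_MIN
y = FLT_MIN
d = x - y                      # 2^-127: a denorm in single, but normal in double

print(d >= DBL_MIN)            # True
print((d + y) == x)            # True
```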

Furthermore, there is an unavoidable correlation between the number of bits
in the operands and the cost and speed of the functional units. Denorms
effectively require much wider normalized operands. An engineer might also
say: for the cost and time penalty of doing _that_, I could give you 128-bit
arithmetic in HW.

Sigh, I hope I haven't put everyone to sleep. Please rest assured that your
faithful NetBSD developers do understand something on the subject...hey, wake
up! 

	Ross.Harvey@Computer.Org