Subject: Re: ns32k toolchain
To: Perry E. Metzger <perry@piermont.com>
From: Ian Dall <ian@beware.dropbear.id.au>
List: port-pc532
Date: 08/08/2002 23:26:03
"Perry E. Metzger" <perry@piermont.com> writes:

> Ian Dall <ian@beware.dropbear.id.au> writes:
> > I took a slightly different tack for the pc532. We scale de-normal operands
> > so they won't trap, do the operation and rescale.
> 
> Er, doesn't that violate the whole point of doing arithmetic on
> denorms?
> 
> Or am I misunderstanding what you're doing?

You must be! I get the right answers (at least with the fixed version
I'm about to put in).

Suppose you have A and B which are denorms. Count the number of
leading zeros in the mantissa.  Left shift the mantissa by the number
of leading zeros plus 1 (for the hidden bit). Set the exponent to the
bias value (arithmetically to zero). Call the amounts scaled s1 and
s2. Now do the multiplication with the right rounding mode; there is
plenty of dynamic range so there can't be any overflow or underflow.
Now we need to rescale the result by s1 + s2. If s1 + s2 is more than
the exponent of the result, we will have to generate a denormal
answer. Examine the part of the mantissa to be discarded, to determine
any lost precision and whether to round. Now do the right shift and
round.

Addition and subtraction are conceptually similar, but both operands
must be scaled the same amount. (Make the larger operand have order
of magnitude 1).
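
The common scaling for addition can be illustrated with the standard C
frexp() and ldexp() routines. This is only a sketch of the arithmetic,
assuming finite operands with at least one of them a denorm; the name
scaled_add is made up, and a kernel trap handler would shift the
mantissas directly rather than call library functions.

```c
#include <assert.h>
#include <math.h>

/* Add a and b after scaling both by the same power of two, chosen so
 * that the larger operand lands in [1, 2). */
double scaled_add(double a, double b)
{
    double big = fabs(a) > fabs(b) ? fabs(a) : fabs(b);
    double sum;
    int e, s;

    assert(isfinite(a) && isfinite(b));
    if (big == 0.0)
        return a + b;
    frexp(big, &e);            /* big = f * 2^e with 0.5 <= f < 1 */
    s = 1 - e;                 /* big * 2^s is in [1, 2) */
    sum = ldexp(a, s) + ldexp(b, s);
    return ldexp(sum, -s);     /* undo the common scaling */
}
```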

It is conceptually simple, but a bit tricky to get all the details
right: I had a few off-by-one errors in the shifting.

This does mean that the kernel must be able to use the fpu. However,
since you need the user fpu state to do any emulation and must restore
it afterwards, there is no great overhead. A context switch in the
middle of the trap handling could be an embarrassment, but I don't
think that can happen.

Ian