Subject: Re: Floating point performance
To: Neil A Carson <neil@IVISION.CO.UK>
From: Ale Terlevich <A.I.Terlevich@DURHAM.AC.UK>
List: port-arm32
Date: 06/28/1996 18:32:40
> Hello all,
> 
> I am looking at ways to optimise the FPE at the moment, and have a few
> ideas.  However before I go ahead I need to know whether or not it can
> really be optimised that much.
> 
> Can someone/has someone done a comparison with the speed of the RiscOS
> FPE? If so can I have the result/s please so I can work out whether
> it is really worth bothering or not.

  Well, I had a go at this a while back, but I'm not sure exactly how the 
time gets accounted for when running under RiscBSD.

  Basically I ran the Flops20 program using the -DUNIX timing option, but 
this only counts up the user time, so I added the system time too.  This 
gives a performance of between half and two-thirds that of RiscOS (see the 
MFLOPS figures below).  However, I did notice using top that while this 
program is running the CPU seems to spend half of its time processing 
interrupts!
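
  For anyone wanting to reproduce the "user plus system" timing, here is a 
minimal sketch of the idea.  It assumes a getrusage()-based timer; the 
names tv_seconds() and cpu_seconds() are made up for illustration and are 
not taken from the Flops20 source.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Convert a struct timeval to seconds (hypothetical helper). */
static double tv_seconds(const struct timeval *tv)
{
    return (double)tv->tv_sec + (double)tv->tv_usec / 1e6;
}

/*
 * Return the CPU time used so far by this process, in seconds,
 * counting both user time (ru_utime) and system time (ru_stime).
 * Returns -1.0 if getrusage() fails.
 */
double cpu_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;

    return tv_seconds(&ru.ru_utime) + tv_seconds(&ru.ru_stime);
}
```

  Call cpu_seconds() before and after the timed loop and subtract.  With a 
software FPE, time spent emulating each instruction in the kernel may well 
be charged as system time rather than user time, which is why the sum is 
used here; that is a guess about the accounting, not something I have 
verified.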

  Anyway, if anyone knows what's happening it's you! So here's the output...

  Ale.



RiscOS: Cv4

   FLOPS C Program (double Precision), V2.0 18 Dec 1992

   Module     Error        RunTime      MFLOPS
                            (usec)
     1     -2.6610e-12    130.0049      0.1077
     2     -8.8818e-16     68.9697      0.1015
     3     -1.7022e-10    175.7812      0.0967
     4     -2.9477e-10    150.7568      0.0995
     5      1.0213e-09    330.8105      0.0877
     6     -1.9149e-10    316.1621      0.0917
     7      3.4220e-08    128.7842      0.0932
     8     -5.5319e-10    308.8379      0.0971

   Iterations      =      16384
   NullTime (usec) =     0.0000
   MFLOPS(1)       =     0.0999
   MFLOPS(2)       =     0.0939
   MFLOPS(3)       =     0.0947
   MFLOPS(4)       =     0.0956

RiscBSD: gcc -O3 -fomit-frame-pointer

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992

   Module     Error        RunTime      MFLOPS
                            (usec)
     1      5.8975e-13    217.6802      0.0643
     2     -9.3259e-15    135.5936      0.0516
     3     -2.9330e-12    282.7322      0.0601
     4     -4.9855e-12    238.2559      0.0630
     5      1.7508e-11    464.3744      0.0624
     6     -3.2840e-12    444.7869      0.0652
     7      5.5411e-10    181.9911      0.0659
     8     -9.4858e-12    465.0995      0.0645

   Iterations      =     125000
   NullTime (usec) =     0.1487
   MFLOPS(1)       =     0.0541
   MFLOPS(2)       =     0.0640
   MFLOPS(3)       =     0.0636
   MFLOPS(4)       =     0.0636
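
  To put a number on the gap, the four composite ratings above can be 
compared directly.  This is only a small sketch: the arrays are copied by 
hand from the two runs in this message, and mflops_ratio() is a made-up 
helper name.

```c
#include <stddef.h>

/* MFLOPS(1)..MFLOPS(4) copied from the RiscOS (Cv4) run above. */
static const double riscos_mflops[4]  = { 0.0999, 0.0939, 0.0947, 0.0956 };

/* MFLOPS(1)..MFLOPS(4) copied from the RiscBSD (gcc -O3) run above. */
static const double riscbsd_mflops[4] = { 0.0541, 0.0640, 0.0636, 0.0636 };

/*
 * Return the RiscBSD rating as a fraction of the RiscOS rating for
 * composite i (0-based, so i = 0 is MFLOPS(1)).  The caller must
 * keep i in the range 0..3.
 */
double mflops_ratio(size_t i)
{
    return riscbsd_mflops[i] / riscos_mflops[i];
}
```

  MFLOPS(1) comes out at a bit over half the RiscOS figure and the other 
three at roughly two-thirds.  Note that the two runs used different 
iteration counts and different compilers, so the ratios say as much about 
the setup as about the FPE itself.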