Subject: Re: Preliminary test of i386 kernel compiling with GCC 4.0
To: Simon Burge <simonb@wasabisystems.com>
From: Vincent <10.50@free.fr>
List: tech-kern
Date: 10/08/2004 13:39:44
Hi,
> Interesting that you saw 4.0 being slower than 3.3.3. Here's some
> numbers for gcc floating-point performance, using the time for glucas (a
> FFT-based prime number testing program) self tests in order from fastest
> to slowest:
>
> icc 8.0: 1511.657u 0.309s 25:26.36
> gcc 3.4: 1779.724u 0.239s 29:57.07
> gcc 4.0exp: 1845.911u 0.249s 31:03.67
> gcc 3.3.3: 2192.964u 0.249s 36:54.81
> gcc 3.5exp: 2259.040u 0.239s 38:01.40
If I understand your figures correctly, 4.0 is, for the time being,
less efficient than 3.4.2, despite the new optimizer.
Well, I suppose the new optimizer has little effect on FPU-related
code, whereas ICC already has a good scheduler for it; I also suspect
ICC uses both the 387 and the SSE unit in parallel, something I
daren't try with GCC yet. And then ICC, if I remember correctly, does
multiple-file optimization, something GCC does not.
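For what it's worth, here is roughly what I would try first (untested
on my side; the source and output names are only placeholders). GCC's
-mfpmath switch selects the scalar FP unit, and its experimental
"sse,387" setting asks for both at once; ICC's -ipo is, as far as I
know, its multiple-file optimization:

  # scalar FP through SSE only (needs an SSE2-capable -march)
  gcc -O2 -march=pentium4 -mfpmath=sse -c fft.c
  # experimental: schedule over both the 387 and the SSE unit
  gcc -O2 -march=pentium4 -mfpmath=sse,387 -c fft.c
  # ICC's inter-file optimization, for comparison
  icc -O3 -ipo fft.c main.c -o glucas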
Also, I suspect bladeenc loses time in its interfaces with the
kernel. I vaguely suspected that the lost time came from spurious
waits during disk accesses, but I can't be sure of that. I see no
other reason why the same program would become twice as slow.
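One way to check that guess (only a sketch; the encoder invocation is
invented) would be to compare user, system and wall-clock time, and
to trace the syscalls with ktrace(1):

  # if real greatly exceeds user + sys, the process is waiting
  time ./bladeenc track.wav track.mp3
  # record the syscalls, then look for stalls around read/write
  ktrace -i ./bladeenc track.wav track.mp3
  kdump | less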
That's why I wanted to compile the kernel, it being, _IMHO_, the best
integer-based code available. But the kernel won't run, even if I use
NOGCCERROR and fall back to gcc 3.4.2 where 4.0.0 fails. As I said,
it apparently fails in the driver attach code or thereabouts.
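For completeness, this is how I disable -Werror for the build; the
variable only needs to be defined, and /etc/mk.conf is the usual
place for it:

  # in /etc/mk.conf, or on the make command line
  NOGCCERROR=yes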