Subject: Re: PROBLEMS WITH GCC 2.7.2/I386 ON APPLICATIONS
To: None <buhrow@cats.ucsc.edu>
From: Niklas Hallqvist <niklas@appli.se>
List: current-users
Date: 04/11/1996 08:41:21
>>>>> "Brian" == Brian Buhrow <buhrow@cats.ucsc.edu> writes:

Brian> There has been quite a bit of talk about the
Brian> optimization code in gcc 2.7.2 and its guilt in a number of
Brian> problems compiling and using the kernel.

"A number of problems"?  This statement startles me.  Actually, I have
become unsure whether we have seen the problem in the kernel at all.
I seem to recall that pcvt was reported to fail with -O2 and
without -fno-strength-reduce, but I have lost track of where I saw
that.  I may have imagined it.  However, I try to watch closely
when GCC is mentioned, and I can't remember any other problem.  The bug
is well known and has been in GCC for five years or so, but it is very
seldom triggered.  It has to do with mixed-signedness arithmetic and
boundary conditions in strength reduction.  It seems a 2.7.3 will come
out soon to make a stab at fixing this.
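
For readers unfamiliar with the term, here is a minimal sketch of what
strength reduction does to a loop, written out by hand.  It does not
reproduce the GCC bug; the function names and the test data are made up
for illustration.  The first function multiplies the index on every
iteration, the second replaces that multiplication with a running
offset, which is the kind of rewrite the optimizer performs itself.

```c
#include <assert.h>
#include <stddef.h>

/* Naive form: one multiplication (i * stride) per iteration. */
long sum_indexed(const int *a, size_t n, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i * stride];
    return sum;
}

/* Hand-written strength-reduced form: the multiplication is replaced
 * by an offset that is advanced by `stride` on each iteration. */
long sum_reduced(const int *a, size_t n, size_t stride)
{
    long sum = 0;
    size_t off = 0;
    for (size_t i = 0; i < n; i++, off += stride)
        sum += a[off];
    return sum;
}

/* Hypothetical self-check: both forms must agree on the same input. */
void self_check(void)
{
    int a[] = {1, 2, 3, 4, 5, 6, 7, 8};
    assert(sum_indexed(a, 4, 2) == sum_reduced(a, 4, 2));
}
```

The two forms are equivalent only as long as the compiler gets the
range and signedness of the induction variable right, which is where
boundary conditions of the kind described above come in.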

I'd be interested to hear of other specific problems.

Brian> I am having problems,
Brian> which appear to be gcc related, and am wondering if this gcc
Brian> 2.7.2 bug is more serious than we first thought.  I have an
Brian> application which works fine if all optimization is turned off,
Brian> but which gets its function return values screwed up if -O or
Brian> -O and -fstrength-reduce are turned on.

This fact in itself doesn't necessarily mean that GCC is at
fault.

Brian> The problem seems to
Brian> be some code in the cleanup routine which stomps on %ax when a
Brian> function is returning.  Not being a compiler expert, I'm not
Brian> sure what's going on here, but instruction-by-instruction
Brian> traces show that values are getting corrupted when functions
Brian> are returning.  Things aren't consistent, but it appears as if
Brian> things go south whenever the stack crosses a page boundary.  I
Brian> believe I'm also having trouble with this in the libc code as
Brian> well.  Ordinary well-tested functions like getpwent are bombing
Brian> when compiled with -O and working just fine with -O turned
Brian> off.

Given this explanation, I'm very sceptical.  To me it seems more like vm
is the culprit than GCC (based on the page boundary condition).  I'd
be happy to look over a function (preferably small) that you think
miscompiles, especially if you can accompany it with a trace that shows
where the problem occurs.

Brian>  I'm running -current of about February 27, or so, just
Brian> before the bus changes went into the code.  Oh, yes, this is on
Brian> the i386 port, though I don't know if this bug is machine
Brian> specific.  Perhaps it would be useful either to change the
Brian> default compilation flags, or to look at backing up several
Brian> revs of gcc in the distribution.

Ehh, I think this is *much* too loose ground for doing such a thing.

Niklas

Niklas Hallqvist       Phone: +46-(0)31-40 75 00  Home: +46-(0)31-41 93 95
Applitron Datasystem   Fax:   +46-(0)31-83 39 50  Home: +46-(0)31-41 93 96
Molndalsvagen 95       Email: niklas@appli.se     GSM:  +46-(0)70-714 10 35
S-412 63  GOTEBORG     WWW:   Here
Sweden		       IRC:   niklas (#NetBSD)