Subject: gcc bug rendering Postgresql unusable on Alpha
To: None <port-alpha@netbsd.org>
From: Caffeinate The World <mochaexpress@yahoo.com>
List: port-alpha
Date: 03/28/2003 15:01:51
> On Thu, 27 Mar 2003, Tom wrote:
>
>> > On Thu, Mar 27, 2003 at 11:42:35AM -0600, Tom wrote:
>> >> Postgresql people think it could be due to NetBSD's GCC or 64-bit
>> >> problems. Also this is a concern for NetBSD / Alpha because 17 out
>> >> of 89 regression tests failed on the Postgresql package. Most of
>> >> those failures caused postmaster backend crashes.
>> >
>> > Ok, I forwarded your mails to a developer with an alpha machine, I
>> > hope he has time to take a look at it.
>>
>> Please also let him know that the Postgresql package (7.3.2) fails
>> 17 of 89 regression tests, but 7.4-snapshot (not a package)
>> eliminated all but one of those issues.
>>
>> Thomas
>>
>
> Given that 7.4 doesn't have most of those bugs, there's a good chance
> it was a postgresql bug.  One thing you might try is compiling without
> optimization.  Although I haven't come across that problem on alpha,
> I've seen -O2 produce non-working code on m68k and sparc before
> (although quite rarely).
>
> I've found that many 64 bit bugs can be found by investigating all
> compiler warnings.  When you start seeing things like 'cast from
> pointer to integer of different size', that often indicates bugs.
> Also, warnings like 'type mismatch with implicit declaration of foo',
> where foo is something like strcmp, memset, etc., usually mean they
> forgot to include string.h.
>
> Another favorite stupid programming trick is:
>
> #ifdef __osf1__
> /* do alpha specific stuff here */
> #endif
>
> or
>
> #ifndef __i386__
> /* we _must_ be on a big endian machine */
> #endif
>
> I'm pretty much way overextended right now so I probably won't get a
> chance to look into this much.  On the off chance I do get a few
> minutes, do I need to do anything special to build the 7.4 snapshot?
>
> -Dan
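
To make Dan's two warnings concrete, here is a minimal sketch of what
goes wrong when a pointer-sized value is squeezed into a 32-bit integer,
plus a run-time byte order probe that avoids guessing endianness from a
CPU macro.  All function names and the sample address are made up for
illustration; none of this is taken from the Postgresql source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Simulates storing a 64-bit address in a 32-bit integer, which is what
 * code written for 32-bit machines does when it casts a pointer to int.
 * The upper 32 bits are silently dropped.
 */
static uint64_t through_32bit_int(uint64_t address)
{
    uint32_t narrowed = (uint32_t)address;
    return (uint64_t)narrowed;
}

/*
 * Safer alternative: uintptr_t is defined to hold a pointer value,
 * so the round trip gives back the original pointer.
 */
static int survives_uintptr_roundtrip(const void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (const void *)bits == p;
}

/*
 * Detects byte order at run time instead of inferring it from a CPU
 * macro such as __i386__.  Returns 1 on little endian hosts, 0 on big
 * endian hosts.
 */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Self-check; sample_address is a hypothetical value above 4 GB. */
static void check_examples(void)
{
    const uint64_t sample_address = UINT64_C(0x0000000120001000);
    int local = 0;

    assert(through_32bit_int(sample_address) == UINT64_C(0x20001000));
    assert(survives_uintptr_roundtrip(&local));
    assert(host_is_little_endian() == 0 || host_is_little_endian() == 1);
}
```

On a 32-bit machine the truncation is harmless because addresses fit in
32 bits, which is why this class of bug tends to show up only after a
port to a 64-bit platform.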

Tom Lane from Postgresql had access to my Alpha box to check out some
issues with running Postgresql on NetBSD / Alpha.  Here is what he
found:

---
>   ERROR:  datumGetSize: Invalid typLen 0

AFAICT, this is nothing more nor less than a compiler bug: an int16
variable in get_var_maximum is being passed to datumCopy, which
declares its argument as type int.  Inside get_var_maximum, gdb shows
the int16 variable as having value 64, which is correct (the variable
in question is of type NAME, so that is the right length for it).  But
datumCopy is receiving a value of zero.  It appears there's an error in
gcc that makes it do the int16->int widening incorrectly in this
particular case.

We could maybe defend against this particular case by inserting an
explicit cast, but it certainly wouldn't be practical to put casts
into the many other places where function arguments are supposed to be
coerced to the right size.  I'd counsel trying to get the compiler
bug fixed.  2.95.3 is kinda old; maybe there is a later release that
fixes the problem...

                        regards, tom lane
---
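
For anyone who wants to poke at their compiler, here is a rough sketch
of the pattern Tom describes: an int16 value passed to a function whose
parameter is declared int, once relying on the implicit widening and
once with the explicit cast he mentions as a possible defence.  The
names receive_length, pass_implicit and pass_with_cast are hypothetical
stand-ins, not the real get_var_maximum or datumCopy, and a test this
small may well not trigger the bug (the optimizer may simply inline
everything).  With a correct compiler both paths must agree.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 16-bit signed typedef such as Postgresql's int16. */
typedef int16_t int16;

/* Hypothetical callee: declares its length argument as plain int. */
static int receive_length(int typLen)
{
    return typLen;
}

/* Relies on the compiler to widen int16 to int at the call site. */
static int pass_implicit(int16 len)
{
    return receive_length(len);
}

/* Same call with the widening spelled out as an explicit cast. */
static int pass_with_cast(int16 len)
{
    return receive_length((int)len);
}

/* A correct compiler preserves the value and the sign in both cases. */
static void check_widening(void)
{
    assert(pass_implicit(64) == 64);
    assert(pass_with_cast(64) == 64);
    assert(pass_implicit(-1) == -1);
    assert(pass_with_cast(-1) == -1);
}
```

If the two functions ever disagree, that points at the compiler rather
than at the calling code, since the C language requires the argument to
be converted to the declared parameter type when a prototype is in
scope.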

Can this be fixed in the gcc version we're using?

Thomas
