Subject: re: Mozilla on MacPPC, and add on PRs 8669, 10455, and 10908,
To: Martin Husemann <martin@duskware.de>
From: M L Riechers <mlr@rse.com>
List: tech-toolchain
Date: 09/23/2000 17:43:20
Hi, my name is Mike, and I am an idiot.

I, too, tried to make mozilla on MacPPC, for:

A) uname -m
macppc
A) uname -n
t982.rse.com
A) uname -p
powerpc
A) uname -r
1.5_ALPHA2
A) uname -v
NetBSD 1.5_ALPHA2 (EASTERN-1.5_ALPHA2) #0: Mon Aug 28 21:26:18 EDT 2000
mlr@t982.rse.com:/usr/local/NetBSD/current/src/sys/arch/macppc/compile/EASTERN-1.5_ALPHA2

and,

.
.
.
===> Installing for mozilla-5.0m17
===> Becoming root@t982.rse.com to install mozilla-5.0m17.
/usr/bin/su /usr/bin/sed -e "s,@PREFIX@,/usr/X11R6,g" /usr/local/src/pkgsrc/www/mozilla/files/mozilla-ELF.in > /usr/X11R6/bin/mozilla
/bin/chmod 755 /usr/X11R6/bin/mozilla
install -d -o root -g wheel -m 555 /usr/X11R6/lib/mozilla
( cd /usr/local/src/pkgsrc/www/mozilla/work/mozilla/dist/bin;  tar chf - . | (cd /usr/X11R6/lib/mozilla; tar xpf -) ;  cd /usr/X11R6/lib/mozilla;  /usr/bin/env MOZILLA_FIVE_HOME=/usr/X11R6/lib/mozilla ./regxpcom )
/usr/X11R6/lib/mozilla/libxpcom.so: Unsupported relocation type 10in non-PLT relocations

*** Error code 1
.
.
.

> Is this a borked -rpath/-Wl,Wr thing or a -pic vs. -PIC thing?
> Or something completely different?

Yes, sort of.

See PR#8669, "pkg/8669: qt programs using pkglibtool don't work
(linker/loader bug?)", also current-users from about 23 to 28 October
1999. David Barr did a rather nice test sequence (involving the qt
package) showing where the problem lay:  (He was working on sparc.)

> /usr/X11R6/lib/libqt.so.1: Undefined symbol "" (reloc type = 12, symnum = 4)

which is appropriate for sparc machines.

This was followed up by Paul Kranenburg (pk@cs.few.eur.nl) with:

> This error seems to pop up with every shared library built
> from C++ sources by using the `libtool' to construct the link
> command to build the library from (PIC) object files.  The generated
> link command includes `-lgcc' -- which is a static (non-PIC) library
> -- that should preferably not be mixed in with the PIC object files
> of the shared library being built.

David Barr then offered a fix:

>  I can work around the problem with this diff to pkglibtool
> 
> 
>  It basically says to use 'ld' instead of 'cc' to link, since
>  cc silently adds -lgcc which breaks C++ shared libs.
> 
>  (patch followed...)

I applied the patch, more or less, to my pkglibtool, and was then
able to compile and _use_ the qt package and kde, and a host of other
stuff.

  The problem seems to be that the relocations named in "Undefined
symbol "" (reloc type = 12, symnum = 4)" or, in our case, "somelibx.so:
Unsupported relocation type 10in non-PLT relocations" are valid ELF
relocation types, but only for "static" linkage: i.e. they have to be
resolved in the link step, and disappear from whatever symbol tables,
before the module they're incorporated in can be used as a run-time
loadable.  In any case, check out the source to ld.elf_so in
src/libexec/ld.elf_so/reloc.c.  If the relocation type were to be
dealt with dynamically, it would have to be there.

One solution, I suppose, would be to make ld.elf_so recognize type 10
relocations.  (FWIW, on PowerPC a type 10 relocation is R_PPC_REL24, a
24-bit PC-relative branch, as I remember.  Not the kind of thing you'd
be pleased to relocate dynamically.)  But I haven't delved into this,
or ELF, enough to know if this is unwise.  I highly suspect it is.
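
For reference, a minimal sketch of how a relocation type such as the
"10" in the error message is pulled out of an ELF32 relocation entry.
The r_info macros and the PowerPC type numbers follow the ELF and
PowerPC ABI definitions; the helper name ppc_reloc_name is hypothetical
and is not taken from ld.elf_so's reloc.c.

```c
#include <stdint.h>

/* Standard ELF32 r_info packing: symbol index in the upper 24 bits,
 * relocation type in the low 8 bits.  Spelled out here so the sketch
 * does not depend on any particular <elf.h>. */
#define ELF32_R_SYM(info)        ((uint32_t)(info) >> 8)
#define ELF32_R_TYPE(info)       ((uint32_t)(info) & 0xff)
#define ELF32_R_INFO(sym, type)  (((uint32_t)(sym) << 8) + ((uint32_t)(type) & 0xff))

/* A few PowerPC relocation numbers from the PowerPC ELF ABI. */
#define R_PPC_ADDR32    1   /* 32-bit absolute address */
#define R_PPC_REL24     10  /* 24-bit PC-relative branch */
#define R_PPC_GLOB_DAT  20  /* GOT entry */
#define R_PPC_JMP_SLOT  21  /* PLT entry */
#define R_PPC_RELATIVE  22  /* load-address adjustment */

/* Hypothetical helper: name the relocation type held in r_info. */
static const char *ppc_reloc_name(uint32_t r_info)
{
    switch (ELF32_R_TYPE(r_info)) {
    case R_PPC_ADDR32:   return "R_PPC_ADDR32";
    case R_PPC_REL24:    return "R_PPC_REL24";
    case R_PPC_GLOB_DAT: return "R_PPC_GLOB_DAT";
    case R_PPC_JMP_SLOT: return "R_PPC_JMP_SLOT";
    case R_PPC_RELATIVE: return "R_PPC_RELATIVE";
    default:             return "unknown";
    }
}
```

Called on an r_info whose low byte is 10, ppc_reloc_name() returns
"R_PPC_REL24", the branch relocation one would normally expect the
static linker to have resolved already.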

Before I made my patches, I traced down ld.elf_so, and found that,
indeed, the libraries built with the g++ front end abounded with type
10 relocations.

Interestingly, non-C++ libraries built from gcc don't seem to have
this problem.  Don't know for sure, though.

I believe PRs 10455 and 10908 may relate to this problem.  arm32 seems
to have intermittent difficulties, while port-alpha users are
complaining of intermittent failures using pine: "port-alpha/10455:
Occasional "Undefined PLT symbol" messages from pine".

Also, there was a rash of PLT-related core dumps discussed on
current-users some time ago.  My guess is that they are related as
well.

On Wed, 13 Sep 2000 23:02:21 +0100, Nick Hudson
<nick@nthcliff.demon.co.uk> wrote to tech-toolchain on the subject
"Shared libraries and libgcc", very pertinent to this discussion, that
I'll just quote verbatim:

> I'm trying to fathom out if shared libraries need libgcc or not. I've
> collected the following information:

> i)      libgcc isn't available in pic/PIC format
> ii)     bsd.lib.mk doesn't include libgcc when linking
> iii)    gcc.info has the comment
> 
>         `LIBGCC_SPEC'
>              Another C string constant that tells the GNU CC driver program how
>              and when to place a reference to `libgcc.a' into the linker
>              command line.  This constant is placed both before and after the
>              value of `LIB_SPEC'.
> 
>              If this macro is not defined, the GNU CC driver provides a default
>              that passes the string `-lgcc' to the linker unless the `-shared'
>              option is specified.
> 
> iv) gcc -v -shared file.o produced the following
> 
>         $ gcc -v -shared file.o
>         Using builtin specs.
>         gcc version egcs-2.91.66 19990314 (egcs-1.1.2 release)
>          /usr/libexec/collect2 -m elf_i386 -shared /usr/lib/crtbeginS.o
> -L/usr/libexec file.o -lgcc -lc -lgcc /usr/lib/crtendS.o
> 
> v)      http://gcc.gnu.org/gcc-3.0/libgcc.html suggests that shared libraries
> need libgcc.
> 
> Basically I'm confused - can anyone help?
> 
> Thanks,
> Nick

He (or maybe it is just I) didn't get an answer.  I'm particularly
concerned about the URL http://gcc.gnu.org/gcc-3.0/libgcc.html he
refers to, because it details "plans regarding libgcc for the GCC 3.0
release", apparently recommending -lgcc be always loaded.

Now, (bearing in mind that I Am Not an Expert), at first blush it
would seem to me that -lgcc should _not_ be included in shared
libraries, but always be included in programs (a la crt++...).  If
-lgcc is needed, it would be statically linked to the (running)
program (Oh dear, terminology becomes so elusive when you're dealing
with so many "dynamics") so that if a shared library "needs" -lgcc
services, it can get them from the running program.  It would seem
that that is what the phrase "...passes the string `-lgcc' to the
linker unless the `-shared' option is specified...."  is trying to get
at.  (But, why -lgcc is indeed passed to the linker _even_when_
"-shared" is specified escapes me).  Alternatively, I suppose, one
might argue that _no_ running program or shared library should
_incorporate_ -lgcc, but that -lgcc should be its own shared library
(and come to think of it, I guess that's exactly what one approach
does argue).
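
To make "needs -lgcc services" concrete, here is a small sketch of the
kind of code that ends up depending on libgcc.  It assumes a 32-bit
target such as i386 or 32-bit PowerPC, where GCC typically emits a call
to a libgcc helper (__divdi3 for signed 64-bit division) instead of
expanding the division inline.  The function name scale_down is
hypothetical.

```c
#include <stdint.h>

/* Plain C with no explicit library calls.  On a 32-bit target the
 * 64-bit signed division below is normally compiled into a call to
 * libgcc's __divdi3, so an object built from this file carries an
 * undefined reference that something has to satisfy at link or load
 * time. */
int64_t scale_down(int64_t total, int64_t parts)
{
    return total / parts;
}
```

If this were compiled with -fPIC -c on such a target, running nm on the
object would be expected to show __divdi3 as undefined; on a 64-bit
target the division is done inline and no helper is needed.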

Well, the whole "Unsupported relocation type (whatever) in non-PLT (or
any other type or source) relocations" thing seems to be a major
problem.  My guess is that it has cropped up on at least macppc,
arm32, i386, and alpha machines (and probably any other NetBSD ELF
C++ builds using shared libraries).  Why isn't the problem the dead
show-stopper on other architectures that it is on macppc?  I might
hazard a couple of guesses: our biggest port, i386, is not totally
ELF; and the problem may depend on library order.

I know that the tech-toolchain people are working very hard on, if not
this problem, many important problems, but I wonder if someone who is
knowledgeable in this area might comment, or, perhaps have a
work-around?


But back to mozilla.  As I said, Hi, I am an idiot.  An idiot for yet
again attempting to make mozilla.  Many times I've attempted mozilla
-- no joy.  (But at least this time I (we) got an actual error
message, instead of a mysterious crash and dump.  So I guess that's
progress.  Of a sort.)  I briefly looked into how to go about properly
linking the moz modules, and concluded that the easiest thing would be
just to cut and paste 40 or 50 or 100 (whatever, I forget) link steps
from using g++ to ld, (with proper add or delete -W thingies,) and
hand-link them.  I couldn't find an easy change to a Makefile, for
instance, to make it all better.  Maybe someone else who knows a
little more about autoconf, config, etc. can.  So for now, I'm
stuck. So close.  :-(

> After fixing a stupid mistake in the source...

Does this have something to do with backing out the powerpc hack in
nsprpub/pr/src/io/prprf.c, like:

*** nsprpub/pr/src/io/prprf.c.orig      Thu Jul 15 20:30:32 1999
--- nsprpub/pr/src/io/prprf.c   Sat Sep  9 17:34:05 2000
***************
*** 31,46 ****
  #include "prlog.h"
  #include "prmem.h"
  
  /*
  ** Note: on some platforms va_list is defined as an array,
  ** and requires array notation.
  */
! #if (defined(LINUX) && defined(__powerpc__)) || defined(WIN16) || \
!     defined(QNX) || (defined(__NetBSD__) && defined(__powerpc__))
  #define VARARGS_ASSIGN(foo, bar) foo[0] = bar[0]
  #else
  #define VARARGS_ASSIGN(foo, bar) (foo) = (bar)
  #endif
  
  /*
  ** WARNING: This code may *NOT* call PR_LOG (because PR_LOG calls it)
--- 31,45 ----
  #include "prlog.h"
  #include "prmem.h"
  
  /*
  ** Note: on some platforms va_list is defined as an array,
  ** and requires array notation.
  */
! #if (defined(LINUX) && defined(__powerpc__)) || defined(WIN16)
  #define VARARGS_ASSIGN(foo, bar) foo[0] = bar[0]
  #else
  #define VARARGS_ASSIGN(foo, bar) (foo) = (bar)
  #endif
  
  /*
  ** WARNING: This code may *NOT* call PR_LOG (because PR_LOG calls it)

I was wondering about that,  and what effect (the hack, or backing it
out, either one) would have on moz.
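
For context, the VARARGS_ASSIGN macro exists because, as the comment in
prprf.c says, va_list is an array type on some platforms, so a plain
assignment does not compile there.  A minimal sketch of the portable
alternative, assuming a compiler that provides C99's va_copy (GCC of
this era offers __va_copy, as far as I know).  The function names are
hypothetical and this is not NSPR code.

```c
#include <stdarg.h>

/* Hypothetical helper: sum `count` ints from a private copy of ap, so
 * the caller's va_list is left untouched. */
static int sum_ints(int count, va_list ap)
{
    va_list work;
    int i, total = 0;

    va_copy(work, ap);      /* valid for array and scalar va_list alike */
    for (i = 0; i < count; i++)
        total += va_arg(work, int);
    va_end(work);
    return total;
}

/* Walk the same argument list twice; this only works because
 * sum_ints() copies the list instead of consuming it. */
int sum_twice(int count, ...)
{
    va_list ap;
    int first, second;

    va_start(ap, count);
    first = sum_ints(count, ap);
    second = sum_ints(count, ap);
    va_end(ap);
    return first + second;
}
```

With va_copy there is no need for a per-platform #if list like the one
in the diff, which is where the NetBSD/powerpc question arises.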

BTW, I believe PR #10692 has nothing to do with the PLT type 10
problem, and should be closed:

> >Description:
> I try compile KDE enviroment under macppc (PowerMac 8500/120 2GB
> HDD 64MB RAM).  It was unsuccessfull, compilation abort at compiling
> KAB API (addressbook API).

Compiling the "addressbook API" always requires more than 32M:

> g++ -DHAVE_CONFIG_H -I. -I. -I.. -I../kdecore -I../kdeui -I../kfile -I../kfmlib -I/usr/X11R6/include/qt -I/usr/X11R6/include -O -I/usr/local/include -I/usr/local/include -c -fPIC -DPIC addressbook.cc
> addressbook.cc: In method `bool AddressBook::nameOfField(const class string &, class string &)':
> addressbook.cc:1238: virtual memory exhausted
> gmake[2]: *** [addressbook.lo] Error 1

A) ulimit -d 65536

seems to work OK.  And, incidentally, this very problem was discussed
on (I believe) port-macppc. I wonder if powerpc specifically, and RISC
architectures in general, ought to have higher default memory limits?
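
For anyone who would rather raise the limit from a wrapper program than
from the shell, a minimal sketch of the same thing using
getrlimit/setrlimit.  It assumes ulimit -d counts in kilobytes, as is
usual, so 65536 corresponds to 64MB.  The helper name raise_data_limit
is hypothetical.

```c
#include <sys/types.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft data-segment limit to at least
 * `want` bytes, capped at the hard limit.  Returns 0 on success and
 * -1 on error. */
int raise_data_limit(rlim_t want)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= want)
        return 0;                   /* already large enough */
    if (rl.rlim_max != RLIM_INFINITY && want > rl.rlim_max)
        want = rl.rlim_max;         /* cannot exceed the hard limit */
    rl.rlim_cur = want;
    return setrlimit(RLIMIT_DATA, &rl);
}
```

Calling raise_data_limit((rlim_t)65536 * 1024) before exec'ing the
compiler would be the rough equivalent of the ulimit line above; child
processes inherit the raised limit.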

-Mike