Subject: Re: RFC: migration to a fully dynamically linked system
To: Todd Vierling <tv@wasabisystems.com>
From: Wolfgang Solfrank <ws@tools.de>
List: tech-userlevel
Date: 01/07/2002 18:11:48

Hi again,

> Now, it would be possible to use "-E" to make "static" binaries that can
> dlopen() modules.  You'd still have to link against something like the
> "libplacebo" I created, but dynamic libc wouldn't be necessary.  (To see
> this in action, try my shar example all three ways with LDFLAGS=-Wl,-E ...)

The need to link against "libplacebo" is the result of ld not
creating the dynamic symbol section for static binaries.  This could
easily be changed.  I'd even go as far as calling it a bug that ld
doesn't create this section when using the "-E" switch, as this switch
explicitly tells ld that I do want a dynamic symbol section (note that
I didn't test this).

> However, if you use the "-E" switch, you must not link against any static
> library which may also be dynamic.  The reason for this is that any internal
> global symbol to the shared object may clash with a dynamic symbol in the
> main binary -- and it's not well documented as to which one will "win".

That's not that different from linking a main program against a
shared library which has an internal global with the same name as a
global in my main program.

> As a straw example, think about internal symbols in the resolver.  The main
> binary may call gethostent(3), whereas the module may call getaddrinfo(3).
> Both functions depend on a "global" _gethtent(), which is only global for
> the sake of being used across two source files in libc.  If the ABI for
> _gethtent() ever changes, this symbol clash will wreak all kinds of havoc to
> the binary doing dlopen().

Hmm, I'm not sure what you are trying to say here.  (As an aside, I had
problems finding gethostent: it is in our sources, but isn't normally
compiled.  And even then, getaddrinfo.c has its own static copy of
_gethtent(), which even differs somewhat from the one in
gethnamaddr.c.)

If you are pointing to the problem that changing the ABI of a global
symbol requires at least relinking any of those "static" binaries,
then yes, I think I agree.

> [*] Even so, if the programs are dynamically linked, what's the difference
>     between going through all these linking headaches and simply linking
>     against a shared libc?  In the former case, you still have to link
>     with a library that may be corrupted (even if just a "libplacebo"), and
>     in the latter case, you gain the ability to fix libc bugs simply by
>     replacing libc.

For the need to link against "libplacebo", see above.

It should be quite easy to write a static dlopen() which would, on its
first call, load ld.elf_so and initialize it before loading the actual
shared object given as argument.  That way, the main binary would be
able to run without the dynamic linker or any shared object.  Only
if/when it accesses some additional functionality would these things
become necessary (and there should be appropriate workarounds in place
if either of them cannot be loaded).
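
To make the idea a bit more concrete, here is a rough sketch of the
control flow I have in mind.  It is not real ld.elf_so or libc code:
lazy_dlopen(), struct lazy_dl and the two callbacks are names I made up,
and the callbacks merely stand in for "map and initialize ld.elf_so" and
"let it load the object".  The stubs at the end exist only so the sketch
can be exercised without a dynamic linker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: none of these names exist in libc or ld.elf_so.
 * load_rtld stands in for "map and initialize ld.elf_so", rtld_open
 * for "ask the initialized linker to load the object". */
typedef int   (*load_rtld_fn)(void);             /* returns 0 on success */
typedef void *(*rtld_open_fn)(const char *path);

struct lazy_dl {
    load_rtld_fn load_rtld;
    rtld_open_fn rtld_open;
    int          state;      /* 0 = not tried, 1 = ready, -1 = unavailable */
};

/* dlopen() for a static binary: bring up the linker on first use only. */
static void *
lazy_dlopen(struct lazy_dl *dl, const char *path)
{
    assert(dl != NULL && path != NULL);
    if (dl->state == 0)
        dl->state = (dl->load_rtld() == 0) ? 1 : -1;
    if (dl->state < 0)
        return NULL;         /* caller falls back to its built-in behaviour */
    return dl->rtld_open(path);
}

/* Stubs so the sketch can be tried without a real dynamic linker. */
static int   load_calls;
static int   fake_module;
static int   load_ok(void)   { load_calls++; return 0; }
static int   load_fail(void) { load_calls++; return -1; }
static void *open_stub(const char *path) { (void)path; return &fake_module; }
```

With a struct lazy_dl initialized as { load_ok, open_stub, 0 }, the
first lazy_dlopen() call runs the loader once and later calls reuse it.
With load_fail instead, every call returns NULL and the program can
fall back to whatever it has compiled in.  Not retrying after a failed
load is just one possible choice here.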

> : With our current system, there is a subtle difference between
> : shared objects against which a program is linked and those
> : which get dlopen()ed.  For the latter ones, you cannot
> : overwrite symbols in the main program used by the shared
> : object.  Actually, you can, if the symbol in question is also
> : defined in a shared object against which the program is linked.
> 
> I'm not sure what you mean by "overwrite".
> 
> If you're saying that referenced symbols in the dlopen()ed object will link
> first against the binary, that's correct.  To reverse this behavior, you use
> -Bsymbolic on the shared object to be dlopen()ed.

They link only against the symbols in the binary which happen to be
in the dynamic symbol section, which currently means only those that
are also in some shared library the binary is linked against.

Other symbols in the binary will not be found by the shared
object, which means that the shared object will reference its own
copy of such symbols.  There is nothing you can do to the shared
object to change this.  As you said above, you'd have to link the
binary with "-E" to make this work.
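
A toy model of the lookup described above, in case it helps.  This is
only an illustration and not how ld.elf_so is implemented; struct sym,
find_sym(), resolve_for_object() and the symbol names in the tables are
all invented, and the "value" field is just a stand-in for an address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only, not ld.elf_so code.  All names are hypothetical. */
struct sym {
    const char *name;
    int         in_dynsym;   /* 1 if present in the dynamic symbol section */
    int         value;       /* stand-in for the symbol's address */
};

/* Return the first symbol called `name` that the dynamic linker can see. */
static const struct sym *
find_sym(const struct sym *tab, size_t n, const char *name)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].in_dynsym && strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Resolve a reference made by a dlopen()ed object: the main binary is
 * searched first, then the object itself. */
static const struct sym *
resolve_for_object(const struct sym *bin, size_t nbin,
                   const struct sym *obj, size_t nobj, const char *name)
{
    const struct sym *s = find_sym(bin, nbin, name);
    return s != NULL ? s : find_sym(obj, nobj, name);
}

/* Binary linked without "-E": only shared_fn, which is also defined in
 * a shared library the binary links against, is in the dynamic section. */
static const struct sym bin_default[] = {
    { "shared_fn",  1, 100 },
    { "private_fn", 0, 101 },
};

/* The same binary as it would look when linked with "-E". */
static const struct sym bin_export_all[] = {
    { "shared_fn",  1, 100 },
    { "private_fn", 1, 101 },
};

/* The dlopen()ed object carries its own copies of both symbols. */
static const struct sym module[] = {
    { "shared_fn",  1, 200 },
    { "private_fn", 1, 201 },
};
```

Resolving "private_fn" for the module against bin_default yields the
module's own copy (201), while against bin_export_all it yields the
binary's copy (101).  "shared_fn" resolves to the binary's copy (100)
in both cases.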

Ciao,
Wolfgang
-- 
ws@TooLs.DE     Wolfgang Solfrank, TooLs GmbH 	+49-228-985800