tech-toolchain: Re: gcc3 "millicode" problems with sh3 (toolchain/22452)

Subject: Re: gcc3 "millicode" problems with sh3 (toolchain/22452)
To: None <tech-toolchain@netbsd.org, port-sh3@netbsd.org>
From: Valeriy E. Ushakov <uwe@ptc.spbu.ru>
List: tech-toolchain
Date: 09/03/2003 16:45:52

On Wed, Sep 03, 2003 at 13:34:04 +0100, David Laight wrote:

> > To solve this problem we need a static library with PIC code for all
> > such functions, marked as ".hidden", we then need to link all shared
> > libraries against this millicode library so that each shared
> > library gets its own private __udivsi3 &c that it can call directly.
> > We also need a copy of __udivsi3 &c in libgcc.a.
> 
> Why does that help?  Or is it some special feature of .hidden?

To quote as.info on ".hidden":

   This directive overrides the named symbols default visibility (which
is set by their binding: local, global or weak).  The directive sets
the visibility to `hidden' which means that the symbols are not visible
to other components.  Such symbols are always considered to be
`protected' as well.

And for ".protected":

   This directive overrides the named symbols default visibility (which
is set by their binding: local, global or weak).  The directive sets
the visibility to `protected' which means that any references to the
symbols from within the components that defines them must be resolved
to the definition in that component, even if a definition in another
component would normally preempt this.

So all calls to __udivsi3 from within a shared library will go to its
own .hidden copy of __udivsi3, and such calls will satisfy the
assumptions gcc makes (i.e. will not clobber any additional
registers).
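
As a rough illustration only (this is not the actual fix, and the real
millicode routines are hand-written assembly marked with the .hidden
directive), the same visibility can be requested from C with gcc's
visibility attribute.  The names my_udiv and lib_average below are
made up for the example; my_udiv merely stands in for something like
__udivsi3.

```c
/* Hypothetical stand-in for a millicode routine.  Hidden visibility
 * means the symbol is not exported from the shared object, so calls
 * from inside the library bind to this copy and need no PLT entry. */
__attribute__((visibility("hidden")))
unsigned int my_udiv(unsigned int n, unsigned int d)
{
    unsigned int q = 0;

    if (d == 0)             /* avoid looping forever on a zero divisor */
        return 0;
    while (n >= d) {        /* naive repeated subtraction, for brevity */
        n -= d;
        q++;
    }
    return q;
}

/* Exported (default visibility) entry point of the hypothetical
 * library; its call to my_udiv is resolved within the library. */
unsigned int lib_average(unsigned int sum, unsigned int count)
{
    return my_udiv(sum, count);
}
```

If this is built with something like "cc -fPIC -shared", my_udiv
should not show up in the dynamic symbol table of the result, while
lib_average should.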

> Usually references to non-static functions in shared libraries are
> always called via PLT - because the version (if any) in the user
> program should be used in preference to the one in the library.

Sure, but __udivsi3 is not a normal library function.  It's not even a
function, really, as it doesn't follow the normal function calling
conventions (which is the crux of the problem).

> Actually, it ought (surely) be possible to write PLT code that can
> jump to the correct function without requiring any extra registers?
> (or is the sh object code particularly brain-dead?)

The call can only clobber r4, but r4 contains the first argument at
the time of the call, so we are left with r0 only (which will hold
the return value upon return) - and that, as far as I understand, is
not enough.

SY, Uwe
-- 
uwe@ptc.spbu.ru                         |       Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/            |       Ist zu Grunde gehen