Subject: 'prebind' implementation (was Re: HEADS UP: migration to fully dynamic linked "base" system)
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Bang Jun-Young <junyoung@mogua.com>
List: current-users
Date: 08/28/2002 09:56:26
On Mon, Aug 26, 2002 at 11:37:42AM -0400, Thor Lancelot Simon wrote:
> On Mon, Aug 26, 2002 at 04:13:25PM +0100, David Laight wrote:
> > On Mon, Aug 26, 2002 at 04:20:39PM +1000, Luke Mewburn wrote:
> > > 
> > > I will be switching NetBSD-current to have dynamically linked programs
> > > in /bin and /sbin in the next day or so.
> > 
> > The only problem I see is that a lot of scripts will start running
> > significantly slower if the small 'workhorse' programs (ln, rm, chmod,
> > test, mkdir, expr etc) become dynamic.
> > 
> > Some tests I've just done indicate that the time to exec a dynamically
> > linked program is about 7 times that of a static one (on i386).
> > 
> > (the 3.75% improvement I've made is insignificant...)
> 
> This *is* a big deal.  It was one of the major reasons SGI faced a mass
> exodus of customers in the early Irix 6 era: their systems took far longer
> to start up than they ever had before, and sundry trivial shell-script
> tasks (their nice GUI does tend to call a certain number of scripts) 
> similarly got much slower, leading to a general perception that their OS
> was becoming horribly bloated and slow when in fact many kernel operations
> were faster.
> 
> The cause?  You guessed it: dynamically linking everything.  Eventually
> they kind-of addressed this with a very good prebinding implementation,
> but I don't see that on the short-term horizon for NetBSD.
> 
> A certain number of 'workhorse' commands, as David suggests, should remain
> static so that we don't destroy shell script performance.

I'm thinking about implementing a 'prebind' utility. The rough idea is:

 - There is a new section in ELF called '.pplt' which stands for
   'prebound plt'.
 - '.pplt' is generated and inserted into binaries by 'prebind'.
 - '.pplt' holds already-resolved symbol references, so there's no need
   to invoke the dynamic linker, ld.elf_so(1), to resolve them at run
   time as the current dynamic linking mechanism does. I expect we can
   save a lot of time here.
 - ld.elf_so(1) should be modified accordingly. When a dynamically
   linked binary is executed, ld.elf_so(1) first checks whether the
   '.pplt' in the binary is valid and applicable, and if so uses it
   instead of '.plt'. If not, it falls back to using '.plt' (no
   performance gain in this case).
 - 'prebind' is mostly based on code from ld.elf_so(1).
 - The 'prebind' step would usually run just before or after
   dynamically linked binaries are installed in the system (as part of
   'make build').
 - One limitation is that '.pplt' must be regenerated every time the
   shared objects the binary depends on change. Say there was
   libc.so.12.86 in /usr/lib, and 'prebind' resolved symbol references
   between /usr/bin/whoami and libc.so.12 and inserted '.pplt' into
   /usr/bin/whoami. One day you install a new libc.so.12.87 in
   /usr/lib without updating the rest of userland. The libc.so.12
   symlink now points to libc.so.12.87, and the '.pplt' in
   /usr/bin/whoami is no longer valid. The binary still runs safely,
   but you get no performance gain. You have to run 'prebind' again on
   everything in userland that depends on libc.so.12.
 - Some kind of checksum mechanism will be needed to identify shared
   objects. I'm not sure whether ELF has such information as part of
   its specification (I haven't found any yet).
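
To make the validity check concrete, here is a minimal sketch in C of how
the loader might decide between '.pplt' and '.plt'. Everything in it is
an assumption for illustration only: the struct layouts, the names
(pplt_dep, loaded_lib, choose_bind_mode) and the use of FNV-1a over the
whole library image as the fingerprint. None of this is existing
ld.elf_so code or part of the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record kept in '.pplt' for each shared object the
 * binary was prebound against. */
struct pplt_dep {
    const char *soname;    /* e.g. "libc.so.12" */
    uint32_t    checksum;  /* fingerprint taken at prebind time */
};

/* Hypothetical view of a library as it is installed right now. */
struct loaded_lib {
    const char          *soname;
    const unsigned char *image;
    size_t               image_len;
};

enum bind_mode { BIND_LAZY_PLT, BIND_PREBOUND_PPLT };

/* FNV-1a, used only as a stand-in fingerprint for this sketch. */
uint32_t pplt_fingerprint(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Use '.pplt' only if every recorded dependency is present and its
 * fingerprint is unchanged; otherwise fall back to the normal '.plt'. */
enum bind_mode choose_bind_mode(const struct pplt_dep *deps, size_t ndeps,
                                const struct loaded_lib *libs, size_t nlibs)
{
    assert(libs != NULL || nlibs == 0);

    if (deps == NULL || ndeps == 0)
        return BIND_LAZY_PLT;          /* binary was never prebound */

    for (size_t i = 0; i < ndeps; i++) {
        const struct loaded_lib *match = NULL;

        for (size_t j = 0; j < nlibs; j++) {
            if (strcmp(deps[i].soname, libs[j].soname) == 0) {
                match = &libs[j];
                break;
            }
        }
        if (match == NULL)
            return BIND_LAZY_PLT;      /* dependency not found */
        if (pplt_fingerprint(match->image, match->image_len)
            != deps[i].checksum)
            return BIND_LAZY_PLT;      /* library changed since prebind */
    }
    return BIND_PREBOUND_PPLT;
}
```

In the libc.so.12.86 to libc.so.12.87 example, the fingerprint recorded
in /usr/bin/whoami would no longer match the installed library, so
choose_bind_mode() would return BIND_LAZY_PLT until 'prebind' is run
again. Hashing the whole library at every exec would of course eat the
time we are trying to save, so a real implementation would need
something cheaper to compare.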

Any comments would be welcome and appreciated,

Jun-Young

-- 
Bang Jun-Young <junyoung@mogua.com>