Subject: Re: shared library support
To: None <kre@munnari.OZ.AU, tech-kern@NetBSD.ORG, jiho@postal.c-zone.net>
From: Wolfgang Solfrank <ws@tools.de>
List: tech-kern
Date: 03/18/1998 14:21:20
Hi,

> I welcome your comments on my own test.

I've had a look at your test program, and it's pretty clear that with this
program multiple copies of the shared-lib-version need quite a few pages
per copy, while multiple copies of the static-lib-version require only one
page per copy (user pages, that is; kernel space needs some additional pages
per copy, but those are (more or less?) the same in both versions).

Your test program goes out of its way to only use direct system calls from
the C library (the only lib that it uses).  The effect of this is that
your test program doesn't use any data pages when linked statically.
It only has code and stack.  Code is shared, so you only need a new stack
area for starting a new copy of your program.
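
To illustrate what I mean by "only code and stack", here is a minimal
sketch.  It is not your test program; emit() is a hypothetical name I made
up for this example.  It assumes a POSIX system and uses nothing but the
write(2) system call stub, with no global variables and no stdio buffers:

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send len bytes of msg to standard output using
 * only the write(2) system call.  It touches no writable globals, so
 * the only per-process memory it needs is stack.
 */
ssize_t emit(const char *msg, size_t len)
{
    return write(STDOUT_FILENO, msg, len);
}
```

Calling emit("hi\n", 3) from main() would give a program whose static
version has no reason to dirty a data page.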

Now let's analyze the shared case:

One of the design decisions in our shared library implementation is to allow
the address of a library routine to be different for every use of a library.
This also applies to any data and bss in the shared library.

This addresses various design goals:

1. It allows old binary programs to run with newer versions of a library.

2. It provides an easy solution for mixes of libraries (without this, you
would have to register the address of every library in your system in order
to allow binaries to load all or any combination of libraries).

It's especially this second point that makes this design superior to
other shared library implementations (like the early Linux implementations;
I don't know whether it's done differently there now).

Now in order to achieve this, apart from various other things, the subroutine
calls in question (the system calls in your program) have to be fixed up at
program start time (actually even a bit later, see below).  This means that
the page that contains the jump to the library routine has to get the real
target address written into it.  Writing to a page means that the page (which
is mapped copy-on-write by the library load process) gets unshared.  In order
to affect as few pages as possible, all library calls are made indirectly
through a jump table (so you have only one modification even if you have
several calls to a library routine), and by (hopefully) clustering the
modifications the number of pages affected is minimized.  And the actual
fixup isn't really done at program start time, but at the time the particular
routine is called for the first time.
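
The lazy fixup through a jump table can be sketched in plain C.  This is
only a simulation under my own assumptions, not what the linker actually
emits: the real table holds machine-level jumps, and all names here
(jump_table, lazy_stub, lib_getvalue, call_getvalue, resolve_count) are
made up for the example.

```c
/* Hypothetical "library routine" standing in for a real libc stub. */
int lib_getvalue(void) { return 42; }

/* Counts how often the fixup ran (for illustration only). */
int resolve_count = 0;

typedef int (*routine_fn)(void);

int lazy_stub(void);

/*
 * One slot per imported routine.  The table lives in writable data,
 * so every running copy of the program ends up with its own.
 */
routine_fn jump_table[1] = { lazy_stub };

/* The first call lands here: patch the slot, then forward the call. */
int lazy_stub(void)
{
    resolve_count++;
    jump_table[0] = lib_getvalue;   /* this write unshares the page */
    return jump_table[0]();
}

/* Every call site goes through the table instead of calling directly. */
int call_getvalue(void)
{
    return jump_table[0]();
}
```

However many call sites use call_getvalue(), only the single table slot
is ever written, and only on the first call.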

Anyway, this jump table (which is placed in the data area, btw) is now
necessary for every copy of the program currently running.

Now in addition to this comes the shared linker itself.  While it essentially
is a shared library on its own and thus shares its code between its various
instances, it has some data to organize its duties.  This data area is
unshared as well (despite the fact that it probably contains the exact same
data for all the copies of your program that happen to run at the same time).
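
The effect of writing to private data can be seen with an analogy using
fork(2), which also gives the child copy-on-write pages.  This is not how
the shared linker maps its data, just the same kernel mechanism at work;
linker_state and child_view_after_write are hypothetical names, and a
POSIX system is assumed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the linker's private bookkeeping data. */
int linker_state = 1;

/*
 * Fork a child that writes new_value (0..255) into linker_state and
 * reports what it sees via its exit status.  Returns the child's view,
 * or -1 on failure.  The parent's copy is never touched.
 */
int child_view_after_write(int new_value)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        linker_state = new_value;   /* write fault: page gets copied */
        _exit(linker_state & 0xff);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After child_view_after_write(7) returns 7, linker_state in the parent is
still 1: each process pays for its own copy of the page once it has been
written, regardless of what ends up in it.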

All of this (together with the unshared stack that you also have in the
static case) should pretty much add up to the amount of memory required per
copy of your program running at the same time.

Hope this helps.  If you have any further questions, feel free to ask.

Ciao,
Wolfgang
-- 
ws@TooLs.DE     (Wolfgang Solfrank, TooLs GmbH) 	+49-228-985800