Subject: shared library support
To: tech-kern@NetBSD.ORG
From: jiho@postal.c-zone.net
List: tech-kern
Date: 03/18/1998 13:40:24
Thanks especially to Chuck for a real effort to enlighten.  But we need to
look at 'systat vmstat', because its output seems to correlate well with
what happens when the system overruns available RAM.  In other words, the
kernel seems to be working with numbers close to the ones 'systat vmstat'
reports.

Besides, Chuck, you are using UVM, and I am using the old Mach vm.  And I don't
have your new function (although it would be very useful).  But if UVM does fix
this, that would be nice.  (Or if someone can explain this in a more benign
light...)

Let's start over, and focus clearly on my own first-hand information.  Please
bear with me until further down, where the discussable results appear.

The assertions are (for 1.2/i386, with Mach vm):

  1. Code pages are shared from an executable file, but are not shared from
     shared libraries linked in such an executable.

  2. As a result, you are better off running the entire system statically
     linked, the existing shared library support being severely dysfunctional
     and counter-productive.

  3. I have a test which demonstrates this.

First the test procedure, then a sample run of (my own) results, and finally
the source for the test program.  Yes, there are numerous bug fixes in this
release of the explanation (but not in the test itself).


First, the test procedure...

  1. Compile two versions of the test program, one static, one shared.  To
     compile the static case use the following command:

       gcc -O2 -static -nostartfiles -o <static> /usr/lib/scrt0.o <source>.c

     (By default crt0.o is used, which is the shared library startup file and
     causes useless code to get built in.  This is a tightly controlled
     experiment.)  The shared case is simply:

       gcc -O2 -o <shared> <source>.c

  2. Reboot the system, and open a second VT (you do have virtual terminal
     support compiled into your kernel).  Start 'systat vmstat', and note down
     the initial vm statistics.  Pay attention to wired, active, inactive and
     free page counts, and the "all total" page count.

  3. Switch back to the first VT, and start a background ('&') instance of
     <static> (the command sketch after this list shows the exact invocations).
     Switch to the second VT, wait for the numbers to stabilize, and note the
     same set of page counts.

  4. Repeat step 3 several times.

  5. Start over from step 2, but substitute <shared>.

  6. Evaluate the page count behavior for the two cases.  For the i386, after
     factoring out the kernel's process management overhead (about 6 wired
     pages), you will discover that each new <static> instance causes 2 pages
     to move from the free list to the active list (1 data page and 1 stack
     page), while each new <shared> instance causes some 8 pages to move from
     the free list to the active list.
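
For concreteness, here is the whole run as shell commands.  This is a sketch
only: the names test.c, static and shared are placeholders for whatever you
actually call the source file and the two binaries.

  gcc -O2 -static -nostartfiles -o static /usr/lib/scrt0.o test.c
  gcc -O2 -o shared test.c
  ./static &        # steps 3 and 4: repeat once for each new instance
  ./shared &        # step 5: after a fresh reboot, same procedure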


Now, my own sample results...

test system:  1.2/i386 with 24 MB
    daemons:  inactive lpd

           <static>      <shared>       ld.so

file:        12K            8K           48K

size:      8K text       4K text       44K text
           4K data       4K data        4K data
           0K bss        0K bss         4K bss

  ps:      8K text       4K text
          16K virtual   12K virtual
          24K RSS      188K RSS
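
For reference, the "file" row is the on-disk size from 'ls -l', and the
"size" rows are the text/data/bss breakdown printed by size(1).  Assuming the
same placeholder names as above, and that your a.out dynamic linker lives in
the usual place, that is:

  ls -l static shared /usr/libexec/ld.so
  size static shared /usr/libexec/ld.so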

<static>    initial     first         second         third
    wired:   1968K   1992K(+24K)    2020K(+28K)   2044K(+24K)
   active:   1912K   1932K(+20K)    1936K(+4K)    1944K(+8K)
 inactive:    768K    824K(+56K)     828K(+4K)     828K(==)
     free:  18516K  18416K(-100K)  18380K(-36K)  18348K(-32K)
all total:   4724K   4824K(+100K)   4860K(+36K)   4892K(+32K)

<shared>    initial     first         second         third
    wired:   1968K   1992K(+24K)    2020K(+28K)   2044K(+24K)
   active:   1912K   1984K(+72K)    2044K(+60K)   2108K(+64K)
 inactive:    768K    824K(+56K)     828K(+4K)     828K(==)
     free:  18516K  18364K(-152K)  18272K(-92K)  18184K(-88K)
all total:   4724K   4876K(+152K)   4968K(+92K)   5056K(+88K)

Things tend to stabilize with the third instance in both cases, with
consistent +/- figures thereafter.  Compare the increase in the active
page count in the third instance.
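
In page terms (4K pages on the i386), that third-instance comparison works
out to:

   <static>:   +8K active  /  4K per page  =   2 pages per new instance
   <shared>:  +64K active  /  4K per page  =  16 pages per new instance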


And finally, the ready-to-compile test program...

-------CUT HERE---------------------------------------CUT HERE---------------- 

#include <sys/types.h>
#include <unistd.h>


#define NONE   0
#define TRUE   1


int main(int argc, char *argv[])
  {
  pid_t pid=getpid();   /* one libc call; the value itself is unused */

  while(TRUE)           /* sleep forever, so the process sits resident */
    { sleep(5); }       /* while you read the systat display */

  return(NONE);         /* never reached */
  }

---------CUT HERE-------------------------------------CUT HERE----------------


Once again, I welcome constructive argument (I really do).


--Jim Howard  <jiho@mail.c-zone.net>
