Subject: RelCache (aka ELF prebinding) news
To: None <tech-userlevel@netbsd.org>
From: Bang Jun-Young <junyoung@netbsd.org>
List: tech-userlevel
Date: 12/01/2002 14:28:20
Hi folks,

I have finally made the first working implementation of ELF prebinding
"V2" available (I named it "RelCache" ;-). Here is a description of how
well it performs, how it works, how you can install it on your machine,
etc.

Benchmark
*********

A simple benchmark with execloop shows that the RelCached ld.elf_so loads
execloop in about 30% less time. It was run on an Athlon XP 1800+ machine
with 256MB of DDR memory.

$ time ./execloop 9999

	w/o RelCache	w/ RelCache
	  5.184s	  3.623s

For comparison, a statically linked execloop took 0.967s. (Note: NetBSD's
fork/exec path for dynamically linked binaries is rather unoptimized and
needs to be addressed.)
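
To make the 30% figure concrete, here is the arithmetic on the two dynamic
timings from the table as a small C sketch. The function names are made up
for illustration and are not part of execloop or ld.elf_so. Note that the
timings cover the whole execloop run, so the percentage reflects the entire
fork/exec path and not only the time spent inside ld.elf_so.

```c
#include <stdio.h>

/* Percentage by which `after` is smaller than `before`. */
double reduction_percent(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* Print the reduction for the two dynamic timings quoted in the table. */
void print_relcache_gain(void)
{
	const double without_relcache = 5.184;	/* seconds, from the table */
	const double with_relcache = 3.623;	/* seconds, from the table */

	printf("reduction: %.1f%%\n",
	    reduction_percent(without_relcache, with_relcache));
}
```

With the numbers above, (5.184 - 3.623) / 5.184 comes out at roughly 30.1%.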

I expect a much larger gain when running Mozilla/KDE/GNOME with RelCache,
but I haven't had time to try that myself.

Stability
*********

For normal, non-RelCached binaries, ld.elf_so is as stable as the current
one.

I had no problems building a GENERIC kernel with a RelCached toolchain.
No further testing was performed, however.

Technical details
*****************

When the environment variable LD_PREBIND is set to any value, the RelCached
ld.elf_so(1) saves the relocated segments (and part of the .bss section as
well, if needed) to a disk file in /var/db/relcache as follows:

total 640
-rw-r--r--  1 root  wheel  118784 Dec  1 02:01 32471a77a73c871694e426bc1ed0395a
-rw-r--r--  1 root  wheel   32768 Dec  1 02:00 6f0e619c215d700c5a7d1d53b293da88
-rw-r--r--  1 root  wheel   32768 Dec  1 02:00 aa494b7df6268afd5317ce6deaf83a7e
-rw-r--r--  1 root  wheel  118784 Dec  1 02:01 b722a86c497416e587ec11b01f81728f
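
As a rough illustration of this step, the sketch below names a cache file
after an md5 string and dumps a memory region into it. This is not the
actual ld.elf_so code: relcache_path() and relcache_save() are hypothetical
names, and the real file layout (headers, multiple segments, the .bss part)
is not described here, so the sketch simply writes one raw region.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build "<dir>/<md5hex>" in buf.  Returns 0 on success, -1 if the
 * result would not fit in buflen bytes.
 */
int relcache_path(char *buf, size_t buflen, const char *dir,
    const char *md5hex)
{
	int n = snprintf(buf, buflen, "%s/%s", dir, md5hex);

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * Write len bytes starting at addr to a new file at path.
 * Returns 0 on success, -1 on error (a partial file is removed).
 */
int relcache_save(const char *path, const void *addr, size_t len)
{
	const char *p = addr;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd == -1)
		return -1;
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n <= 0) {		/* simplification: no EINTR retry */
			close(fd);
			unlink(path);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return close(fd);
}
```

A caller would pass "/var/db/relcache" as dir and the object's md5 string as
md5hex. Mode 0644 corresponds to the -rw-r--r-- permissions in the listing
above, assuming the usual umask.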

The next time the same binary is executed, ld.elf_so checks whether the
objects have been modified by comparing the md5 checksum, starting address,
etc. If they have not, it reads the RelCache file and mmaps (overlays) it
onto the appropriate memory region.

With RelCached executables and shared objects, no data fixup or relocation
is performed at process startup, and as a result, things are loaded (much)
faster.

How to use
**********

!! You must be root to perform prebinding !!

First, you should put the appropriate md5 string in the executable you want
to "cache" and in the shared objects it depends on (eventually this should
be integrated into ld(1), but I haven't succeeded yet). If any of the
objects does not contain a .cksum section and a valid checksum, ld.elf_so
won't perform prebinding on the binary.
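
The all-or-nothing rule can be sketched like this. It is a simplified
illustration with hypothetical names, and it assumes that "valid" only means
"looks like a 32-digit hexadecimal md5 string"; whether ld.elf_so also
recomputes the checksum is not covered here. Reading the .cksum section out
of the ELF file is left out as well.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a valid md5 string is exactly 32 hexadecimal digits. */
int md5hex_is_valid(const char *s)
{
	size_t i;

	if (s == NULL || strlen(s) != 32)
		return 0;
	for (i = 0; i < 32; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/*
 * cksums[i] is the checksum string found in object i, or NULL if the
 * object has no .cksum section.  Prebinding is allowed only if every
 * object (the executable and all of its shared objects) passes.
 */
int can_prebind(const char *const *cksums, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!md5hex_is_valid(cksums[i]))
			return 0;
	return 1;
}
```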

Say you want to prebind /bin/ls. Type:

# prebind /bin/ls
Writing MD5 (/bin/ls) = 98ba26bb8aa39962576eb91a3b26b385
Writing MD5 (/lib/libc.so.12) = 4abb37378996d666336f5ae20587fc2e

prebind automatically writes md5 strings to the required shared objects. The
md5 string is also used as the name of the RelCache file.

Now you should run /bin/ls with LD_PREBIND set:

# LD_PREBIND=1 /bin/ls

If you want to see debug messages, set LD_DEBUG=1 as well.

If everything is okay, /bin/ls will be executed normally and you won't notice
any difference. From now on, every user on the system can benefit from the
saved RelCache file.

(Todo: executing /bin/ls with LD_PREBIND should be done automatically by
prebind.)

Installation
************

First, you should patch the toolchain so that it puts a checksum section in
executables and shared objects:

ftp://ftp.netbsd.org/pub/NetBSD/misc/junyoung/toolchain-20021201.diff

After installing it, build prebind(1) and ld.elf_so(1):

ftp://ftp.netbsd.org/pub/NetBSD/misc/junyoung/ld.elf_so-relcache-20021201.tar.gz
ftp://ftp.netbsd.org/pub/NetBSD/misc/junyoung/prebind-20021201.tar.gz

Note that prebind(1) should reside in /usr/sbin.

For those who don't want to build everything themselves, I have uploaded
compiled binaries as well:

ftp://ftp.netbsd.org/pub/NetBSD/misc/junyoung/relcache-20021201.tar.gz

Todo
****

Integrate the checksum generation routine into libbfd.so.

Have prebind execute the binary with LD_PREBIND set, instead of the user
doing it manually.

Reduce the RelCache file size.

Dropped Ideas
*************

Better CPU cache utilization - this was found to be too difficult to implement.

Possible questions to be asked
******************************

"Isn't it another kind of hack?"

As far as I know, no. It doesn't ever try to avoid fixing up specific
symbols as the previous implementation did.

"Can it be used on non-x86 platform?"

Unlike Red Hat prelink, the RelCache method can be very easily applied to
non-x86 platforms with additional code for Rela/64-bit support. Every
ELF-based platform in NetBSD will be able to make use of it.

"Cache files are too big."

I'd say they are not so big on today's cheap 80 GB drives. ;-) I'm
currently investigating ways to reduce the file size.

"Why didn't you implement it as 'one-file' method?"

I'll be back with details on it shortly.

Enjoy,

Jun-Young

-- 
Bang Jun-Young <junyoung@netbsd.org>