Subject: speeding up ld.elf_so
To: None <tech-perform@netbsd.org>
From: David Laight <david@l8s.co.uk>
List: tech-perform
Date: 08/24/2002 21:31:10
Looking at a (rather large) ktrace output I spotted the following,
for which ld.elf_so is responsible:

 16853 uname    NAMI  "/usr/lib/libc.so.12"
 16853 uname    RET   __stat13 0
 16853 uname    CALL  open(0x48057080,0,0x1000)
 16853 uname    NAMI  "/usr/lib/libc.so.12"
 16853 uname    RET   open 3
 16853 uname    CALL  read(0x3,0xbfbfd460,0x34)
 16853 uname    GIO   fd 3 read 52 bytes
       "\^?ELF\^A\^A\^A\0\0\0\0\0\0\0\0\0\^C\0\^C\0\^A\0\0\0\0\M^L\^A\0004\0\0\
        \0p\M-X \0\0\0\0\0004\0 \0\^D\0(\0B\0?\0"
 16853 uname    RET   read 52/0x34
 16853 uname    CALL  close(0x3)
 16853 uname    RET   close 0
 16853 uname    CALL  open(0x48057080,0,0)
 16853 uname    NAMI  "/usr/lib/libc.so.12"
 16853 uname    RET   open 3
 16853 uname    CALL  __fstat13(0x3,0xbfbfd4f4)
 16853 uname    RET   __fstat13 0
 16853 uname    CALL  read(0x3,0xbfbfc4c4,0x1000)
 16853 uname    GIO   fd 3 read 4088 bytes
       "\^?ELF\^A\^A\^A\0\0\0\0\0\0\0\0\0\^C\0\^C\0\^A\0\0\0\0\M^L\^A\0004\0\0\
 <snip>
        \134\^C\0\0C\^A\0\0"
 16853 uname    GIO   fd 3 read 8 bytes
       "\240\^B\0\0\M^A\^F\0\0"
 16853 uname    RET   read 4096/0x1000
 16853 uname    CALL  mmap(0,0xa0000,0x5,0x2,0x3,0,0)
 16853 uname    RET   mmap 1208340480/0x4805d000
 16853 uname    CALL  mmap(0x480ea000,0x6000,0x3,0x12,0x3,0,0x8c000)
 16853 uname    RET   mmap 1208918016/0x480ea000
 16853 uname    CALL  mmap(0x480f0000,0xd000,0x3,0x1012,0xffffffff,0,0)
 16853 uname    RET   mmap 1208942592/0x480f0000
 16853 uname    CALL  close(0x3)
 16853 uname    RET   close 0

Note that there are 3 NAMI lookups for the shared library (one for the
stat and one for each of the two opens), and a 4k read before the file
is mapped.
ISTM that there could easily be a measurable improvement if the NAMI
lookup was only done once.  It would also remove any possible race
conditions caused by someone moving the file between lookups.
Also, isn't using mmap() likely to be better than the 4k read?

The starting point would be to make _rtld_find_library return an open fd.
It looks as though it is always called like this:
	path = _rtld_find_library( name, obj );
	if (!path)
		error(...)
	new_obj = _rtld_load_object( path, ... );
So passing through the open fd wouldn't be too hard.

The read(fd, u.buf, PAGESIZE ) is in _rtld_map_object and could trivially
be changed to an mmap.
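
As a rough sketch of that change (not the actual _rtld_map_object
code), the header page could be mapped read-only and checked for the
ELF magic seen at the start of the read data in the trace above.
map_first_page and looks_like_elf are made-up names for illustration:

```c
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map up to one page from the start of fd,
 * read-only, instead of read()ing it into a buffer.  Returns NULL on
 * failure; on success *maplen is the length to pass to munmap().
 */
static void *
map_first_page(int fd, size_t *maplen)
{
	struct stat st;
	long pagesize = sysconf(_SC_PAGESIZE);
	size_t len;
	void *p;

	assert(maplen != NULL);
	if (pagesize <= 0 || fstat(fd, &st) == -1 || st.st_size <= 0)
		return NULL;

	/* Never map past the end of a file shorter than one page. */
	len = (size_t)pagesize;
	if ((off_t)len > st.st_size)
		len = (size_t)st.st_size;

	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		return NULL;
	*maplen = len;
	return p;
}

/* Check for the 4-byte ELF magic: 0x7f 'E' 'L' 'F'. */
static int
looks_like_elf(const void *hdr, size_t len)
{
	static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

	return len >= sizeof(magic) && memcmp(hdr, magic, sizeof(magic)) == 0;
}
```

The caller would munmap() the page once the headers have been parsed.
Whether this is really cheaper than one 4k read would need measuring.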

The only slight difficulty is that there isn't (yet?) a flag to open()
to say 'only open a regular file' (or do you want to be able to
dlopen() anything that supported mmap?)
(O_NONBLOCK might be enough - or just leave it as caveat emptor.)
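
To make the idea concrete, here is a minimal sketch of a search that
opens each candidate once and hands back the descriptor, so the path
is looked up only once per candidate.  It uses O_NONBLOCK plus an
fstat() on the descriptor in place of the missing "regular file only"
flag.  try_open_library and find_library_fd are invented names, and
the real _rtld_find_library search logic is not reproduced here:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open one candidate path and keep the
 * descriptor only if it refers to a regular file.  O_NONBLOCK stops
 * open() from blocking on a FIFO or device; fstat() on the descriptor
 * replaces the separate stat() by name.
 */
static int
try_open_library(const char *path, struct stat *st)
{
	int fd;

	assert(path != NULL && st != NULL);
	fd = open(path, O_RDONLY | O_NONBLOCK);
	if (fd == -1)
		return -1;
	if (fstat(fd, st) == -1 || !S_ISREG(st->st_mode)) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical search loop: try name in each directory of the
 * NULL-terminated list dirs.  On success returns an open descriptor
 * and leaves the path that matched in pathbuf; on failure returns -1
 * and pathbuf holds whatever was tried last.
 */
static int
find_library_fd(const char *const *dirs, const char *name,
    char *pathbuf, size_t len, struct stat *st)
{
	for (; *dirs != NULL; dirs++) {
		int fd;
		int n = snprintf(pathbuf, len, "%s/%s", *dirs, name);

		if (n < 0 || (size_t)n >= len)
			continue;	/* path would not fit; skip it */
		fd = try_open_library(pathbuf, st);
		if (fd != -1)
			return fd;
	}
	return -1;
}
```

The caller would then pass both the path and the fd on to the load
routine, which no longer needs to open (or stat) the file itself.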

Thoughts?
I don't fancy testing this one, unless someone can suggest how to
run a program with an alternate loader?

	David

-- 
David Laight: david@l8s.co.uk