Subject: Re: Size of static binaries
To: Boris Gjenero <bgjenero@undergrad.math.uwaterloo.ca>
From: David Brownlee <abs@anim.dreamworks.com>
List: port-vax
Date: 02/21/1998 23:44:39
	More on compiling a static version of cat under NetBSD, but first
	the conclusion, for those who do not want to read the whole posting..

	The perceived problem is the size of trivial binaries on NetBSD/vax.

	There are actually two problems:

	     a) No shared libraries. Shared libraries lower the overall disk
		usage _and_ in-core memory footprint of a multiuser system,
		even when compared to a statically linked system which has
		been stripped down and provides much less functionality.

	     b) Trivial binaries provide the same error reporting (including
		national language support), YP support (if they look up a
		username or host), etc. as any other binary. The size of this
		code when statically linked is disproportionately large
		compared to the code that actually does the work.
		On non-trivial programs much of this extra code is needed to
		provide the core functionality anyway, so the relative cost
		is lower.

		(On a dynamically linked binary all this is irrelevant).

	One point to note: with demand paging, only the pages with code
	actually being executed (or data being read) are loaded into memory.
	This is a real win with large shared libraries :)

   Now back to the 'size of cat' saga...

	Linking against a version of libc compiled without
	-DNLS -DYP -DLIBC_SCCS -DSYSLIBC_SCCS -D_REENTRANT
	reduces the size by 8K.

	Note: I'm doing this on NetBSD/sparc which is a RISC box, so vax
	      numbers will be much smaller than this. They would also take
	      a lot longer to obtain :)
	 
	text    data    bss     dec     hex
	2904    0       8       2912    b60	cat.o
	8192    8192    0       16384   4000	cat (dynamic)
	57344   8192    0       65536   10000	cat (static)
	49152   8192    0       57344   e000	cat (static, smaller libc)
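
	A note on why those numbers are so round: every text and data
	figure for the linked binaries is a multiple of 8192, which suggests
	each segment is padded out to an 8K page. A minimal Python sketch of
	that rounding, assuming the 8192 page size inferred from the table
	(page_round is a made-up helper name, not a system call):

```python
PAGE = 8192  # assumed page size, inferred from the table (not measured)

def page_round(nbytes, page=PAGE):
    """Round nbytes up to the next multiple of page."""
    return -(-nbytes // page) * page

# (text, data) columns for the linked binaries, copied from the table
linked = {
    "cat (dynamic)": (8192, 8192),
    "cat (static)": (57344, 8192),
    "cat (static, smaller libc)": (49152, 8192),
}

for name, (text, data) in linked.items():
    # Each segment is already a whole number of pages.
    assert page_round(text) == text and page_round(data) == data
    print(f"{name}: {text // PAGE} text page(s), {data // PAGE} data page(s)")

# Saving from the smaller libc, using the dec column: exactly one page.
saved = 65536 - 57344
print(saved, saved // PAGE)  # 8192 1
```

	If that assumption holds, savings only show up in whole 8K steps,
	so these totals are a coarse measure of how much code really went away.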

	So we saved 8K. Hmmm... hardly worth it... Now we actually
	have to do some work...

	Next stage is to take the output of 'nm cat' and
	cross-reference it with the values from 'size libc.a' to work
	out where the rest of the bloat is coming from...

	Top ten object files:
	    text    data    bss     dec     hex
	    11520   176     80      11776   2e00    strtod.o
	    6456    48      0       6504    1968    vfprintf.o
	    568     296     1584    2448    990     findfp.o
	    2448    0       0       2448    990     errlist.o
	    1680    256     464     2400    960     setlocale.o
	    1304    8       0       1312    520     qdivrem.o
	    1008    8       128     1144    478     malloc.o
	    928     0       0       928     3a0     fvwrite.o
	    848     0       0       848     350     ctypeio.o
	    776     0       0       776     308     sysconf.o
	(plus 78 other entries)
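
	For anyone wanting to repeat the ranking, here is a rough sketch in
	Python. It assumes you have already saved the 'size' rows for the
	objects that nm shows were linked in (the ten rows below are copied
	from the table above). parse_size_rows is a hypothetical helper name.
	It checks each row's dec and hex columns against text+data+bss and
	sorts by total size:

```python
SIZE_ROWS = """\
text    data    bss     dec     hex
11520   176     80      11776   2e00    strtod.o
6456    48      0       6504    1968    vfprintf.o
568     296     1584    2448    990     findfp.o
2448    0       0       2448    990     errlist.o
1680    256     464     2400    960     setlocale.o
1304    8       0       1312    520     qdivrem.o
1008    8       128     1144    478     malloc.o
928     0       0       928     3a0     fvwrite.o
848     0       0       848     350     ctypeio.o
776     0       0       776     308     sysconf.o
"""

def parse_size_rows(text):
    """Parse 'size'-style rows; return them sorted by total, largest first."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or not fields[0].isdigit():
            continue  # skip the header and any blank lines
        t, d, b, dec = (int(x) for x in fields[:4])
        hexval, name = fields[4], fields[5]
        # Sanity check: dec and hex must both equal text + data + bss.
        if t + d + b != dec or int(hexval, 16) != dec:
            raise ValueError(f"inconsistent row for {name}")
        rows.append((name, t, d, b, dec))
    return sorted(rows, key=lambda r: r[4], reverse=True)

rows = parse_size_rows(SIZE_ROWS)
top_ten_total = sum(r[4] for r in rows)
for name, t, d, b, dec in rows:
    print(f"{dec:6d}  {name}")
print("top ten total:", top_ten_total)  # top ten total: 30584
```

	So the ten rows shown come to 30584 bytes between them.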

	That accounts for 44344 bytes, which added to the 2912 of cat.o
	gives us 47256, leaving 10088 of the binary's 57344 bytes
	unaccounted for (probably mostly the partially used pages at
	the end of the text and data segments).
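
	Spelling out that accounting as a quick Python check (all figures
	are from the tables above; the variable names are mine):

```python
libc_objects = 44344   # bytes found in libc objects via nm/size, per the post
cat_o = 2912           # dec column for cat.o
static_small = 57344   # dec column for cat (static, smaller libc)

accounted = libc_objects + cat_o
unaccounted = static_small - accounted

print(accounted)    # 47256
print(unaccounted)  # 10088
print(f"{unaccounted / static_small:.1%} of the binary is unaccounted for")
```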

	The biggest entry is strtod.o, pulled in by vfprintf() (the next
	biggest) to print floating point numbers. "Floating point numbers
	in cat?" I hear you cry. Well there is just one vfprintf() used
	by all programs calling {s,f,}printf(), and it obviously needs
	to handle floating point numbers. (Remember, in shared libraries
	having two functions where one is a superset of another is a
	waste of space).

	Of course, in the static case of cat, all this vfprintf() crap is
	not really needed. We could rewrite cat to not call fprintf() (when
	displaying line numbers), not call warn() or err() to report any
	errors (they both end up in vfprintf()), and use a special version
	of getopt() which also avoids vfprintf()... phew

	This would leave us on the sparc with a binary of two text pages, and
	one data, probably around 18K. Again, probably half of this on the vax.

	However, this is a hell of a lot of work for one binary, and we have
	to expect that most non-trivial programs will need printf() type
	functionality...

	So, what _is_ practical for saving space on those ol' RD53s?

	a) Get shared libraries working. This matters most for X
	   binaries, which are just totally bloated unless dynamically linked.

	b) Produce a new 'lighter' libc without NLS, YP, and anything else
	   you can shave, and link against that. You could probably win a
	   lot here, but it's something of a pain if you want some of the
	   functionality..

	c) Produce a 'float free' libc for those programs that do not need
	   any floating point code. An instant saving in every program that
	   uses no floating point, at the cost of compiling another libc and
	   modifying various Makefiles.

	d) Crunch sets of binaries together. A sort of "poor man's shared
	   libraries". Particularly good if you crunch similar binaries
	   together (e.g. yp*, mount_*). Rather ugly.. but would work.
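
	To make (d) concrete: a crunched binary is one executable holding
	several programs, which picks the one to run from the name it was
	invoked as, so the common libc code is on disk only once. Below is
	a toy Python sketch of just that dispatch step. The applet names and
	functions are invented for illustration, and this is not how real
	tools such as crunchgen(1) are implemented:

```python
import os
import sys

def applet_cat(args):
    """Toy stand-in for cat: copy the named files to stdout."""
    for path in args:
        with open(path, "rb") as f:
            sys.stdout.buffer.write(f.read())
    return 0

def applet_echo(args):
    """Toy stand-in for echo: print the arguments separated by spaces."""
    sys.stdout.write(" ".join(args) + "\n")
    return 0

# Hypothetical table of the programs crunched into this one "binary".
APPLETS = {"cat": applet_cat, "echo": applet_echo}

def dispatch(argv):
    """Run the applet named by the basename of argv[0]."""
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        sys.stderr.write(f"{name}: not in this crunched set\n")
        return 1
    return applet(argv[1:])

# Invoked as /usr/bin/echo, the combined program behaves as echo.
status = dispatch(["/usr/bin/echo", "hello", "vax"])  # prints "hello vax"
```

	On a real system each program name would typically be a link to
	the one crunched file, which is where the disk saving comes from.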

	Well.. that explains the 'bloat' in static binaries.

	As a point of reference, the 'newfs_msdos' binary, which builds
	a complete MSDOS filesystem (including bootcode in the first sector)
	is the same size as the 'cat' binary on my NetBSD/sparc system.

		David