Subject: Berkeley DBs in /usr/share
To: None <tech-userlevel@NetBSD.ORG>
From: Simon Burge <simonb@NetBSD.ORG>
List: tech-userlevel
Date: 08/22/1999 20:05:03
Folks,

I'm getting there on making /usr/share completely MI.  Next step is the
Berkeley DB files - there are currently two of these in /usr/share:

	/usr/share/misc/termcap.db
	/usr/share/misc/vgrindefs.db

As per an earlier discussion, I'm going to nuke vgrindefs.db - it
provides fast access to a file less than 8kB long, yet ends up in the
distributions as a 520kB file.  (Yes, Berkeley DB uses sparse files,
but that doesn't help when pax gobbles the file up into a distribution
set, and even as a sparse file it still uses 296kB of disk anyway.)
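
(For the curious, a quick way to see the sparse-file effect is to
compare a file's apparent length with the blocks actually allocated.
A little stat(2) sketch - not part of the proposed changes - that
prints both:)

	#include <sys/stat.h>
	#include <stdio.h>

	int
	main(int argc, char *argv[])
	{
		struct stat st;

		if (argc != 2 || stat(argv[1], &st) == -1) {
			perror("stat");
			return 1;
		}
		/* st_blocks counts 512-byte blocks actually allocated. */
		printf("apparent: %ld bytes, on disk: %ld bytes\n",
		    (long)st.st_size, (long)st.st_blocks * 512);
		return 0;
	}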


For termcap.db, I'm proposing we make it a little-endian Berkeley
DB.  My first thought was to make it big-endian, as BE is the basis
of network byte order and it seemed like "tradition".  However, (I
think) the slowest machines we have are some of the VAXes, and they're
little-endian, and every little bit helps those machines :-).

I'm still trying to track down exactly why the alpha (little-endian)
and 32-bit little-endian files aren't byte-for-byte identical.
db_dump185 (from the Berkeley DB distribution) dumps of both files are
the same, and either can seemingly use the other without problems.
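
(If anyone wants to double-check the record-level equivalence, here's
a rough, untested sketch using the DB 1.85 interface from <db.h>; it
walks every record in the first file and looks it up in the second:)

	#include <db.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>

	int
	main(int argc, char *argv[])
	{
		DB *a, *b;
		DBT key, val, val2;
		int st;

		if (argc != 3)
			return 1;
		a = dbopen(argv[1], O_RDONLY, 0, DB_HASH, NULL);
		b = dbopen(argv[2], O_RDONLY, 0, DB_HASH, NULL);
		if (a == NULL || b == NULL)
			return 1;
		/* Sequence through file 1, checking each record in file 2. */
		for (st = a->seq(a, &key, &val, R_FIRST); st == 0;
		    st = a->seq(a, &key, &val, R_NEXT)) {
			if (b->get(b, &key, &val2, 0) != 0 ||
			    val.size != val2.size ||
			    memcmp(val.data, val2.data, val.size) != 0)
				printf("mismatch: %.*s\n",
				    (int)key.size, (char *)key.data);
		}
		a->close(a);
		b->close(b);
		return 0;
	}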

As a test case, the slowest machine I could find was a DECstation 3100
with a 16MHz R3000.  I've got a little test program which retrieves
the "xterm" entry from a termcap.db using getcap - the inner loop is:

	for (i = 0; i < 1000; i++) {
		cgetent(&buf, vec, "xterm");
		free(buf);
	}
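
(Fleshed out, the whole thing looks something like the sketch below;
it assumes cgetent(3) from <stdlib.h> and the stock
/usr/share/misc/termcap path, which getcap(3) resolves to the .db file
when one exists.  The numbers below are from running it under time(1).)

	#include <stdio.h>
	#include <stdlib.h>

	int
	main(void)
	{
		char *vec[] = { "/usr/share/misc/termcap", NULL };
		char *buf;
		int i;

		for (i = 0; i < 1000; i++) {
			if (cgetent(&buf, vec, "xterm") < 0) {
				fprintf(stderr, "cgetent failed\n");
				return 1;
			}
			free(buf);
		}
		return 0;
	}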

On a little-endian (native) termcap.db file, an average run takes

	1.874u 4.226s 0:08.53 71.3%     0+0k 0+0io 0pf+0w

and for a big-endian termcap.db it takes

	2.040u 4.325s 0:08.82 72.1%     0+0k 0+0io 0pf+0w

so it's taking about 300 microseconds extra to retrieve a non-native
byte-order record ((8.82s - 8.53s) / 1000 lookups = 290 microseconds).
In a similar test on a 50MHz Sparc (IPC?) it took about 40 microseconds
per record longer, and on a PII 400 it took 2 microseconds per record
longer.

I've already made changes to cap_mkdb to generate databases of either
byte order (a sketch of the idea follows below).  Does anyone dislike
these ideas, or have any better suggestions?
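
(For reference, the heart of the cap_mkdb change is just the "lorder"
field of the DB 1.85 HASHINFO structure: 0 means host order, 1234
little-endian and 4321 big-endian.  A minimal sketch of the idea, not
the exact committed code - the helper name is made up:)

	#include <db.h>
	#include <fcntl.h>
	#include <string.h>

	DB *
	open_capdb(const char *path, int lorder)	/* 0, 1234 or 4321 */
	{
		HASHINFO info;

		/* Zeroed fields fall back to the hash defaults. */
		memset(&info, 0, sizeof(info));
		info.lorder = lorder;	/* byte order of the output db */
		return dbopen(path, O_CREAT | O_TRUNC | O_RDWR, 0644,
		    DB_HASH, &info);
	}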

Simon.