tech-userlevel archive


Re: RFC: Constant Database Support



On Sat, Mar 06, 2010 at 07:44:05PM +0100, Alistair Crooks wrote:
> 1. seems to me this could be used for many things besides /etc/services,
> as a write-once, read-many, involatile database.

I intend to use it for many applications, /etc/services and terminfo are
just the most prominent examples in terms of size.

> 3.  even for volatile ones, adding the entry, recalculating the
> perfect hash, dropping the old table, and building a new one might be
> a win.

I consider a fast-to-rebuild read only database a good alternative to
transaction-safe bdb.

> 4. this might be useful for things - and yet we have mostly a bdb
> interface to everything.  as such, moving over to cdb - and there must
> be a better name like ''perfectdb'' or something - would involve pain
> as things were re-written to use the new interface.  i think you
> should consider providing a drop-in bdb1 interface for it, and
> accessing it on a DB_PERFECT style of thing.  index value can be
> passed in the size field of the key DBT, and i can't see much speed
> being lost for that.

I was thinking about providing a wrapper on top of the primitive
interface. The basic question is whether GET on the in-memory copy
should be allowed. At the moment there is no direct query interface for
the writer, nor does it support updates of keys or records. The reason
I didn't consider it too useful is that the db.h interface makes it hard
to get minimal code pollution, so essentially using dbopen always pulls
in the full family. At the moment I think we have pwd, dev and services
from within libc that need access, so using the primitive interface
directly would actually allow dropping quite a bit of code for static
linkage and/or embedded use.

The other reason for using the interface directly is of course to make
use of the additional functions. The support for many-to-one mappings can
help for the various users where aliases are supported, but not too
large.

Of course, this would be more useful if dbopen(3) actually supported
type sensing :)

> 5. in the whole scheme of things, access speed is the thing we need
> to measure, i'd have thought? build speed is important, especially
> at boot time on embedded platforms, but that's a one-off.

What case do we want to measure specifically? For the memory hot case it
consistently performs better. For the cold case, the maximum number of
read accesses depends on the database size. E.g. the read code can be
summarized as:
1. Read the 24-byte header.
2. Compute a hash of the data.
3. Access 1..4 bytes from three different locations in the hash area
following the header. I am considering a flag to specify that cdbr_open
should issue a MADV_WILLNEED for the area.
4. Access 1..4 bytes to find the offset of the actual data.
5. Return the pointer computed from the offset.

(6. Validate that the record matches the desired key)
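
The MADV_WILLNEED hint from step 3 could look roughly like the
following. This is only a sketch under assumptions: map_with_willneed
and hot_len are made-up names, not part of the cdbr interface, and it
assumes the database is simply mapped read-only with the header and
hash area sitting at the start of the file.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* exposes madvise(2) on glibc */
#endif
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the proposed cdbr interface: map a
 * database file read-only and ask the kernel to prefetch the first
 * hot_len bytes (header plus hash area).  Returns NULL on failure and
 * stores the mapping length in *maplen on success.
 */
static const void *
map_with_willneed(const char *path, size_t hot_len, size_t *maplen)
{
	struct stat sb;
	void *base;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return NULL;
	if (fstat(fd, &sb) == -1 || sb.st_size <= 0) {
		close(fd);
		return NULL;
	}
	*maplen = (size_t)sb.st_size;
	base = mmap(NULL, *maplen, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close */
	if (base == MAP_FAILED)
		return NULL;
	if (hot_len > *maplen)
		hot_len = *maplen;
	/* Advice only: a failure here does not affect correctness. */
	(void)madvise(base, hot_len, MADV_WILLNEED);
	return base;
}
```

A caller would pass 24 plus the size of the hash area as hot_len and
munmap() the result when done. Since the hint is purely advisory, the
hot case is unaffected; it only saves page faults on the three hash
probes in the cold case.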

Joerg

