tech-userlevel archive


Re: proplib and the jet age

proplib inherits its use of property lists from NeXTSTEP and from their 
reincarnation in OS X as plists.

The data types match up with the Foundation/CoreFoundation runtime and make 
perfect sense in that world.

As you point out, the XML representation is poor; that is why there is a binary 
format too.

As for data transfer, the OS X answer is XPC, where the data can also include 
file descriptors, shared memory and other useful primitives.


On 31 Dec 2012, at 21:16, David Holland <> wrote:

> Almost ever since proplib was first imported there's been a steady low
> level of grumbling about it, which occasionally erupts into open
> arguments. This has gone on for quite a few years now, with the result
> that the real problems, of which there are several, have by and large
> been sifted out from the casual complaints. There hasn't been any
> concerted attempt to address these problems, though, just occasional
> random flailing (including on my part) that hasn't really gone
> anywhere.
> It seems to me that the basic complaints, reflecting real problems,
> are the following:
> (1) It was never clear whether proplib is supposed to be a data
> *transfer* API (that is, data lives in application data structures and
> gets loaded into and out of proplib only for shipment) or a data
> *storage* API (that is, data lives in proplib and applications use
> proplib to access it). As a result of this proplib has aspects of both
> these things and is a good solution for neither.
> It has become clear in the past few years that NetBSD needs a data
> transfer library. (Writing out to disk and reading back again later is
> a form of this.) Essentially all the extant and serious proposed uses
> of proplib fall into this category. There is no evidence that we need
> another data storage library; we already have db(3), and for cases
> where that's not enough nowadays we have sqlite. Ergo, what we want
> from proplib is data transfer, and any plans for the future should
> take this into account.
> (There is also a demand for tools to handle configuration files;
> this is a different problem for reasons I'll elaborate on at length if
> provoked.)
> (2) The data model is poorly considered. It provides a random
> selection of atom types without any visible coherent plan, and in
> particular it has internal problems with signed vs. unsigned integers.
> Meanwhile the composite types are only very basic (arrays and maps
> that are limited to string -> T) and there is no model for what kinds
> of more complicated structures these can and cannot be assembled into.
> (For example, you can use dictionaries to assemble graphs; but it is
> far from clear what happens if you try to dump out the results.)
> There are several possible more coherent data models that we could
> choose. I'm going to tackle this in more detail below, as it's the
> chief question going forward.
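
To make the tree-versus-graph question in point (2) concrete, here is a minimal sketch in Python rather than C, using plain dicts and lists as stand-ins for dictionaries and arrays. It is not proplib code; the function name check_tree is hypothetical. It rejects any structure in which a container is reachable more than once, which is the situation where dumping the result becomes ambiguous.

```python
def check_tree(value, _seen=None):
    """Raise ValueError if any dict or list is reachable more than once.

    Plain dicts and lists stand in for dictionaries and arrays here;
    this is an illustration, not proplib code.
    """
    seen = set() if _seen is None else _seen
    if isinstance(value, (dict, list)):
        if id(value) in seen:
            raise ValueError("container appears more than once: not a tree")
        seen.add(id(value))
        children = value.values() if isinstance(value, dict) else value
        for child in children:
            check_tree(child, seen)
    return value


# A plain tree passes.
tree = {"name": "wd0", "partitions": [{"letter": "a"}, {"letter": "e"}]}
check_tree(tree)

# Sharing one dictionary between two parents gives a DAG, not a tree.
shared = {"letter": "a"}
dag = {"first": shared, "second": shared}

# A dictionary that contains itself gives a cycle.
cycle = {}
cycle["self"] = cycle
```

Calling check_tree on dag or cycle raises ValueError. A dumper that skips such a check has to either duplicate the shared part silently or recurse forever on the cycle.
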
> (3) The API is a mess at the detail level. In addition to being wordy
> and generally cumbersome (hence all the "proppropliblib" jokes), it is
> not only not type-safe but actively type-unsafe in a particularly
> hazardous way, it has weird reference count semantics, and it
> furthermore has unclear error semantics with far too many error cases
> that leave no clear recovery method.
> All of these things can be done better; it is just ("just") API
> design.
> (4) There is no support whatsoever for schemas or any other method of
> specifying or validating what data is supposed to be present in what
> structure. Relatedly, there is no support for handling format version
> information.
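
As an illustration of what point (4) is asking for, here is a small sketch of schema and format-version checking over a flat record. The schema layout, the names SCHEMA_V1 and validate, and the convention of a "version" key are all assumptions made for the example; nothing like this exists in proplib today.

```python
# Hypothetical schema: required key -> required type.
SCHEMA_V1 = {"version": int, "name": str, "mtu": int}


def validate(record, schema, supported_versions=(1,)):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for key, expected in schema.items():
        if key not in record:
            problems.append(f"missing key: {key}")
        elif type(record[key]) is not expected:
            got = type(record[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {got}")
    for key in record:
        if key not in schema:
            problems.append(f"unexpected key: {key}")
    # Check the format version only when it has the right type.
    version = record.get("version")
    if type(version) is int and version not in supported_versions:
        problems.append(f"unsupported format version: {version}")
    return problems
```

With this, validate({"version": 1, "name": "wm0", "mtu": 1500}, SCHEMA_V1) returns an empty list, while a record with a string mtu and version 2 yields two problems. Returning all problems at once, rather than failing on the first, is a design choice for the sketch and not a requirement.
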
> (5) The code is bloaty. The implementation is at least 2x the size it
> needs to be for the functionality it provides.
> (6) The output transfer format is something everyone dislikes (XML)
> which is itself bloaty and space-wasting. Furthermore, there's only
> the one output format.
> (These last two problems can readily be fixed by writing new code.)
> Now.
> It seems to me that the current proplib API is a large part of the
> problem, so any significant changes for the future should probably
> include deprecating it and replacing it with a new API. Note that as a
> result of points 1-3, at least one developer has already written a
> library whose sole purpose is to interface to proplib.
> The chief question, therefore, is what data model the new stuff should
> support. There are at least seven obvious candidates I can think of:
> (a) What we have in proplib (arrays and string-keyed dictionaries)
> only, with the explicit understanding that only tree structures are
> supported and not graphs; that is, no dictionary or array can appear
> more than once.
> (b) Same as (a) but extend dictionaries to be keyable with arbitrary
> atom types.
> (c) A more general semistructured model, like (b) but that explicitly
> allows graph structure without being fully graph-oriented.
> (d) RDF, or more likely a tasteful subset of RDF with data types
> instead of using URIs for everything.
> (e) Property graphs.
> (f) Property graphs where property values can be tuples rather than
> only atoms.
> (g) Relations (tables of rows with named fields).
> These all have their advantages and disadvantages. (a) and (b) are
> simpler, but are also fairly limited. (c) and (d) are nearly the same
> modulo how much W3C koolaid is involved. (f) rectifies a weakness of
> (e) that I've run into; (c)/(d) and (f) are both supersets of (g).
> The chief difference between (d) and (e) is that (e) is more
> structured; RDF allows assembling contraptions that are not graphs,
> like edges that point to edges.
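
For readers who have not met candidate (f), here is a rough sketch of a property graph whose property values may be tuples, plus the way a relation (g) fits into it as nodes without edges. The class and function names are hypothetical, and Python is used only for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    nodes: dict = field(default_factory=dict)  # node id -> {property: value}
    edges: list = field(default_factory=list)  # (source id, label, target id)

    def add_node(self, node_id, props):
        # Property values may be atoms or tuples of atoms, as in model (f).
        self.nodes[node_id] = dict(props)

    def add_edge(self, source, label, target):
        # Edges join nodes only; an edge cannot point at another edge.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("edges must connect existing nodes")
        self.edges.append((source, label, target))


def relation_to_graph(rows):
    """Encode a table of rows with named fields as nodes without edges."""
    graph = PropertyGraph()
    for index, row in enumerate(rows):
        graph.add_node(f"row{index}", row)
    return graph
```

A row such as {"name": "wm0", "addr": ("10.0.0.1", 24)} becomes one node carrying a tuple-valued property. The check in add_edge is the extra structure that separates (e) and (f) from RDF in the description above.
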
> I don't think the relational data model is a good choice here. The
> basic question is whether we want to handle graph structure or not.
> There are arguments both ways. The chief argument against is that it
> isn't clear how much there's a real need to ship graph-structured data
> around. One of the big arguments for, however, is that there are
> several preexisting inoffensive transfer formats for graph data (e.g.,
> Turtle for RDF and the graphviz dot format for property graphs) and
> there is no such thing for purely hierarchical data.
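
To show how little code an existing graph transfer format needs, here is a sketch that writes a small property graph in the graphviz dot format. The function name to_dot and the input layout (a dict of node properties and a list of labelled edges) are assumptions for the example. Quoting is handled only for double quotes, so values ending in a backslash are not covered.

```python
def to_dot(nodes, edges, name="g"):
    """Render nodes {id: {prop: value}} and edges [(src, label, dst)] as dot text."""

    def quote(text):
        # Inside a dot quoted string, only the double quote needs escaping.
        return '"' + str(text).replace('"', '\\"') + '"'

    lines = [f"digraph {quote(name)} {{"]
    for node_id, props in nodes.items():
        attrs = ", ".join(f"{quote(k)}={quote(v)}" for k, v in props.items())
        lines.append(f"  {quote(node_id)} [{attrs}];")
    for source, label, target in edges:
        lines.append(f"  {quote(source)} -> {quote(target)} [label={quote(label)}];")
    lines.append("}")
    return "\n".join(lines)
```

Calling to_dot({"a": {"kind": "disk"}, "b": {}}, [("a", "contains", "b")]) gives a five-line digraph. Reading the format back is the harder half and is not attempted here.
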
> I would say (based on having been dealing with graph-oriented
> semistructured data in my day job for some seven years now) that the
> API-level cost of dealing with graph data is negligible until you get
> into iteration, which is nasty regardless, and the implementation-
> level cost is small, and graph models have the nice property of being
> supersets of everything else. If it were just me, I think my choice
> would be (f).
> However, since there is basically zero chance we want to do graph
> theory on graphs we store in this thing, and we don't necessarily care
> if edges point to edges and so on, (d) might be a better choice if
> someone's willing to filter the W3C RDF koolaid and come up with a
> coherent proposal for a model that doesn't have W3C glop coming out of
> its ears. I can do this if there's demand, but I think property graphs
> are a better choice.
> The primary downside is that to the best of my knowledge schemas for
> graph data are a research topic, although not necessarily a
> particularly difficult one. (If anyone knows otherwise, please let me
> know!)
> On the other hand, at one point several years back during one of the
> proplib arguments I spent a few hours implementing about half of a
> replacement. If someone wanted to finish it, it would only take a few
> more hours probably, and it would be a decent start at a replacement
> proplib using data model (a).
> Opinions please.
> oh, and in case anyone was wondering: ultramarine, with violet and
> cream accents.
> -- 
> David A. Holland
