tech-userlevel archive


Re: The lamentation of proplib(3)



On 28/01/2014 22:16, Mindaugas Rasiukevicius wrote:
The long term objective would be to replace and eliminate proplib(3) from
the tree.  The short to medium term objective is to provide an alternative,
start using it and gradually convert proplib uses.  Yes, we will need to
add compatibility code for the Property List format, which is going to be
very depressing.

Nobody said it is going to be a trivial task.  The riddance is not going
to happen any time soon.  We just have to start somewhere.

Indeed. I think it is better to leave proplib(3) as it is; it will go away by itself as subsystems get updated in the long run.

Secondly, there are tons of interchange formats out there. libnv is one
more, with its original author stating that it is not really meant as a
replacement for XML/JSON.

The library provides an interface to pack and transport the data.  As far
as the caller is concerned, it does not matter what serialisation format
it uses.

That's the purpose of the lib.

However, I disagree as far as the caller is concerned: it does matter, indirectly. A horribly inefficient serialization means that the lib will not get widespread use.

Besides, if the serialization format has limitations (no nesting allowed, key uniqueness, ...), it cannot replace proplib(3) 1:1.

 There is no reason why it could not use JSON or <insert your
favourite format>.  I think the default format should be binary, though.

I agree.
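To make the "format does not matter to the caller" point concrete, here is a minimal sketch of a pack interface with a pluggable encoder. Everything in it (struct pair, struct codec, pack(), the two encoders and their byte layouts) is invented for illustration and is not the libnv or proplib API; the text encoder does no escaping of the key.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair; real libraries carry typed, nested values. */
struct pair {
    const char *key;
    uint32_t val;
};

/* A codec turns a pair into bytes. Returns bytes written, or 0 if it does not fit. */
struct codec {
    const char *name;
    size_t (*encode)(const struct pair *, unsigned char *, size_t);
};

/* JSON-looking text form, e.g. {"mtu":1500}. The key is not escaped. */
static size_t
encode_text(const struct pair *p, unsigned char *buf, size_t len)
{
    int n = snprintf((char *)buf, len, "{\"%s\":%u}", p->key, (unsigned)p->val);

    return (n < 0 || (size_t)n >= len) ? 0 : (size_t)n;
}

/* Made-up binary form: 1-byte key length, key bytes, 4-byte big-endian value. */
static size_t
encode_binary(const struct pair *p, unsigned char *buf, size_t len)
{
    size_t klen = strlen(p->key);

    if (klen > 255 || len < 1 + klen + 4)
        return 0;
    buf[0] = (unsigned char)klen;
    memcpy(buf + 1, p->key, klen);
    buf[1 + klen + 0] = (unsigned char)(p->val >> 24);
    buf[1 + klen + 1] = (unsigned char)(p->val >> 16);
    buf[1 + klen + 2] = (unsigned char)(p->val >> 8);
    buf[1 + klen + 3] = (unsigned char)(p->val);
    return 1 + klen + 4;
}

const struct codec codec_text = { "text", encode_text };
const struct codec codec_binary = { "binary", encode_binary };

/* The caller only sees pack(); the wire format is a parameter. */
size_t
pack(const struct codec *c, const struct pair *p, unsigned char *buf, size_t len)
{
    return c->encode(p, buf, len);
}
```

With the pair { "mtu", 1500 }, the text codec produces 12 bytes and the binary one 8, which is the indirect cost the caller ends up paying for the choice of format.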

- the error handling is weak IMHO; given the potential large use of such
a library, it should support richer semantics than a blunt errno
(something equivalent to a gai_strerror(3) maybe);

Why?  Most of the use cases in our tree do not really need granularity on
errors - you either retrieve (or construct) the whole thing or you fail.

Well, you have to know /why/ it failed when you construct it. EINVAL is not really informative: duplicate key, depth limit, out of memory, out-of-bounds value (for string or int encoding)...
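As an illustration of what I mean, a minimal sketch of library-specific error codes with a gai_strerror(3)-like lookup. The enum xnv_err values and xnv_strerror() are hypothetical names made up for this mail; nothing like them exists in libnv or proplib today.

```c
#include <stddef.h>

/* Hypothetical error codes covering the failure cases listed above. */
enum xnv_err {
    XNV_OK = 0,
    XNV_EDUPKEY,    /* key already present */
    XNV_EDEPTH,     /* nesting depth limit exceeded */
    XNV_ENOMEM,     /* allocation failure */
    XNV_ERANGE,     /* string or integer does not fit the encoding */
    XNV_EMAX        /* sentinel, not an error */
};

/* Map a code to a static message, in the spirit of gai_strerror(3). */
const char *
xnv_strerror(int code)
{
    static const char *const msgs[XNV_EMAX] = {
        [XNV_OK] = "no error",
        [XNV_EDUPKEY] = "duplicate key",
        [XNV_EDEPTH] = "nesting depth limit exceeded",
        [XNV_ENOMEM] = "out of memory",
        [XNV_ERANGE] = "value out of bounds for encoding",
    };

    if (code < 0 || code >= XNV_EMAX)
        return "unknown error";
    return msgs[code];
}
```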

That is why accumulated error is so useful, it would simplify many cases
in our tree.  If we add support for schemas, then the schema validation
code is the routine which could be more informative.

I cannot see how nvlist_error() can carry this information. How is the API supposed to tell the caller that the schema validation is at fault and not the nvl?
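One possible shape, sketched under assumptions: the accumulated (sticky) error keeps, besides an errno-style code, the origin of the first failure and the key that triggered it. struct xl and the xl_* functions are invented for this illustration and are not libnv; the "schema" is reduced to a single upper-bound check.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define XL_MAXKEYS 8

/* Where the first failure happened (hypothetical). */
enum xl_origin { XL_NONE, XL_BUILD, XL_SCHEMA };

struct xl {
    const char *keys[XL_MAXKEYS];
    long vals[XL_MAXKEYS];
    size_t nkeys;
    int err;                /* sticky: 0 until the first failure */
    enum xl_origin origin;  /* construction or schema validation */
    const char *badkey;     /* key involved in the first failure */
};

void
xl_init(struct xl *l)
{
    l->nkeys = 0;
    l->err = 0;
    l->origin = XL_NONE;
    l->badkey = NULL;
}

/* Record only the first error; later ones are ignored. */
static void
xl_fail(struct xl *l, int err, enum xl_origin origin, const char *key)
{
    if (l->err == 0) {
        l->err = err;
        l->origin = origin;
        l->badkey = key;
    }
}

/* Add a number; a no-op once the list is in error state. */
void
xl_add_number(struct xl *l, const char *key, long val)
{
    size_t i;

    if (l->err != 0)
        return;
    for (i = 0; i < l->nkeys; i++) {
        if (strcmp(l->keys[i], key) == 0) {
            xl_fail(l, EEXIST, XL_BUILD, key);
            return;
        }
    }
    if (l->nkeys == XL_MAXKEYS) {
        xl_fail(l, ENOSPC, XL_BUILD, key);
        return;
    }
    l->keys[l->nkeys] = key;
    l->vals[l->nkeys] = val;
    l->nkeys++;
}

/* Toy schema rule: key must exist and its value must not exceed max. */
void
xl_check_max(struct xl *l, const char *key, long max)
{
    size_t i;

    if (l->err != 0)
        return;
    for (i = 0; i < l->nkeys; i++) {
        if (strcmp(l->keys[i], key) == 0) {
            if (l->vals[i] > max)
                xl_fail(l, ERANGE, XL_SCHEMA, key);
            return;
        }
    }
    xl_fail(l, ENOENT, XL_SCHEMA, key);
}
```

The caller still checks once at the end, but can now tell a duplicate key during construction (EEXIST, XL_BUILD) from a value rejected by validation (ERANGE, XL_SCHEMA). A single int, as returned by nvlist_error(), has no room for the last two fields.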

- it does not seem to offer a way to serialize kernel shared structures
easily. It is particularly convenient to have because it avoids
user/kernel roundtrips when you want to expose kernel structures without
syscall overhead (instead of playing with ioctl or low-level mmap).

Can you be more specific?

This is probably badly expressed on my part. Two things:

1 - I was thinking about sysctl.

*stat(8) binaries use sysctl(3) to query structures (io_sysctl, clockinfo, ...) that are shared between userland and the kernel. There is no reflection here: the caller has to use the correct structure if it wants to decode the data properly. Otherwise it ends badly.

An interchange format has to detect decoding mismatches, especially when they pose security/integrity issues (information leaks, out-of-bounds values).
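A minimal sketch of what "detecting a mismatch" could look like: a small self-describing header checked before the payload is copied out. struct rec_hdr, the magic value, REC_CLOCK and struct clock_stats are all made up for the example (the latter is only loosely inspired by clockinfo and is not the real structure). Fields are in host byte order, which is enough for a kernel talking to userland on the same machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_MAGIC   0x4e564c31u /* arbitrary value for this sketch */
#define REC_VERSION 1
enum { REC_CLOCK = 1 };

/* Hypothetical header placed in front of every exported structure. */
struct rec_hdr {
    uint32_t magic;
    uint16_t type;      /* which structure follows */
    uint16_t version;   /* layout revision of that structure */
    uint32_t len;       /* payload size in bytes */
};

/* Hypothetical payload. */
struct clock_stats {
    int32_t hz;
    int32_t tick_us;
};

/* Producer side: header plus payload. Returns bytes written, 0 if too small. */
size_t
encode_clock(const struct clock_stats *in, unsigned char *buf, size_t buflen)
{
    struct rec_hdr h = { REC_MAGIC, REC_CLOCK, REC_VERSION, (uint32_t)sizeof(*in) };

    if (buflen < sizeof(h) + sizeof(*in))
        return 0;
    memcpy(buf, &h, sizeof(h));
    memcpy(buf + sizeof(h), in, sizeof(*in));
    return sizeof(h) + sizeof(*in);
}

/* Consumer side: refuse to decode unless type, version and sizes all agree. */
int
decode_clock(const unsigned char *buf, size_t buflen, struct clock_stats *out)
{
    struct rec_hdr h;

    if (buflen < sizeof(h))
        return -1;
    memcpy(&h, buf, sizeof(h));
    if (h.magic != REC_MAGIC || h.type != REC_CLOCK || h.version != REC_VERSION)
        return -1;
    if (h.len != sizeof(*out) || buflen - sizeof(h) < h.len)
        return -1;
    memcpy(out, buf + sizeof(h), sizeof(*out));
    return 0;
}
```

With a bare sysctl(3) call the caller passes a buffer and hopes the layout matches; here a wrong type, a stale version or a truncated buffer is rejected instead of being silently misread.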

2 - regular polling of statistics

Following the sysctl example, in the case of top (but any other stat tool would do: sysstat, iostat, netstat, ...) the values are regularly updated by copying them from the kernel back to userland.

From time to time I have come across system-specific APIs that map such values read-only into userland, to avoid going back to the kernel for every update (Xen I/O rings, L4 flexpages, I can't remember the others), but there was no library to manipulate them through a higher-level interface (for example when you want to pass driver hardware counters). So I had to roll my own. That is not difficult when you access atomic-friendly values (integers and such), less so for strings or objects.
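For the "less so for strings" part, one common technique is a sequence counter (seqlock): the writer bumps a counter before and after updating, and the reader retries if the counter was odd or changed underneath it. Below is a minimal C11 sketch, assuming a single writer; struct shared_name and the sn_* functions are hypothetical and not taken from any of the systems mentioned above.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SN_LEN 32

/* Hypothetical record living in a page mapped read-only into the reader. */
struct shared_name {
    atomic_uint seq;            /* odd while the writer is mid-update */
    _Atomic char name[SN_LEN];  /* payload, accessed with relaxed atomics */
};

void
sn_init(struct shared_name *s)
{
    size_t i;

    atomic_init(&s->seq, 0);
    for (i = 0; i < SN_LEN; i++)
        atomic_init(&s->name[i], '\0');
}

/* Writer side (single writer assumed): truncates to SN_LEN - 1 characters. */
void
sn_set(struct shared_name *s, const char *v)
{
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
    size_t i, n = strlen(v);

    if (n > SN_LEN - 1)
        n = SN_LEN - 1;
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (i = 0; i < SN_LEN; i++)
        atomic_store_explicit(&s->name[i], i < n ? v[i] : '\0',
            memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: copy, then retry if an update was in progress or intervened. */
void
sn_get(struct shared_name *s, char out[SN_LEN])
{
    unsigned before, after;
    size_t i;

    do {
        before = atomic_load_explicit(&s->seq, memory_order_acquire);
        for (i = 0; i < SN_LEN; i++)
            out[i] = atomic_load_explicit(&s->name[i], memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((before & 1u) != 0 || before != after);
}
```

The reader never writes to the shared page, so a read-only mapping is enough. The price is that every non-trivial type needs this kind of wrapper, which is exactly the part a library could take care of.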

Why did they consider rolling out libnv when there are alternatives like
protocol buffers or thrift? Granted, those tools are meant for higher
level languages and RPCs, but if NetBSD managed to use XML in the kernel, I
suppose those would fit too...

Google protocol buffers and Apache Thrift work in a different way - they
generate the code for you based on a provided schema, to conveniently and
efficiently implement RPCs.  It is XDR "on steroids" - 1980s technology
refurbished for the modern day (e.g. including some schema versioning,
compression, a bunch of tools, etc).  The libraries we are talking about
merely perform dynamic data serialisation at run-time.  Both approaches
have their merits, but for all intents and purposes we are not going to
shift to a different approach, or rather paradigm, at this point.

I have no experience there.

They bring interesting properties though: compile-time checks of the API, optimizations (you can get really compact, efficient structures when you specify upper/lower bounds), and suitability for RPC. This can become an interesting property for async communications.
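To show what declared bounds buy you, a small sketch of bounded-integer packing: a value known to lie in [lo, hi] is stored as (v - lo) in just enough bits. This is an illustration of the general idea only, not how protocol buffers or Thrift encode integers; struct bitbuf and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to encode any value in [lo, hi]. */
unsigned
bits_for_range(uint32_t lo, uint32_t hi)
{
    uint32_t span;
    unsigned bits = 0;

    assert(lo <= hi);
    span = hi - lo;     /* largest offset that must be representable */
    while (span != 0) {
        bits++;
        span >>= 1;
    }
    return bits;
}

/* Tiny bit accumulator, 64 bits at most. */
struct bitbuf {
    uint64_t acc;
    unsigned used;
};

/* Append v, known to be in [lo, hi], using the minimal number of bits. */
void
put_bounded(struct bitbuf *b, uint32_t v, uint32_t lo, uint32_t hi)
{
    unsigned n = bits_for_range(lo, hi);

    assert(lo <= v && v <= hi && b->used + n <= 64);
    if (n == 0)
        return;     /* only one possible value, nothing to store */
    b->acc |= (uint64_t)(v - lo) << b->used;
    b->used += n;
}

/* Read back the value stored at bit offset *pos and advance *pos. */
uint32_t
get_bounded(const struct bitbuf *b, unsigned *pos, uint32_t lo, uint32_t hi)
{
    unsigned n = bits_for_range(lo, hi);
    uint64_t mask;
    uint32_t off;

    if (n == 0)
        return lo;
    mask = (UINT64_C(1) << n) - 1;  /* n <= 32, so the shift is defined */
    off = (uint32_t)((b->acc >> *pos) & mask);
    *pos += n;
    return lo + off;
}
```

A month in [1, 12] and a percentage in [0, 100] fit in 4 + 7 = 11 bits this way, against 8 bytes for two plain 32-bit integers. A run-time library without a schema cannot know the bounds and has to assume the worst.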

Also, proplib uses a horrible XML-looking format, but not XML.  There
is a subtle difference here.

True, but that does not make it any better. Just looking at the API is enough :)

Cheers,

--
Jean-Yves Migeon

