tech-kern: Re: RFC: /kern/summary

Subject: Re: RFC: /kern/summary
To: Brian C. Grayson <bgrayson@marvin.ece.utexas.edu>
From: Jukka Marin <jmarin@pyy.jmp.fi>
List: tech-kern
Date: 03/11/1999 07:58:49

On Wed, Mar 10, 1999 at 01:05:40PM -0600, Brian C. Grayson wrote:
> On Wed, Mar 10, 1999 at 03:54:03PM +0100, Frank van der Linden wrote:
> > 
> > I always thought it was kind of backwards to first have the kernel
> > convert data to ASCII, and then later convert it back. It's nice
> > to have things available in ASCII for commandline use and scripts,
> > but I'd like to see it binary as well.
> 
>   The problem with binary format is, if the kernel changes, bang!
> ps/top/systat/xosview are useless until you can upgrade userland.
> I've already coded up changes to ps that allow it to fall back on
> /proc when kvm isn't happy (if /proc is mounted, that is), so
> it works great in practice.  :)

Personally, I hate the idea of having to parse ASCII data, mainly
because it is very inefficient.  (I have written atoi() functions for
several CPU architectures and done some floating point stuff as well ;)
Here's one new idea (stolen from the IFF file format of AmigaOS):

Use a "binary" format which contains N iterations of "<TAG><LENGTH><DATA>".
Here, <TAG> specifies the information type of <DATA> and <LENGTH> tells
the length of the actual data <DATA>.  <DATA> may contain binary data
such as integers or floating point numbers.

What's the point?  The point is that the software parsing the data can
handle all information types it knows of and simply skip over the
chunks it doesn't recognize.  If we added a <TYPE> field to each chunk,
a parser could even print out the chunks it does not recognize,
using the <TAG> string instead of the "human readable name" of the
object.
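
To make the "skip what you don't know" part concrete, here is a rough
sketch of such a parser.  It is only an illustration: the tag names
LOAD and UPTM, struct summary and both functions are made up for this
example, and LENGTH is assumed to be a 4-byte integer in host byte
order that counts only the bytes of <DATA>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result structure; the fields are invented for this sketch. */
struct summary {
	uint32_t load;      /* value of the "LOAD" chunk, if present */
	uint32_t uptime;    /* value of the "UPTM" chunk, if present */
	unsigned skipped;   /* number of chunks with unrecognized tags */
};

/* Append one <TAG><LENGTH><DATA> chunk at offset off; return the new offset. */
static size_t
put_chunk(unsigned char *buf, size_t off, const char *tag,
    const void *data, uint32_t dlen)
{
	memcpy(buf + off, tag, 4);
	memcpy(buf + off + 4, &dlen, 4);    /* host byte order (assumption) */
	memcpy(buf + off + 8, data, dlen);
	return off + 8 + dlen;
}

/*
 * Walk a buffer of chunks, pick up the tags we know and step over the
 * rest.  Returns 0 on success, -1 if a header or its data is truncated.
 */
static int
parse_chunks(const unsigned char *buf, size_t len, struct summary *out)
{
	size_t off = 0;

	memset(out, 0, sizeof(*out));
	while (off < len) {
		const unsigned char *tag, *data;
		uint32_t dlen;

		if (len - off < 8)
			return -1;          /* truncated header */
		memcpy(&dlen, buf + off + 4, 4);
		if (dlen > len - off - 8)
			return -1;          /* truncated data */

		tag = buf + off;
		data = buf + off + 8;
		if (memcmp(tag, "LOAD", 4) == 0 && dlen == 4)
			memcpy(&out->load, data, 4);
		else if (memcmp(tag, "UPTM", 4) == 0 && dlen == 4)
			memcpy(&out->uptime, data, 4);
		else
			out->skipped++;     /* unknown chunk: just skip it */

		off += 8 + (size_t)dlen;    /* LENGTH tells us where the next chunk is */
	}
	return 0;
}
```

An old userland binary built against this parser keeps working when a
newer kernel emits extra chunks: they only bump the skipped counter.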

<TAG> might be 4 bytes, <LENGTH> another 4 bytes and <DATA> anything
between 0 and 4 GB.  Or, some bits of <LENGTH> could be used for <TYPE>
to specify the type of data (like 32-bit int, 64-bit int, float, double,
string, or unknown).  (Keeping all objects aligned to 4 or 8 byte
boundaries would improve performance on certain CPU architectures.)
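
A sketch of how the <TYPE> bits and the alignment could look.  The
split is an assumption made for the example: the top 4 bits of the
32-bit <LENGTH> word carry the type, which leaves 28 bits (just under
256 MB) for the data length instead of the full 4 GB.  The type codes
and function names are invented here as well.

```c
#include <stdint.h>

/* Hypothetical type codes, following the list in the text. */
enum chunk_type {
	CT_UNKNOWN = 0, CT_INT32, CT_INT64, CT_FLOAT, CT_DOUBLE, CT_STRING
};

#define TYPE_BITS 4u                     /* assumption: top 4 bits of LENGTH */
#define LEN_BITS  (32u - TYPE_BITS)
#define LEN_MASK  ((UINT32_C(1) << LEN_BITS) - 1)

/* Pack a type code and a data length into one 32-bit LENGTH word. */
static uint32_t
pack_len(enum chunk_type type, uint32_t len)
{
	return ((uint32_t)type << LEN_BITS) | (len & LEN_MASK);
}

/* Extract the type code from a LENGTH word. */
static enum chunk_type
len_type(uint32_t word)
{
	return (enum chunk_type)(word >> LEN_BITS);
}

/* Extract the data length from a LENGTH word. */
static uint32_t
len_size(uint32_t word)
{
	return word & LEN_MASK;
}

/*
 * Round a data length up to the next multiple of align (which must be
 * a power of two), so that the following chunk starts on an aligned
 * boundary.
 */
static uint32_t
pad_len(uint32_t len, uint32_t align)
{
	return (len + align - 1) & ~(align - 1);
}
```

With 4-byte alignment, a 5-byte string occupies pad_len(5, 4) = 8
bytes of <DATA>, while its <LENGTH> word still records the real
length of 5.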

No, this format would not be human readable because it would contain
binary data - but it would be much faster to parse, and it would still
let different versions of kernels and userland binaries work together
as well as possible.

I use a similar system in a project I'm working on now.

Comments?

  -jm