Subject: Re: Generic Properties
To: None <eeh@netbsd.org>
From: None <cgd@broadcom.com>
List: tech-kern
Date: 09/27/2001 10:52:07
eeh@netbsd.org writes:
> I want `opaque_t' to be large enough to hold any scalar type.  Come to
> think about it, I suppose I should just bite the bullet and make it an
> `int64_t'.

technically, i think 'large enough to hold any scalar type' would be
intmax_t or something like that... but that's not fixed-size.
(u)int64_t is probably the right thing for this.


> | > 2.1	Property Types
> | > 
> | > PROP_STRING			-- Property is a string.
> | > PROP_INT			-- Property is an integer value.
> | > PROP_ARRAY			-- Property is an array of the 
> | > 				   same type of values.
> | > PROP_AGREGATE			-- Property is an agregation of 
> | > 				   different types of values.
> |
> | btw, "aggregate" and "aggregation."  (*sigh*)
> |
> | > PROP_CONST			-- Property is constant, do not 
> | > 				   allocate storage for the data.
> |
> | It's not stated here, but PROP_CONST is a flag.
> 
> PROP_CONST and PROP_AGGREGATE are type flags and encoded in the type
> field.

i.e., you'll change it to that?  (that wasn't the way it was in the
source you pointed people to!)


> | OK, so you know something is an array, but how do you know how big the
> | array elements are?  (a priori knowledge, a.k.a. magic, bad. 8-)
> 
> We've already gone over this several times before.  Properties are a
> convenient way for different parts of the kernel to communicate with
> each other without having to resort to global variables.  It is not
> designed for communication between different machines, therefore
> self-describing datastructures are inappropriate.

but, there's a mechanism to export it to user-land, and there's plenty
of stuff there to help userland print it, right?

e.g. the entire set of types 'int' 'string' etc. you indicated were
there basically to help userland print the stuff out.

So, if you've got an array of ints, how does userland know what size
they are?

I don't want self-describing, i want something which is actually
useful for the stated purpose.  printing out 'array of ints' in
userland as a string of bytes is ... not so useful.


> | I would suggest something like encoding:
> |
> | 	* string
> | 	* int
> | 	* unsigned int (the distinction nice, for userland printer)
> | 	* aggregate
> |
> | as your basic types, then have 'array' be a flag.
> 
> That's what the code does, although I don't think we really want to
> bother with the signed/unsigned issue.  Better just to print it in hex.

Certainly, right now, neither array nor aggregate are flags:

#define PROP_STRING     1
#define PROP_INT        2
#define PROP_ARRAY      3
#define PROP_AGREGATE   4
#define PROP_TYPE(x)    ((x)&0x0f)

#define PROP_CONST      0x10


> | provide a way to encode int/aggregate size in the 'type' word, and
> | require it for arrays (size of the integers or aggregate types).
> 
> What benefit do you get knowing the size of an aggregate in an array
> if you don't know the types within that aggregate?

two things come to mind:

* you can print it out as individual structures rather than one large
binary blob, and/or provide some notion of the array size.
e.g. (array of %d aggregate elements).  This is just eye-candy, and of
debatable use, though.

* if userland code grovels it out of the kernel, and thinks it _does_
know what the types within the agregate are, it can use the size to
sanity check its notion.  I think this is actually a fairly useful
notion.


> | array of strings would be disallowed.
> 
> Don't see why that should be the case.

what does it mean?  since they don't have an obvious fixed size,
they're not strictly speaking an array...  (or do you mean, string
pointer and array of string pointers... i don't think so?)


> I am going to wait until there is some sort of consensus before making any
> name changes.

"OK."  (On the other hand, I don't get why you _initially_ implemented
them like this, when, after the previous round i'd have expected you
to know better, but "whatever.")



> | I'm thinking that maybe 'setprop' should reject an attempt to change
> | an existing property's type.  not sure tho.
> 
> Once again, I don't see why the database code should be setting policy.

That seems reasonable.  On the other hand, can you think of real cases
where you _would_ want the type to change at run time?


> | am i correct in thinking that an object exists iff there are
> | properties associated with it?
> |
> | (that nature should be mentioned someplace...)
> 
> Objects do not exist.  Objects have relationships with properties.

OK.  (I don't think that was previously clear.)


> | Would you ever want to copy _between_ two databases?
> 
> Different databases are different data domains.  Would you ever copy between 
> a proc and a vnode?

some things, yes.  e.g. uid/gid -> uid/gid of newly-created vnodes.  8-)

Is it necessarily true that different databases _must_ have different
'types' of contents?  e.g. one could imagine (and this is totally
constructed 8-) something like:

	default audio device db:

		objects: channels: input, output, mixer, ... ?

		properties associated with those objects:
			(who knows, something about audio).

	audio device a db:

		same type of thing

	audio device b db:

		same thing...

then at init of a device you might want to copy all of its params from
the default params...

i suppose you could use a single db to do this, but then you need to
construct object ids out of ptrs + channel numbers, etc.  kinda gross.


> | why have 'valoff' rather than 'namelen'?  (i mean, valoff =
> | namelen (incl nul) + offsetof kip_name, right?)
> |
> | name includes a NUL here, right?
> 
> Names are NUL terminated.  However, the code may take the liberty of aligning
> the data, so the end of the string may not correspond directly with
> the beginning of the value field.

good point.


> | > A new KERN_DB MIB is provides the following:
> | > 
> | > 	Third level name	Type			Changeable
> | > 	----------------	------			----------
> | > 	KERN_DB_ALL		struct kinfo_kdb[]	no
> | > 	KERN_OBJ_ALL		quad[]			no
> |
> | shouldn't that be uint64_t? or int64_t?  8-)
> 
> Existing documentation uses `quad' and `quad[]'.

that seems more of a bug than a feature.  'whatever'.  'quad' in BSD
is (for better or worse) effectively a fixed-size type.


> | 	uint64_t x = objid;
> | 	int mib;
> |
> | 	mib[0] = CTL_KERN;
> | 	mib[1] = KERN_DB;
> | 	mib[2] = (better name for KERN_OBJ_ALL);
> | 	memcpy(&mib[3], x, sizeof x);
> |
> | In that case, wouldn't it potentially be "fourth and susequent" or
> | something?  (what if 'int' is 64 bits, or 16? 8-) yeah, a lot of code
> | breaks, but... 8-)
> 
> Things should work fine if `int' is 64 bits.  Lots of things will break
> if it's 16 bits, so that's not really worth discussing.  However,
> the notion of arbitrary width identifiers does cause problems with 
> KERN_GETPROP, KERN_SETPROP, and KERN_DELPROP, where you have two 
> separate 64-bit values that need to be encoded in the MIBs.  Determining
> the beginning of the second value becomes an interesting problem.

perhaps the right answer here is (assuming the desire is to keep
sysctl's 'mib' a set of integers) to redefine them to be uint32_t's or
int32_t's...

I think that'd solve both problems, w/ no compatibility issues for
current platforms.


Actually, i note that your kernel code does:

        /* Get database ID.  I wish sysctl used 64-bit ints. */
        val = (((u_int64_t)name[2])<<32)|name[3];

that has endianness issues, does it not, unless you specify that the
64-bit int is to be passed in high bits in the first part of the mib
value.  I.e., you _cannot_ portably use the memcpy approach i
mentioned above.  What is the correct way of doing this?  (it needs to
be well-defined.)


> | actually, that makes me wonder:
> |
> | in-kernel function to come up with the list of property DBs, for
> | orthogonality?  or just punt (probably make it static) since it's
> | probably only going to be used inside the module?
> 
> I suppose we could have an in-kernel function to look up databases by
> name if it's really necesary, but I don't think that will be the case.
> Since each database should be for a specific information domain, if you
> can't figure out how to get to the database, you probably shouldn't have
> access to the information.

Hmm.  Not sure i Truly Believe.  to my mind the question goes
something like:

is it better to have things like this accessed via global variables,
or via function calls (that take some kind of key, in this case name)?

there's something to be said for either way.

It's certainly something that can be put off to the future.


cgd