Subject: Re: Generic Properties
To: None <cgd@broadcom.com, eeh@netbsd.org>
From: None <eeh@netbsd.org>
List: tech-kern
Date: 09/27/2001 19:16:11
| eeh@netbsd.org writes:
| > I want `opaque_t' to be large enough to hold any scalar type.  Come to
| > think about it, I suppose I should just bite the bullet and make it an
| > `int64_t'.
|
| technically, i think 'large enough to hold any scalar type' would be
| intmax_t or something like that... but that's not fixed-size.
| (u)int64_t is probably the right thing for this.
|
|
| > | > 2.1	Property Types
| > | > 
| > | > PROP_STRING			-- Property is a string.
| > | > PROP_INT			-- Property is an integer value.
| > | > PROP_ARRAY			-- Property is an array of the 
| > | > 				   same type of values.
| > | > PROP_AGREGATE			-- Property is an agregation of 
| > | > 				   different types of values.
| > |
| > | btw, "aggregate" and "aggregation."  (*sigh*)
| > |
| > | > PROP_CONST			-- Property is constant, do not 
| > | > 				   allocate storage for the data.
| > |
| > | It's not stated here, but PROP_CONST is a flag.
| > 
| > PROP_CONST and PROP_AGGREGATE are type flags and encoded in the type
| > field.
|
| i.e., you'll change it to that?  (that wasn't the way it was in the
| source you pointed people to!)

Fine, I'll change that.  Actually, PROP_CONST and PROP_ARRAY should be
flags, and PROP_AGGREGATE should be a type identifier.

| > | OK, so you know something is an array, but how do you know how big the
| > | array elements are?  (a priori knowledge, a.k.a. magic, bad. 8-)
| > 
| > We've already gone over this several times before.  Properties are a
| > convenient way for different parts of the kernel to communicate with
| > each other without having to resort to global variables.  It is not
| > designed for communication between different machines, therefore
| > self-describing datastructures are inappropriate.
|
| but, there's a mechanism to export it to user-land, and there's plenty
| of stuff there to help userland print it, right?
|
| e.g. the entire set of types 'int' 'string' etc. you indicated were
| there basically to help userland print the stuff out.
|
| So, if you've got an array of ints, how does userland know what size
| they are?

I suppose little endian machines need to know this type of thing to print
out integral values.  Sigh.  O.K., I'll provide a macro for encoding the
size of the object so userland can figure out how many hex digits to print.

| I don't want self-describing, i want something which is actually
| useful for the stated purpose.  printing out 'array of ints' in
| userland as a string of bytes is ... not so useful.
|
|
| > | I would suggest something like encoding:
| > |
| > | 	* string
| > | 	* int
| > | 	* unsigned int (the distinction nice, for userland printer)
| > | 	* aggregate
| > |
| > | as your basic types, then have 'array' be a flag.
| > 
| > That's what the code does, although I don't think we really want to
| > bother with the signed/unsigned issue.  Better just to print it in hex.
|
| Certainly, right now, neither array nor aggregate are flags:
|
| #define PROP_STRING     1
| #define PROP_INT        2
| #define PROP_ARRAY      3
| #define PROP_AGREGATE   4
| #define PROP_TYPE(x)    ((x)&0x0f)
|
| #define PROP_CONST      0x10
|
|
| > | provide a way to encode int/aggregate size in the 'type' word, and
| > | require it for arrays (size of the integers or aggregate types).
| > 
| > What benefit do you get knowing the size of an aggregate in an array
| > if you don't know the types within that aggregate?
|
| two things come to mind:
|
| * you can print it out as individual structures rather than one large
| binary blob, and/or provide some notion of the array size.
| e.g. (array of %d aggregate elements).  This is just eye-candy, and of
| debatable use, though.
|
| * if userland code grovels it out of the kernel, and thinks it _does_
| know what the types within the agregate are, it can use the size to
| sanity check its notion.  I think this is actually a fairly useful
| notion.
|
|
| > | array of strings would be disallowed.
| > 
| > Don't see why that should be the case.
|
| what does it mean?  since they don't have an obvious fixed size,
| they're not strictly speaking an array...  (or do you mean, string
| pointer and array of string pointers... i don't think so?)

No, an array of strings is a buffer full of NUL terminated strings
up to the length of the buffer.  So you go traverse the buffer, 
looking for NULs, and set your pointers 1 byte beyond that until
you run out of buffer.

|
|
| > I am going to wait until there is some sort of consensus before making any
| > name changes.
|
| "OK."  (On the other hand, I don't get why you _initially_ implemented
| them like this, when, after the previous round i'd have expected you
| to know better, but "whatever.")
|
|
|
| > | I'm thinking that maybe 'setprop' should reject an attempt to change
| > | an existing property's type.  not sure tho.
| > 
| > Once again, I don't see why the database code should be setting policy.
|
| That seems reasonable.  On the other hand, can you think of real cases
| where you _would_ want the type to change at run time?

Switching from a single integer to an array of integers, or the other
way around, is certainly understandable.

| > | am i correct in thinking that an object exists iff there are
| > | properties associated with it?
| > |
| > | (that nature should be mentioned someplace...)
| > 
| > Objects do not exist.  Objects have relationships with properties.
|
| OK.  (I don't think that was previously clear.)
|
|
| > | Would you ever want to copy _between_ two databases?
| > 
| > Different databases are different data domains.  Would you ever copy between 
| > a proc and a vnode?
|
| some things, yes.  e.g. uid/gid -> uid/gid of newly-created vnodes.  8-)
|
| Is it necessarily true that different databases _must_ have different
| 'types' of contents?  e.g. one could imagine (and this is totally
| constructed 8-) something like:
|
| 	default audio device db:
|
| 		objects: channels: input, output, mixer, ... ?
|
| 		properties associated with those objects:
| 			(who knows, something about audio).
|
| 	audio device a db:
|
| 		same type of thing
|
| 	audio device b db:
|
| 		same thing...
|
| then at init of a device you might want to copy all of its params from
| the default params...
|
| i suppose you could use a single db to do this, but then you need to
| construct object ids out of ptrs + channel numbers, etc.  kinda gross.

In this case I'd say you really wanted to implement this with one database,
and the object identifiers would be pointers to individual channels....

The problem with allowing copying from one database to another is that
the object identifier space is potentially completely different.  Writing
code to do that is quite trivial, but I'm not sure I want to consider what
the implication of allowing that sort of operation would be on the database
abstraction.

|
|
| > | why have 'valoff' rather than 'namelen'?  (i mean, valoff =
| > | namelen (incl nul) + offsetof kip_name, right?)
| > |
| > | name includes a NUL here, right?
| > 
| > Names are NUL terminated.  However, the code may take the liberty of aligning
| > the data, so the end of the string may not correspond directly with
| > the beginning of the value field.
|
| good point.
|
|
| > | > A new KERN_DB MIB is provides the following:
| > | > 
| > | > 	Third level name	Type			Changeable
| > | > 	----------------	------			----------
| > | > 	KERN_DB_ALL		struct kinfo_kdb[]	no
| > | > 	KERN_OBJ_ALL		quad[]			no
| > |
| > | shouldn't that be uint64_t? or int64_t?  8-)
| > 
| > Existing documentation uses `quad' and `quad[]'.
|
| that seems more of a bug than a feature.  'whatever'.  'quad' in BSD
| is (for better or worse) effectively a fixed-size type.

Maybe we should change all references of `quad' to `int64_t' in all the
manpages.

|
|
| > | 	uint64_t x = objid;
| > | 	int mib;
| > |
| > | 	mib[0] = CTL_KERN;
| > | 	mib[1] = KERN_DB;
| > | 	mib[2] = (better name for KERN_OBJ_ALL);
| > | 	memcpy(&mib[3], x, sizeof x);
| > |
| > | In that case, wouldn't it potentially be "fourth and susequent" or
| > | something?  (what if 'int' is 64 bits, or 16? 8-) yeah, a lot of code
| > | breaks, but... 8-)
| > 
| > Things should work fine if `int' is 64 bits.  Lots of things will break
| > if it's 16 bits, so that's not really worth discussing.  However,
| > the notion of arbitrary width identifiers does cause problems with 
| > KERN_GETPROP, KERN_SETPROP, and KERN_DELPROP, where you have two 
| > separate 64-bit values that need to be encoded in the MIBs.  Determining
| > the beginning of the second value becomes an interesting problem.
|
| perhaps the right answer here is (assuming the desire is to keep
| sysctl's 'mib' a set of integers) to redefine them to be uint32_t's or
| int32_t's...
|
| I think that'd solve both problems, w/ no compatibility issues for
| current platforms.

That's a good idea, but outside the scope of this.

|
| Actually, i note that your kernel code does:
|
|         /* Get database ID.  I wish sysctl used 64-bit ints. */
|         val = (((u_int64_t)name[2])<<32)|name[3];
|
| that has endianness issues, does it not, unless you specify that the
| 64-bit int is to be passed in high bits in the first part of the mib
| value.  I.e., you _cannot_ portably use the memcpy approach i
| mentioned above.  What is the correct way of doing this?  (it needs to
| be well-defined.)

I prefer to have explicit conversions.  They cause less problems in the 
long run than passing blocks of memory.  I'll change the wording to define 
it as explicitly big-endian.  I suppose providing a set of macros for this 
operation would be a good idea.

| > | actually, that makes me wonder:
| > |
| > | in-kernel function to come up with the list of property DBs, for
| > | orthogonality?  or just punt (probably make it static) since it's
| > | probably only going to be used inside the module?
| > 
| > I suppose we could have an in-kernel function to look up databases by
| > name if it's really necesary, but I don't think that will be the case.
| > Since each database should be for a specific information domain, if you
| > can't figure out how to get to the database, you probably shouldn't have
| > access to the information.
|
| Hmm.  Not sure i Truly Believe.  to my mind the question goes
| something like:
|
| is it better to have things like this accessed via global variables,
| or via function calls (that take some kind of key, in this case name)?
|
| there's something to be said for either way.
|
| It's certainly something that can be put off to the future.

Eduardo