Subject: Re: new sysctl(KERN_PROC, ...) interface (was: sysinfo(2))
To: Simon Burge <simonb@netbsd.org>
From: Bill Studenmund <wrstuden@zembu.com>
List: tech-kern
Date: 04/16/2000 18:08:47

On Sun, 16 Apr 2000, Simon Burge wrote:

> Recently there was a little talk about limiting the rate of change of
> the size of struct kinfo_proc, primarily motivated by ps(1) complaining
> about 'proc size mismatches' whenever some kernel structures changed
> size.

Eww... I've seen the thread as of today, and I don't like where we're
going with it. :-) The only time we run into these problems is in
-current, which we say will break ps on occasion! :-)

> Recently there was a little talk about limiting the rate of change
> and providing binary compatibility with sysctl(KERN_PROC, ...) (and
> kvm_getprocs()).
> 
> My current thinking is to handle this within sysctl() (and
> kvm_getproc*()), by adding a new mib based on KERN_PROC with two
> extra entries:
> 
> 	mib[0] = CTL_KERN;
> 	mib[1] = KERN_PROC2;
> 	mib[2] = op;
> 	mib[3] = arg;
> 	mib[4] = elem_size;
> 	mib[5] = elem_count;
> 
> where op and arg are as they currently are.  elem_size is the size of
> each requested process info structure and elem_count is the number of
> structures requested.  sysctl()'s oldlen would still be the overall
> size (usually elem_size * elem_count).
> 
> Given that the current interface is useless each time something in
> struct proc (and some other structures) changes, I was originally
> thinking of renaming the current mib number to KERN_OPROC and have it
> immediately fail.  However the current interface (in kvm_getprocs())
> grabs a complete struct proc from a crash dump if not running on a live
> system which is mighty useful for debugging.  I'm sure there'll also be
> a couple of third party programs around that like kernel diving (lsof
> and skill come to mind).
> 
> The new KERN_PROC2 handler would stuff a relatively fixed size struct
> kinfo_proc2, with any new elements only ever being added to the end of
> the structure.  kvm_getproc2() would also grovel a memory image and fill
> it in so the programs using the new interface will still work on crash
> dumps.
> 
> struct kinfo_proc2 would be a single level structure so there'd be no
> references to any kernel structures that may change size later on.  I'd
> not expect types like pid_t, gid_t, sigset_t and segsz_t to change size
> too often - if they do then we'd add an o<type>_t or similar (perhaps
> local to <sys/sysctl.h>) and fill in as much of the old type as we
> could, and a new element of the new type at the end of kinfo_proc2.
> 
> For each process requested, the handler would memcpy (uiomove or
> whatever) only the first elem_size bytes of each struct kinfo_proc2 to
> the user buffer.  Thus we should be able to have an old ps(1) work on a
> new kernel without complaining about proc size mismatches.
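
As I read it, the per-process copy described above would amount to
something like the sketch below. This is only an illustration in plain
userland C: struct kinfo_proc2_sketch, its fields, and
copy_procs_truncated() are made-up names, not the proposed kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kinfo_proc2: flat and append-only. */
struct kinfo_proc2_sketch {
	int	p_pid;
	int	p_uid;
	int	p_stat;
	long	p_newfield;	/* pretend a newer kernel appended this */
};

/*
 * Copy at most elem_count records into dst, giving each record a slot of
 * elem_size bytes and filling it with only the leading bytes of the
 * kernel's structure.  If the caller's elem_size is larger than the
 * kernel's structure, the tail of the slot is left zeroed.
 * Returns the number of records written.
 */
static size_t
copy_procs_truncated(void *dst, size_t dstlen,
    const struct kinfo_proc2_sketch *src, size_t nsrc,
    size_t elem_size, size_t elem_count)
{
	size_t n = nsrc < elem_count ? nsrc : elem_count;
	size_t copylen = elem_size < sizeof(*src) ? elem_size : sizeof(*src);
	char *out = dst;
	size_t i;

	assert(dst != NULL && src != NULL);
	if (elem_size == 0)
		return 0;
	if (n > dstlen / elem_size)	/* never overrun the user buffer */
		n = dstlen / elem_size;

	for (i = 0; i < n; i++) {
		memset(out + i * elem_size, 0, elem_size);
		memcpy(out + i * elem_size, &src[i], copylen);
	}
	return n;
}
```

An old ps(1) built against a three-int structure would pass its own
sizeof as elem_size and simply never see p_newfield.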

Ok, where is the handler? If it's in the kernel, then I have objections to
parts of this idea. :-)

> Any basic flaws in this line of reasoning so far?  Aidan - I know this
> is not exactly what you had in mind; how much different is it?

I guess my main objection is to the general tone of this idea and also
things which were brought up later in the thread. The main thrust here
is you're trying to make life easy for ps(1) when what ps(1) is trying
to do is hard - it's trying to figure out the process list of a kernel
newer than the one it was compiled for. The thing I really don't like is the
idea of doing something with sessions to get at the complete process
list. That's trying to make a MIB-based interface work at something it's
not good at.

There have been lots of discussions, and I really think the direct kernel
groveling approach is the best. Among other things, it puts the onus of
trying to make a list of rapidly changing things on the userland tool
asked to do it.

Another problem with sysctl in general is that it is very compile-time
dependent (e.g. MIB entry names are turned into static numbers at compile
time), i.e. it's a fairly stodgy interface which isn't adept at dealing
with kernel/userland drift. :-)

Ok, so I really don't like shoving the process list through a MIB, and I
think sysctl isn't good for cases where the kernel has drifted relative to
the userland. So can I do anything other than say I don't like it? :-)

I hope so. How about this for an idea:

How about we shove a description of the struct proc contents into the MIB?
While things change in struct proc, I think things like the fact that
there is a user id, a tty, and the big facts about a process don't. Maybe
they get added to, but that's most of it. _Where_ they are does change,
but the fact that they are there doesn't. So just use the MIB to tell ps
where to find things in the memory it reads when kvm groveling. ??

:-)

So the idea would be that ps would:

1) read MIB entries for the things in struct proc it knew about and cared
about when it was compiled, and figure out how big the current struct proc
entries are.

2) kvm grovel a bunch of them, and use the info from step 1) to map the
things it groveled into structures it understands. Then it does the
normal ps stuff, and repeats.
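
To make the two steps a bit more concrete, here is a rough sketch of
what the userland side might look like. Everything in it is assumed for
illustration: struct field_desc stands in for whatever the MIB would
actually report per field, struct ps_proc is the subset ps knew about at
compile time, and map_proc() is a made-up name for a routine that could
live in libkvm.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-field description, as read from the MIB in step 1. */
struct field_desc {
	const char *name;	/* e.g. "p_pid" */
	size_t offset;		/* byte offset in the running kernel's struct proc */
	size_t size;		/* size of the field in that kernel */
};

/* The fields ps understood when it was compiled (hypothetical subset). */
struct ps_proc {
	int p_pid;
	int p_stat;
	int p_flag;
};

/* Linear search of the descriptor table by field name. */
static const struct field_desc *
find_field(const struct field_desc *d, size_t nd, const char *name)
{
	size_t i;

	for (i = 0; i < nd; i++)
		if (strcmp(d[i].name, name) == 0)
			return &d[i];
	return NULL;
}

/*
 * Copy one named field out of the groveled bytes.  Fails with -1 if the
 * kernel does not describe the field, if its size differs from what ps
 * expects, or if the described range falls outside the bytes we read.
 */
static int
extract_field(const void *raw, size_t rawlen, const struct field_desc *d,
    size_t nd, const char *name, void *dst, size_t dstsize)
{
	const struct field_desc *f = find_field(d, nd, name);

	if (f == NULL || f->size != dstsize)
		return -1;
	if (f->offset > rawlen || f->size > rawlen - f->offset)
		return -1;
	memcpy(dst, (const char *)raw + f->offset, dstsize);
	return 0;
}

/* Step 2: map one groveled struct proc image into ps's own structure. */
static int
map_proc(const void *raw, size_t rawlen, const struct field_desc *d,
    size_t nd, struct ps_proc *out)
{
	assert(raw != NULL && d != NULL && out != NULL);
	if (extract_field(raw, rawlen, d, nd, "p_pid",
	    &out->p_pid, sizeof(out->p_pid)) != 0 ||
	    extract_field(raw, rawlen, d, nd, "p_stat",
	    &out->p_stat, sizeof(out->p_stat)) != 0 ||
	    extract_field(raw, rawlen, d, nd, "p_flag",
	    &out->p_flag, sizeof(out->p_flag)) != 0)
		return -1;
	return 0;
}
```

The point of the sketch is that fields can move around or gain
neighbours in the kernel's struct proc and ps keeps working, as long as
the names it asks for are still described. A real version would also
have to decide what to do when a field's size changes, rather than just
failing as this one does.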


I like this idea much better because it puts the onus of dealing with
drift on ps and the other userland utilities. I.e. rather than teaching
the kernel how to deal with older versions of the interface, we teach the
programs to deal with newer. :-) Routines to do this could even be added
to libkvm. :-)

Plus, it freezes into the MIB the part which will most likely NOT change -
what the fields are in struct proc (like p_flag, p_stat, p_pid, ...). Even
with what you proposed above you had concerns about obsolescence. :-)

Thoughts?

Take care,

Bill