Subject: Re: new sysctl(KERN_PROC, ...) interface (was: sysinfo(2))
To: Simon Burge <simonb@NetBSD.ORG>
From: Bill Studenmund <wrstuden@zembu.com>
List: tech-kern
Date: 04/17/2000 17:58:40

On Mon, 17 Apr 2000, Simon Burge wrote:

> Bill Studenmund wrote:
> 
> > On Sun, 16 Apr 2000, Simon Burge wrote:
> > 
> 
> Now that we do many ps's on startup and shutdown with rc.d, having a
> working ps on -current, no matter how up-to-date userland is, is a good
> thing.  Also, in theory it should be possible to use a 1.5 ps on a 1.6
> kernel (untested of course :-) and so on...

True.. rc.d makes it more important to have them working than in the past.

Though what's up with the .pid files, as I think greywolf asked about? Why
even bother with ps if we don't have to?

> I guess I'm trying to make it not hard for ps to do that, since it's so
> bloody annoying when it doesn't work :-)  Seriously, I see no reason for
> the kernel not to help userland know what's happening, even if
> userland is too old...

I guess part of my objection was the thought that over time different
versions of this proc substitute will come into use, and that the kernel
will need to support all the older versions of this.

Hmmm... But if you made sure to add things at the end, then it mightn't be
so bad...
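
To make the "add things at the end" idea concrete, here is a rough sketch.
The struct names (procinfo_v1, procinfo_v2) and the procinfo_copyout()
helper are made up for illustration and are not the actual kinfo_proc2
code; the only assumption is that the caller tells the kernel how many
bytes of each record it was compiled to expect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout an older ps was compiled against. */
struct procinfo_v1 {
	int pid;
	int ppid;
};

/* Hypothetical newer layout: the new field is added strictly at the end. */
struct procinfo_v2 {
	int pid;
	int ppid;
	int nice;
};

/*
 * Copy the kernel's current record into a caller buffer of `want` bytes.
 * An old caller passes a smaller size and gets a valid prefix; a caller
 * newer than the kernel gets the unknown tail zero-filled.
 * Returns the number of bytes of real data copied.
 */
static size_t
procinfo_copyout(const struct procinfo_v2 *cur, void *buf, size_t want)
{
	size_t n = want < sizeof(*cur) ? want : sizeof(*cur);

	assert(cur != NULL && buf != NULL);
	memcpy(buf, cur, n);
	if (want > n)
		memset((char *)buf + n, 0, want - n);
	return n;
}
```

With this shape the kernel only ever knows about the newest layout; an old
binary asking for sizeof(struct procinfo_v1) bytes still sees pid and ppid
where it expects them, as long as nothing is ever inserted or reordered.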

> > The thing I really don't like is the
> > idea of doing something with sessions to get at the complete process
> > list. That's trying to make a MIB-based interface work at something it's
> > not good at.
> 
> This is _not_ something I was planning to handle, and isn't something
> that my current implementation does.  I'm with you on this one :-)

Woo hoo! :-)

> > Another problem with sysctl in general is that it is very compile-time
> > dependent (like how mib entry text is turned into static numbers
> > then..) i.e. it's a fairly stodgy interface which isn't adept at dealing
> > with kernel/userland drift. :-)
> 
> I'm not sure I buy this - the MIB numbers should not change over time.
> This would absolutely kill binary compatibility because a number of libc
> functions use sysctl to do their dirty work.

Right. We're locked into place with the MIB numbers. :-) Though we come at
it in opposite directions, we agree. :-)

[idea snipped]

> If I understand what I think you mean, then a lot of extra info would be
> needed in the kernel, like "the field named ``p_pid'' is N bytes from
> the start of struct proc" so that it could be returned to userland.
> We'd also need a way to reference the proc field names.  At the moment,
> there are 60 fields that are carried over to struct kinfo_proc2 and ps(1)
> uses most of them (look at src/bin/ps/keyword.c)...

Yep.
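
For the record, the sort of table being talked about might look roughly
like the sketch below. Everything here is hypothetical: struct fake_proc
stands in for struct proc, and field_desc, field_lookup() and field_read()
are invented names, not anything in the tree. The point is just that the
kernel would export (name, offset, size) triples and userland would look
fields up by name instead of compiling in a layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct proc. */
struct fake_proc {
	int  p_pid;
	int  p_nice;
	long p_start;
};

/* One exported description of a field: its name, offset, and size. */
struct field_desc {
	const char *name;
	size_t      offset;
	size_t      size;
};

#define FIELD(type, member) \
	{ #member, offsetof(type, member), sizeof(((type *)0)->member) }

/* The table the kernel would hand to userland. */
static const struct field_desc proc_fields[] = {
	FIELD(struct fake_proc, p_pid),
	FIELD(struct fake_proc, p_nice),
	FIELD(struct fake_proc, p_start),
};

/* Find a field description by name, or NULL if the kernel lacks it. */
static const struct field_desc *
field_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(proc_fields) / sizeof(proc_fields[0]); i++)
		if (strcmp(proc_fields[i].name, name) == 0)
			return &proc_fields[i];
	return NULL;
}

/*
 * Copy the named field out of a raw record. Returns 0 on success, or -1
 * if the field is unknown or its size differs from what the caller expects.
 */
static int
field_read(const void *rec, const char *name, void *out, size_t outlen)
{
	const struct field_desc *f = field_lookup(name);

	if (f == NULL || f->size != outlen)
		return -1;
	memcpy(out, (const char *)rec + f->offset, f->size);
	return 0;
}
```

A ps built this way would call something like field_read(rec, "p_pid", ...)
and cope with a -1 for fields the running kernel doesn't have, which is
where the extra work for libkvm comes from.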

> > I like this idea much better because it puts the onus of dealing with
> > drift on ps and the other userland utilities. I.e. rather than teaching
> > the kernel how to deal with older versions of the interface, we teach the
> > programs to deal with newer. :-) Routines to do this could even be added
> > to libkvm. :-)
> 
> As I said above, I'd like the kernel to help out.  Certainly if things
> were done as you suggested, libkvm would have to deal with the mess;
> otherwise maintenance of all the userland users of the functionality
> would be nightmarish.

True.

But since we want a backwards compatibility system, as the structure
changes, we're going to need to support more and more versions of the same
thing. Where should we store that? In a userland library, or in the
kernel?

> > Thoughts?
> 
> My initial thought is that what you're proposing is a lot of work!  I'm
> merely revamping an existing interface, while you're proposing an
> entirely new one.  Given that I don't think I understand exactly what
> you're suggesting (look at my paragraph on finding field offsets), I'm
> not all for it at the moment ;)

I think you exactly understand what I'm describing. :-)

> Hmm, one thing occurs to me - getting a process's argv is very much
> attached to the current vm machinery.  What would be nice here (and
> the opposite of what you probably are thinking!) is a nice little
> kernel interface to return the address in either physmem or swap of a
> given process' va.  Then kvm_proc.c:kvm_uread() could be as simple as
> (paraphrasing sysctl):
> 	
> 	sysctl(CTL_KERN, KERN_PROC_VA, pid, &type, &offset);
> 	pread(type == swap ? swapfd : memfd, buf, nbpg, offset);

Actually, I'd agree with the above. It's small, unlikely to change, and
the kernel is much more the canonical arbiter of where a process's argv is
than anything else. :-)

Now that I understand what you're trying to do and not do, I don't think I
mind it soooo much.

Take care,

Bill