Subject: Re: RFC: /kern/summary
To: None <perry@piermont.com>
From: Brian C. Grayson <bgrayson@marvin.ece.utexas.edu>
List: tech-kern
Date: 03/10/1999 00:22:37
On Tue, Mar 09, 1999 at 07:34:59PM -0500, Perry E. Metzger wrote:
> 
> "Brian C. Grayson" <bgrayson@orac.ece.utexas.edu> writes:
> >   I got no negative responses to my query regarding /kern files,
> > so I plan on adding a new file to /kern.
> 
> My main comment: right now, for whatever reason, local culture frowns
> on kernfs as a required mounted file system.

  It wouldn't be required.  My hope is that applications using
/kern/summary (or /kern/stats or whatever name it ends up
getting!) would require fewer system calls, and run faster,
than applications that require multiple kvm_ ops, and thus
multiple context switches.  If I can achieve this, then these
stats programs will have less of an impact on the system, thus
improving the accuracy of the stats, and perhaps making it
One Teensy Bit More Worthwhile to mount /kern and /proc.  And it
could avoid that whole dratted "proc size mismatch."
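
  As a rough sketch of what I mean: a stats program would fetch
everything with one read().  The name /kern/summary is still only
proposed, and read_snapshot() is a made-up helper for illustration,
not an existing API.  Whether that one read() really returns a
consistent snapshot depends on how kernfs ends up filling the
buffer, so treat that part as an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fetch a whole stats file with a single read().
 * Returns the number of bytes read (buf is NUL-terminated), or -1 on
 * error.  The caller picks a buffer large enough for the whole file.
 */
ssize_t
read_snapshot(const char *path, char *buf, size_t bufsize)
{
	int fd;
	ssize_t n;

	assert(path != NULL && buf != NULL);
	if (bufsize == 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;

	/* One syscall for the data, instead of several kvm_ reads. */
	n = read(fd, buf, bufsize - 1);
	close(fd);

	if (n >= 0)
		buf[n] = '\0';
	return n;
}
```

  A caller would do something like
read_snapshot("/kern/summary", buf, sizeof(buf)) and then parse buf,
with no setgid-kmem needed if the file is world-readable.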

  But regardless, there is enough inertia that I don't think we
will ever be able to completely remove _any_ of the setgid-kmem kernel
groveling programs, for better or worse.

  I'm just proposing some cleaner, non-setgid methods, for
those who would like to make better, more efficient use of the
power of /kern and /proc (and maybe tighten security a little
bit by allowing the disabling of setgid-kmem on ps and top and
xosview without losing their functionality).  I may fail in
this attempt, but I think it's worth a try!

> > 	This
> > 	allows one to do a read() of this amount, and hopefully
> > 	grab an atomic snapshot in a single syscall, rather
> > 	than reading the first 1024 bytes of the file, then
> > 	later reading the second 1024 bytes, leading to bogus
> > 	values at the buffer boundary due to the change in stats
> > 	between the two calls.
> 
> Ew! This totally violates the notion of least surprise. If you open
> the file, your instance of it should never change out from under you
> until you do something to allow that to happen like closing it.

  Maybe.  But the way I think of it is, the files in /proc and
/kern are just like a file that is mmap'd (shared, not private)
by two processes.  If one updates the file, the other one sees
it right away, without closing and reopening (or munmap'ing and
mmap'ing).  These are volatile critters, like accessing a
buffer that is currently being DMA'd, or reading the tick
counter on the timer chip, or peeking at the bits that are on
an Ethernet wire at a particular moment, or looking at shared
memory, or ....
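
  To make the mmap analogy concrete, here is a small sketch of
the shared-mapping behavior I have in mind.  map_shared() is a
hypothetical helper, and it only demonstrates ordinary MAP_SHARED
semantics on a regular file; it says nothing about how kernfs or
procfs are implemented.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map the first len bytes of a file read/write
 * and shared.  Returns the mapping, or NULL on error.  The file must
 * already be at least len bytes long.
 */
char *
map_shared(const char *path, size_t len)
{
	int fd;
	void *p;

	assert(path != NULL && len > 0);
	fd = open(path, O_RDWR);
	if (fd == -1)
		return NULL;

	/* MAP_SHARED: stores through one mapping show up in the others. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping outlives the descriptor */

	return (p == MAP_FAILED) ? NULL : (char *)p;
}
```

  If two parties each call map_shared() on the same file, a store
through one mapping is seen through the other without any close,
reopen, or remap.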

  The current procfs is ``broken,'' according to your desire above:
currently, if you pread /proc/<n>/status, sleep 5 seconds, and
then pread it again (at offset 0), the values may have
changed.  To me this is a feature, not a bug!  Information
updates take a single pread syscall, instead of a close, open, read
triple.
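
  In code, the refresh I'm describing looks roughly like the
sketch below.  refresh_from_start() is a name I made up for
illustration; the only real call in it is pread().

```c
#define _XOPEN_SOURCE 700	/* for pread() on strict-POSIX toolchains */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Hypothetical helper: re-read a file from offset 0 on a descriptor
 * that is already open.  Returns the number of bytes read (buf is
 * NUL-terminated), or -1 on error.  The file offset of fd is not
 * changed, because pread() takes its offset as an argument.
 */
ssize_t
refresh_from_start(int fd, char *buf, size_t bufsize)
{
	ssize_t n;

	assert(buf != NULL && bufsize > 0);

	/* One syscall per update; no close/open/read triple. */
	n = pread(fd, buf, bufsize - 1, 0);
	if (n >= 0)
		buf[n] = '\0';
	return n;
}
```

  A monitor would open /proc/<n>/status once, then call
refresh_from_start() on that descriptor every few seconds and
parse whatever comes back.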

> As I note, a lot of people seem to dislike this idea. You should
> discuss it more widely before moving forward. Not everyone reads
> tech-kern every day.

  Thanks for the comments.  I guess I'll hold off for a week or
more before doing much more work on it, to collect more feedback.

  Brian