Subject: Re: nore on disk stats
To: None <dennis@Ipsilon.COM>
From: Charles Hannum <Charles-Hannum@deshaw.com>
List: tech-kern
Date: 11/16/1995 14:51:20
   I sympathize with the objective, but cringe a bit at the thought of
   the implementation.  Some of my biases come from trying to build routers
   around this code, so it isn't trying to be SNMP-centric which bothers
   me.  But SNMP does have its limitations, which aren't important when
   you are using it to turn a box on an NMS red when something breaks but
   which seem to me to be poison when used as a kernel interface, for
   example:

   (1) the lack of atomicity when reading a chunk of data which is larger
       than the amount a single SNMP query can accommodate,

   (2) the total lack of atomicity when reading a table,

We can't guarantee atomicity for large transfers anyway.  Note:

1) This would imply that the table is locked for the duration of a
read.  This is impractical for things like network and other I/O
transaction tables, for performance reasons.  It's not just
impractical, but completely unacceptable, for things like process
tables, because if something goes wrong you'd never be able to
diagnose it.

2) Neither the current tools nor any competing proposal guarantees
atomicity.

It's long been known and accepted that ps(1), netstat(1), and similar
tools only get a snapshot of the current state, which may be
inconsistent, and are not guaranteed to be reliable.  They are still
useful, though.

   (3) the need to support getnext operations for all tables (sometimes this
       is easy, but other times it is unnecessarily hard),

Almost all tables in the kernel are either flat, linked lists, or
trees.  For all of these, implementing the `getnext' operation is
fairly trivial.
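
To make that concrete, here is a minimal sketch of `getnext' over a
linked list kept sorted by key.  The names (struct entry, table_getnext,
table_walk) are hypothetical and not taken from any kernel source; a real
table would key on whatever the MIB uses as its index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: a singly linked list kept sorted by key. */
struct entry {
	unsigned int key;
	long value;
	struct entry *next;
};

/*
 * getnext: return the first entry whose key is strictly greater than
 * `after', or NULL at the end of the table.  Pass have_after == 0 to
 * get the first entry.
 */
static const struct entry *
table_getnext(const struct entry *head, int have_after, unsigned int after)
{
	const struct entry *e;

	for (e = head; e != NULL; e = e->next) {
		if (!have_after || e->key > after)
			return e;
	}
	return NULL;
}

/* Walk the whole table using only getnext; returns the entry count. */
static size_t
table_walk(const struct entry *head, long *sum)
{
	const struct entry *e;
	size_t n = 0;

	assert(sum != NULL);
	*sum = 0;
	for (e = table_getnext(head, 0, 0); e != NULL;
	    e = table_getnext(head, 1, e->key)) {
		*sum += e->value;
		n++;
	}
	return n;
}
```

The caller carries a key between calls rather than a pointer, so the walk
still makes progress if an entry is removed between two calls.  Each step
here rescans from the head, which is the simple version; a tree or a
cached position would avoid that cost.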

   (4) the fact that you are going to have difficulty doing things which the
       corresponding SNMP MIB didn't consider you might want to do.

No; you just have to extend the MIB.

   I can think of some examples of all of these.  For netstat(1), or some
   other interested piece of software, to read the kernel routing table
   now requires about 3 system calls: a sysctl(2) to find out the size of
   the thing, a call to sbrk()/mmap() to acquire the (possibly very large
   chunk of) memory, and another call to sysctl(2) to fetch an atomic snapshot
   of the table.

First of all, this is an oversimplification.  There are actually a few
system calls done per route, as you can see just by looking at the
function p_rtentry() or ktrace output.

Secondly, even if the above weren't the case, the snapshot is not
atomic.  There is no lock on the tables, and no lock on the memory
they're being copied to, and network interrupts are not blocked while
the table is copied.  As I said above, this would be undesirable for
performance reasons.
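
A minimal sketch of why the size-then-fetch pattern cannot be atomic
follows.  table_fetch() is a hypothetical stand-in that only imitates the
sysctl(2) convention (a NULL buffer returns the needed size; a buffer that
is too small fails with ENOMEM); the table and its one-time growth are
simulated so the race can be seen without a kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Simulated table that grows once, right after its size is queried. */
static int sim_table[64];
static size_t sim_count = 4;
static int sim_grow_once = 1;

/* Hypothetical fetch call imitating the sysctl(2) size/fetch convention. */
static int
table_fetch(void *buf, size_t *lenp)
{
	size_t need = sim_count * sizeof(sim_table[0]);

	assert(lenp != NULL);
	if (buf == NULL) {
		*lenp = need;
		if (sim_grow_once) {	/* table changes between the calls */
			sim_grow_once = 0;
			sim_count += 2;
		}
		return 0;
	}
	if (*lenp < need) {
		errno = ENOMEM;
		return -1;
	}
	memcpy(buf, sim_table, need);
	*lenp = need;
	return 0;
}

/* Ask for the size, allocate, fetch, and retry if the table grew. */
static void *
table_snapshot(size_t *lenp)
{
	for (;;) {
		size_t len;
		void *buf;
		int saved;

		if (table_fetch(NULL, &len) == -1)
			return NULL;
		buf = malloc(len ? len : 1);
		if (buf == NULL)
			return NULL;
		if (table_fetch(buf, &len) == 0) {
			*lenp = len;
			return buf;
		}
		saved = errno;		/* free() may change errno */
		free(buf);
		if (saved != ENOMEM)
			return NULL;
	}
}
```

Nothing holds the table still between the two calls, so the caller can
only retry; and even a successful copy reflects whatever the table looked
like while it was being copied.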

   I don't mind much that my SNMP query tools are
   constrained by this, but it is unacceptable that the kernel interface be
   similarly constrained.

You're making the same mistake someone else did: substituting
`straightjacket us' or `constrain' for my word `conform'.  I never
said we should implement only the existing SNMP MIB; that would be
foolish.

   You do need to support
   getnext operations to read tables.

See above.

   In any case, while making operations where it makes sense more SNMP-like
   would be fine, particularly where SNMP is a frequent consumer, I think
   there are a lot of cases where SNMP just gets in the way.  I'd rather
   have the flexibility to do what is right, given a knowledge of how the
   most frequent non-SNMP consumers of the data use it if they are important,
   rather than to be limited to SNMP's one-size-fits-all constraints, standard
   or not.

I don't see how this is anything but an adverse reaction to my use of
the term `SNMP'.  I'm using SNMP as a transport mechanism for
information, and as such, it does not restrict the information we can
make available.  Implementing dozens of ad-hoc solutions (which is
what we currently do) is clearly a much worse scenario.