Subject: kernel debugging with /proc & /kern (was: word processor that runs on NetBSD/i386? (FAQ?))
To: None <port-i386@NetBSD.ORG>
From: Greg A. Woods <woods@most.weird.com>
List: tech-kern
Date: 06/24/1998 02:19:19
[[ moving to tech-kern? ]]

[ On Mon, June 22, 1998 at 20:18:27 (-0700), Jason Thorpe wrote: ]
> Subject: Re: word processor that runs on NetBSD/i386? (FAQ?) 
>
> So... this seems to come up often enough that it should be a FAQ,
> but here it goes:
> 
> 	1.  You have to keep the kmem parts of the programs around so you
> 	    can debug crash dumps.  This is irrelevant for Linux, since they
> 	    don't even HAVE crash dumps.

That's bogus.  There are ample tools for analyzing crash dumps without
needing to use netstat/ps/pstat/vmstat et al.  Those latter tools might
be useful for a non-kernel hacker to use for such analysis, but in my
experience they are not ideal for kernel hackers and only get used for
lack of truly suitable tools.  They often abstract their data too far
from the actual kernel data structures, and they usually omit too many
critical elements of those structures to be of any advanced use.

As has already been mentioned it should also be possible to build the
/kern and /proc filesystems in some form in userland and attach them to
core dumps.

At one time I would have liked a tool like AT&T SysV's crash(8), though
as I've learned more about the abilities of gdb I've realized that much
(all?) of what crash(8) does can be done with gdb macros, and if/when
guile or some other scripting language gets embedded then such things
will only become easier.  Working directly with data structures and
pointers, with only a narrow abstraction and with the ability to walk
down tables and lists, is much more appropriate for kernel debugging.
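
As a rough illustration of what "walking down tables and lists" in a
dump amounts to, here is a minimal sketch in Python rather than in gdb's
macro language.  Everything in it is an assumption made for the example:
the record layout (a 4-byte "next" offset, a 4-byte pid, a 16-byte name)
and the names REC, walk_list and build_demo_image are hypothetical and
do not correspond to any real kernel structure or tool.

```python
import struct

# Hypothetical record layout, NOT a real struct proc:
#   uint32 next      offset of the next record in the image, 0 = end of list
#   uint32 pid
#   char   name[16]  NUL padded
REC = struct.Struct("<II16s")


def walk_list(image, head):
    """Follow 'next' offsets through a raw image, yielding (pid, name)."""
    seen = set()
    addr = head
    while addr != 0:
        if addr in seen:
            # A corrupted dump can contain a cycle; stop instead of spinning.
            raise ValueError("list loops back on itself at offset %d" % addr)
        seen.add(addr)
        nxt, pid, raw = REC.unpack_from(image, addr)
        yield pid, raw.split(b"\0", 1)[0].decode("ascii")
        addr = nxt


def build_demo_image():
    """Build a tiny fake 'dump' holding three records linked out of order."""
    image = bytearray(128)
    REC.pack_into(image, 24, 72, 1, b"init")  # head, points at offset 72
    REC.pack_into(image, 72, 48, 42, b"sh")   # points at offset 48
    REC.pack_into(image, 48, 0, 99, b"gdb")   # end of list
    return bytes(image), 24


if __name__ == "__main__":
    image, head = build_demo_image()
    for pid, name in walk_list(image, head):
        print(pid, name)
```

The walker only needs the layout of one record and the address of the
list head, which is the "narrow abstraction" being argued for: nothing
is hidden between the person debugging and the pointers themselves.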

Anyone who has ever spent much time inside crash(8) will likely never
have used ps(1) et al to examine kernel dumps.  I certainly never did in
over six years of SysV kernel debugging and driver development.  In fact
I was always "surprised" to see '-n' and '-m' or similar options in
these tools.  Sufficiently advanced gdb macros will make crash(8) look
like a primitive child's toy.

Of course something like crash(8) really needs to be re-compiled every
time a kernel is config'd, at least if it were used with the BSD kernel,
and unfortunately the BSD build model doesn't normally allow for this,
but that's something that shouldn't be too hard to fix.  Maintaining
such a tool is yet another problem, of course, but centralizing all this
functionality in just one tool and one or two filesystems (which have
the tremendous advantage of actually running as part of the kernel) is
much better than spreading it through many user-land tools.

If, and only if, libkvm contained all the /dev/kmem grubbing code would
the current state of affairs be anywhere close to being on par with a
well designed and used filesystem interface.  However, a debugger-based
solution to crash analysis is clearly far superior to both ps(1) et al
and crash(8), or indeed to any kmem grubbing tools.

> 	2.  If the data from the /proc/... files is binary, you still have
> 	    structure version skew problem that kmem has.

Well, some is (/proc/*/mem clearly is), but most need not be
(e.g. /proc/*/regs could be non-binary).  Obviously some serious design
thought needs to be expended to get this right, but it's clearly not
impossible to do, and instead of having to rebuild all of the kmem
grubbing tools with every minor change, they may only need to be
upgraded with every 1.0->2.0 upgrade, and NetBSD hasn't even had one of
those yet!  ;-)
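
To make the skew point concrete, here is a minimal sketch in Python of a
reader for a line-oriented "name=value" text format.  The format, the
field names, and the functions parse_status and summary are all
assumptions for illustration; no existing /proc file is claimed to look
like this.

```python
def parse_status(text):
    """Parse 'name=value' lines into a dict, skipping lines without '='."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


# Hypothetical output from an older kernel...
OLD = "pid=42\ncomm=sh\nuid=1000\n"
# ...and from a newer one that reordered the fields and added one.
NEW = "comm=sh\npid=42\nuid=1000\nnice=0\n"


def summary(text):
    """Look fields up by name, so order and extra fields do not matter."""
    fields = parse_status(text)
    return "%s %s" % (fields["pid"], fields["comm"])


if __name__ == "__main__":
    print(summary(OLD))
    print(summary(NEW))
```

Because the reader asks for fields by name, the same unmodified tool
handles both versions.  A reader that copied a binary struct out of kmem
would have to be rebuilt for the second layout.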

Doing the /proc and /kern interfaces "right" is where this switches from
being merely a religious argument to being one of great practicality.
(The "everything as a file" view of the world didn't become so popular
just by accident!)

Though it might be argued that user-land skew with /dev/kmem hackers is
less of an issue with a system distributed in source code, I would beg
to differ.  The netbsd-current archives alone will provide ample
evidence of lost time, confusion, and general hassles related to such
problems.

> 	3.  If the data is string format, you have the problem that you're
> 	    forcing the data to be represented in a certain way, and
> 	    you can't easily change it.  (If you want to represent differently,
> 	    then you have to parse strings, which is slow and has the same
> 	    version skew problem that binary data does!)

I don't buy the "slow" argument -- maybe it would be a bit of a hit on a
VAX 11/750 (or even on a 780), but not much on a 16 MHz 68020 or better.
My past experience with implementations of ps(1) that use /proc is that
they are often an order of magnitude or more faster than their kmem
counterparts, and thus even on a VAX they'd be a significant improvement.
(This experience was with the SysVr4 variant of /proc on a 3b2, where the
previous ps(1) didn't even have to grub through swap to find the
command line, since SysV keeps the first 80 characters in struct proc!)

-- 
							Greg A. Woods

+1 416 443-1734      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>