Subject: Re: systat(:swap) misbehaving
To: None <thorpej@nas.nasa.gov>
From: Paul Boven <e.p.boven@student.utwente.nl>
List: port-sparc
Date: 01/23/1997 01:41:56
Hi everyone,

> > While compiling something large I was using systat to monitor the use
> > of virtual memory by looking at the :swap display. Alas, in this mode
> > systat never really lives very long, exiting with a SIGSEGV, for instance.
> > Usually this is preceded by a "cannot read swapmap: kvm_read: Bad address"
> > generated in fetchswap(). "Uptime" for systat is rarely more than 10
> > minutes while compiling; it seems to last longer when swap isn't being
> > used so heavily. I compiled systat (current sources and includes, current 
> > kernel) with -g, and here is the output from gdb:

(Jason Thorpe)
 
> ...what's probably happening is that the kernel state is changing at
> such a rapid pace that the userland program is reading info which
> is immediately out-of-date... when it goes to get the next piece, it
> may not be there.

fetchswap()    /* from .../usr.bin/systat/swap.c */
{
        int s, e, i;
        s = nswapmap * sizeof(*mp);
        if (kvm_read(kd, (long)kswapmap, mp, s) != s)
                error("cannot read swapmap: %s", kvm_geterr(kd));
/** It either goes wrong here, because kvm_read returned -1 or did not
    return the amount of data we asked for, **/

        swapmap = (struct map *)mp;
        if (nswapmap != swapmap->m_limit - (struct mapent *)kswapmap)
                error("panic: swap: nswapmap goof");
        nfree = 0;
        bzero(perdev, nswdev * sizeof(*perdev));
        for (mp++; mp->m_addr != 0; mp++) {
/** or it goes wrong here, because apparently what mp points to is bogus and
    does not contain a terminating zero entry, so the for-loop SIGSEGVs **/
                s = mp->m_addr;                 /* start of swap region */
/** etc. **/
 
> Programs that grovel kvm state like this could probably use some extra
> sanity checking in them.

In my humble opinion the groveling is only done on the buffered copy
of the data structure that mp points to, so could it be that kvm_read
itself is not "atomic" enough, so that its data gets changed while it
is being returned?

But it would be nice to have the program not go on chomping on the mp
struct if the kvm_read failed and the error message was printed. I'm going
to try to sneak an else statement into the kvm_read part. It would of course
be even nicer if kvm_read simply returned what we asked for.
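
As a minimal sketch of that kind of guard (not the actual systat code):
the reader callback type, fetch_snapshot() and the two toy readers below
are hypothetical names standing in for kvm_read() and its caller. The only
point is that the buffer is not parsed unless the read returned exactly
the number of bytes requested.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for kvm_read(): fills buf with up to len bytes
 * and returns the byte count, or -1 on failure.
 */
typedef long (*snapshot_reader)(void *ctx, void *buf, size_t len);

/*
 * Take one snapshot into buf. Returns 1 if buf holds a complete copy,
 * 0 if the read failed or came up short. On 0 the caller should report
 * the error and skip this refresh instead of parsing buf.
 */
static int
fetch_snapshot(snapshot_reader rd, void *ctx, void *buf, size_t len)
{
        long n = rd(ctx, buf, len);

        if (n < 0 || (size_t)n != len)
                return 0;
        return 1;
}

/* Toy reader that always succeeds (illustration only). */
static long
reader_ok(void *ctx, void *buf, size_t len)
{
        (void)ctx;
        memset(buf, 0, len);
        return (long)len;
}

/* Toy reader that always fails, like a "Bad address" read would. */
static long
reader_fail(void *ctx, void *buf, size_t len)
{
        (void)ctx;
        (void)buf;
        (void)len;
        return -1;
}
```

In fetchswap() terms, a 0 return would mean printing the error and
returning before swapmap = (struct map *)mp is ever reached.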

I have put comments at the two points of failure, as far as I understand
things. I see either the error message, usually followed by a SIGSEGV,
or only the SIGSEGV; in each case the SIGSEGV is the for-loop running wild.
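
One way to keep the loop from running off the end of the buffer is to
bound it by the number of entries that were read, rather than trusting a
zero m_addr to turn up. The following is only a sketch under assumptions:
struct mapent_sketch is a simplified, hypothetical stand-in for the
kernel's struct mapent, and summing m_size is just an example of per-entry
work, not what swap.c necessarily does.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct mapent. */
struct mapent_sketch {
        long m_size;    /* size of this free region */
        long m_addr;    /* start of region; 0 marks the end of the list */
};

/*
 * Walk a copied map of nent entries, skipping entry 0 (the header slot,
 * as the mp++ in the original loop does). Returns the summed sizes, or
 * -1 if no terminating entry appears within the buffer, which suggests
 * the snapshot is inconsistent and should be discarded.
 */
static long
count_free(const struct mapent_sketch *map, size_t nent)
{
        long nfree = 0;
        size_t i;

        for (i = 1; i < nent; i++) {
                if (map[i].m_addr == 0)
                        return nfree;   /* clean end of list */
                nfree += map[i].m_size;
        }
        return -1;      /* hit the end of the buffer without a terminator */
}
```

With this shape, a bogus snapshot costs one skipped refresh instead of a
SIGSEGV.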

Happy hacking, Paul.
----------------------------------------------------------------------
Paul Boven, <e.p.boven@student.utwente.nl>  PE1NUT  QRV 145.575 JO32KF
  Nothing would get done in the world, if we didn't have insomniacs.
           Or at least, nothing would get done at night. 
----------------------------------------------------------------------