Subject: port-vax/4002: ps(1) causes panic
To: None <gnats-bugs@gnats.netbsd.org>
From: maximum entropy <entropy@vivax.bernstein.com>
List: netbsd-bugs
Date: 08/18/1997 02:16:20
>Number:         4002
>Category:       port-vax
>Synopsis:       Running "ps -auxwww" while doing lots of forks and exits in another session causes panic
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 17 23:20:02 1997
>Last-Modified:
>Originator:     maximum entropy
>Organization:
	
>Release:        NetBSD-current, supped 8/17/97
>Environment:
	
VAXstation 3100
NetBSD 1.2G (-current, NOT the snapshot, though I think the problem also
exists in the 1.2G snapshot GENERIC kernel)
Home-built libkvm and ps, see below
System: NetBSD vivax.bernstein.com 1.2G NetBSD 1.2G (VIVAX) #0: Sat Aug 16 21:05:40 EDT 1997 root@vivax.bernstein.com:/import/tardis/usr/src/sys/arch/vax/compile/VIVAX vax


>Description:
	
Running "ps -auxwww" while doing lots of forks and exits in another session 
causes panic.
Please note that this ps is not the one from the 1.2G snapshot -- that
one doesn't work at all (proc size mismatch).  This ps was built from
-current sources as follows:
cd /usr/src/share/mk
make install
cd /usr/src
make includes
cd /usr/src/lib/libkvm
make cleandir ; make depend ; make ; make install
cd /usr/src/*/kvm_mkdb
make cleandir ; make depend ; make ; make install
cd /usr/src/*/ps
make cleandir ; make depend ; make ; make install

>How-To-Repeat:
	
Any process which forks and exits a lot should do, e.g. a long-running
make (which is how I first noticed the problem) or this minimal test
program:

vivax# cat forker.c
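#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
  int pid, err;
  for (;;) {
    pid = fork();
    if (pid < 0) {
      err = errno;  /* save errno: perror() may change it */
      perror("fork");
      if (err == EAGAIN) continue;
      exit(2);
    }
    if (pid == 0) {
      /* child: print its pid and exit right away */
      printf("%d\n", (int)getpid());
      exit(0);
    }
    /* parent: children are never waited for */
    sleep(1);
  }
}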
vivax# ./forker
237
238
240
241
[
Around this point, "ps -auxwww" is run in another window logged
into the same machine, and ps hangs after printing:
USER       PID %CPU %MEM   VSZ  RSS TT  STAT STARTED       TIME COMMAND
]
242
243
244
245
246
^C
vivaxpanic: pmap_remove: pmap not in pv_table
Stopped at      0x80091903:     *clrl   (r6)
db> trace
db_stack_trace_cmd - addr 80091903, have_addr 0, count ffffffff, modif 801b6a5c
db> cont
syncing disks... done

dumping to dev 1401, offset 230827
dump succeeded
[
System rebooted.  savecore doesn't seem to be working on this
machine; if a dump would be helpful I can look into this.
]
checking for core dump...
savecore: no core dump
checking quotas: done.
[ ... etc ... ]
vivax# gdb -k /netbsd
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.11 (vax-netbsd), Copyright 1993 Free Software Foundation, Inc...
(no debugging symbols found)...
(kgdb) x 0x80091903
0x80091903 <pmap_remove+293>:   0xc7d059d4

I think send-pr is going to strip the symbolic address above since it's
enclosed in angle brackets (please write "I will refrain from using
in-band signalling" 500 times on the blackboard).  In case it does:
0x80091903 pmap_remove+293:   0xc7d059d4
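
The two-session procedure above (one session forking and exiting, the
other running "ps -auxwww") could probably be folded into a single
program.  What follows is only a sketch under assumptions, not something
that was run on the affected machine, and it is not known whether it
triggers the same panic.  The names fork_burst, run_ps and
FORK_STRESS_MAIN are made up for illustration.  Unlike forker.c it
waits for each child and does not sleep between forks.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork up to n children that exit immediately,
 * waiting for each one.  Returns the number of children reaped.
 */
int
fork_burst(int n)
{
	int i, done = 0;

	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			if (errno == EAGAIN)
				continue;	/* no free slot; skip this round */
			perror("fork");
			return done;
		}
		if (pid == 0)
			_exit(0);		/* child: exit at once */
		if (waitpid(pid, NULL, 0) == pid)
			done++;
	}
	return done;
}

/*
 * Hypothetical helper: run "ps -auxwww" once and wait for it.
 * Returns the exit status of ps, 127 if it could not be executed,
 * or -1 if fork/waitpid failed or ps was killed by a signal.
 */
int
run_ps(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execlp("ps", "ps", "-auxwww", (char *)NULL);
		_exit(127);			/* exec failed */
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

#ifdef FORK_STRESS_MAIN
/* Build with -DFORK_STRESS_MAIN to get a standalone program. */
int
main(void)
{
	pid_t churner = fork();

	if (churner < 0) {
		perror("fork");
		return 2;
	}
	if (churner == 0) {
		for (;;)
			fork_burst(100);	/* child: keep forking and exiting */
	}
	for (;;)
		run_ps();			/* parent: run ps meanwhile */
}
#endif
```

Built with "cc -DFORK_STRESS_MAIN", this runs until interrupted with ^C,
which stops both loops when they share the terminal's foreground
process group.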


>Fix:
	
Sorry, don't know.

>Audit-Trail:
>Unformatted: