Subject: kernel panic while running top
To: None <port-alpha@NetBSD.ORG>
From: Juergen Weiss <weiss@Uni-Mainz.DE>
List: port-alpha
Date: 01/05/1998 16:11:48
In NetBSD-1.3_BETA I observed the following panic while
running top:

#2  0xfffffc0000262b34 in panic () at ../../../../kern/subr_prf.c:150
#3  0xfffffc0000391fbc in trap () at ../../../../arch/alpha/alpha/trap.c:527
#4  0xfffffc0000230428 in XentMM ()
    at ../../../../arch/alpha/alpha/locore.s:377
#5  0xfffffc000025e680 in sysctl_doproc ()
    at ../../../../kern/kern_sysctl.c:649
#6  0xfffffc000025db64 in kern_sysctl () at ../../../../kern/kern_sysctl.c:270
#7  0xfffffc000025d744 in sys___sysctl () at ../../../../kern/kern_sysctl.c:159
#8  0xfffffc0000392200 in syscall () at ../../../../arch/alpha/alpha/trap.c:641
#9  0xfffffc0000230488 in XentSys ()

The chain through allproc.lh_first and p_list.le_next is intact,
but the variable p contains

p p
$6 = (struct proc *) 0xdeadbeefdeadbeef


My reasoning is as follows:

There seems to be a race condition in the sysctl_doproc
function in kern_sysctl.c.

The function loops over the process table and copies out some
information for each process. I think the copyouts may lead to
page faults (on the user pages) and to process switches. Thus
changes to the process table may happen asynchronously during
the loop. This may leave the loop holding p_list.le_next
pointers that point nowhere.
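
To illustrate the suspected hazard outside the kernel, here is a
minimal userland sketch in C. It is not the NetBSD code: struct node,
push(), reap(), fake_copyout() and walk_unsafe() are hypothetical
names, the "sleep" is simulated by a direct function call, and the
poison value is written by the sketch itself only to mirror the
pointer seen in the dump.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed poison pattern, chosen to mirror the value seen in the dump. */
#define POISON ((struct node *)(uintptr_t)0xdeadbeefdeadbeefULL)

struct node {
    struct node *next;  /* plays the role of p_list.le_next */
    int pid;
};

struct node *head;      /* plays the role of allproc.lh_first */
int victim = -1;        /* pid that "exits" while the walker is blocked */

/* Insert a new entry at the head of the list. */
void push(int pid)
{
    struct node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->pid = pid;
    n->next = head;
    head = n;
}

/*
 * Unlink the entry with the given pid and poison its link.
 * The entry is deliberately not freed, so the sketch can still
 * inspect it without undefined behaviour.
 */
void reap(int pid)
{
    struct node **pp;

    for (pp = &head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->pid == pid) {
            struct node *dead = *pp;
            *pp = dead->next;
            dead->next = POISON;
            return;
        }
    }
}

/*
 * Hypothetical stand-in for copyout(): while the walker is "blocked"
 * on the entry for the victim pid, that entry leaves the list.
 */
void fake_copyout(const struct node *n)
{
    if (n->pid == victim) {
        reap(victim);
        victim = -1;
    }
}

/*
 * Walk the list the way the loop is suspected to: the next link is
 * read only after the call that may block. Returns 1 if the walk
 * would follow a poisoned link, 0 if it reaches the end normally.
 */
int walk_unsafe(void)
{
    struct node *n;

    for (n = head; n != NULL; n = n->next) {
        fake_copyout(n);
        if (n->next == POISON)
            return 1;
    }
    return 0;
}
```

With three entries pushed and victim left at -1, walk_unsafe()
reaches the end of the list. Setting victim to the pid of any entry
makes the walk stop on a poisoned link, while the list reachable
from head remains consistent, which matches what I see in the dump.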

Is this correct?

Jürgen Weiß

-- 
Juergen Weiss		| Universitaet Mainz, Zentrum für Datenverarbeitung,
weiss@uni-mainz.de      | 55099 Mainz, Tel: 06131/39-6361, FAX: 06131/39-6407