Subject: Re: CVS commit: src/sys
To: David Laight <david@l8s.co.uk>
From: Simon Burge <simonb@wasabisystems.com>
List: source-changes
Date: 12/14/2003 12:59:03

David Laight wrote:

> > this change seems to break ktrace.
> > now ktrace_common() calls ktrops(), which might sleep,
> > with proclist lock held.
> 
> Exactly where does ktrops() sleep?
> 
> If it can sleep while traversing the process list, then the process
> it is referencing can exit and the pointer becomes invalid.
> Which would mean that it can't use allprocs to act on all processes.

Here are some traces from a panic:

panic: spinlock_switchcheck: CPU 0 has 1 spin locks
Stopped in pid 11368.1 (ktrace) at      netbsd:cpu_Debugger+0x4:        leave

db{0}> tr
cpu_Debugger(c07dfa20,c06dd8f5,36d,1,0) at netbsd:cpu_Debugger+0x4
panic(c072c200,0,1,c0784920,e8763800) at netbsd:panic+0x121
spinlock_switchcheck(c,1,f0d4b8dc,c036097d,c07798d4) at netbsd:spinlock_switchcheck+0x7b
mi_switch(e8763800,0,1ca,c45fcecc,6) at netbsd:mi_switch+0x6c
ltsleep(c45fcea4,11,c06df5cc,0,c45fcecc) at netbsd:ltsleep+0x4b3
biowait(c45fcea4,f0d4b9d0,595,cb535000,eafa612c) at netbsd:biowait+0xeb
genfs_gop_write(eafa612c,f0d4ba14,1,0,eafa612c) at netbsd:genfs_gop_write+0x2cc
genfs_putpages(f0d4bba4,989680,0,eafa612c,c05cdba0) at netbsd:genfs_putpages+0x999
VOP_PUTPAGES(eafa612c,0,0,2000,0) at netbsd:VOP_PUTPAGES+0x40
ffs_write(f0d4bcd4,3fd7c853,f0d4bd0c,c03aa710,c05cd260) at netbsd:ffs_write+0x666
VOP_WRITE(eafa612c,f0d4bd84,37,c4474380,eafa612c) at netbsd:VOP_WRITE+0x34
vn_write(e82df964,e82df98c,f0d4bd84,c4474380,1) at netbsd:vn_write+0xc0
ktrwrite(e874e3b4,f0d4bdd4,7,c035ec97,6) at netbsd:ktrwrite+0xd4
ktremul(e874e3b4,e874e3b4,0,c0779080,0) at netbsd:ktremul+0x4b
ktrops(e874e3b4,e874e3b4,0,3be,e82df964) at netbsd:ktrops+0xad
ktrace_common(e874e3b4,0,3be,2c68,e82df964) at netbsd:ktrace_common+0xed
sys_ktrace(e8763800,f0d4bf64,f0d4bf5c,2d,c06dd8f5) at netbsd:sys_ktrace+0xf1
syscall_plain(f0d4bfa8,1f,1f,1f,1f) at netbsd:syscall_plain+0x18a

db{0}> mac cpu 6
using cpu 6
db{0}> tr
__cpu_simple_lock(c07798d4,42c1d80,0,c384f8c4,0) at netbsd:__cpu_simple_lock+0x6f
_simple_lock(c07798d4,c06de25f,2a9,c3780800,400) at netbsd:_simple_lock+0x75
wakeup(c384f8c4,c06dec2d,428,c384f8e0,c384f8e0) at netbsd:wakeup+0x55
pipe_write(e82df504,e82df52c,f4174ec4,c366b000,1) at netbsd:pipe_write+0x31f
dofilewrite(e8adccb4,1,e82df504,804d400,400) at netbsd:dofilewrite+0x85
sys_write(e6c77d48,f4174f64,f4174f5c,4,c07edbd8) at netbsd:sys_write+0x6f
syscall_plain(f4174fa8,804001f,804001f,481a001f,bfbf001f) at netbsd:syscall_plain+0x18a
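
To make the failure mode concrete: the first trace shows a path that
ends up in ltsleep() while a spin lock is still held.  One common way
to avoid that kind of problem is to copy identifiers out of the list
while holding the lock, drop the lock, and only then do the work that
may sleep.  The following is a userland sketch of that idea only, not
the NetBSD code and not a proposed fix; every name in it
(proclist_lock, snapshot_pids, blocking_op, apply_to_all) is
hypothetical, and a pthread mutex stands in for the kernel lock.

```c
/* Userland illustration only: hypothetical names, not NetBSD kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAXPROC 8

/* Stand-in for the process list and the lock that protects it. */
static pthread_mutex_t proclist_lock = PTHREAD_MUTEX_INITIALIZER;
static int proclist[MAXPROC] = { 11, 22, 33 };
static size_t nprocs = 3;

/*
 * Set while proclist_lock is held so the sketch can check itself.
 * (Plain int: good enough for this single-threaded illustration.)
 */
static int lock_held;

/* Stand-in for an operation that may sleep, e.g. writing a trace record. */
static int
blocking_op(int pid)
{
	/* Sleeping with the list lock held is the bug; refuse to do it. */
	if (lock_held)
		return -1;
	(void)pid;
	return 0;
}

/* Phase 1: copy identifiers (not pointers) while holding the lock. */
static size_t
snapshot_pids(int *out, size_t max)
{
	size_t i, n;

	pthread_mutex_lock(&proclist_lock);
	lock_held = 1;
	n = nprocs < max ? nprocs : max;
	for (i = 0; i < n; i++)
		out[i] = proclist[i];
	lock_held = 0;
	pthread_mutex_unlock(&proclist_lock);
	return n;
}

/* Phase 2: do the possibly-sleeping work with the lock released. */
static int
apply_to_all(void)
{
	int pids[MAXPROC];
	size_t i, n = snapshot_pids(pids, MAXPROC);
	int done = 0;

	for (i = 0; i < n; i++) {
		if (blocking_op(pids[i]) == 0)
			done++;
	}
	return done;
}
```

The sketch copies pids rather than pointers because, as noted in the
quoted text, a process can exit once the lock is dropped; real code
would have to look each pid up again and cope with it having gone away.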

Two almost unrelated comments:

 - Should the ddb prompt change to "db{6}> " after issuing the
   "machine cpu 6" command?

 - I have never been able to get a crash dump from an MP box.  Has
   anyone _ever_ had this work?

Simon.
--
Simon Burge                            <simonb@wasabisystems.com>
NetBSD Support and Service:         http://www.wasabisystems.com/