Subject: Re: kern/29652
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: David Laight <david@l8s.co.uk>
List: netbsd-bugs
Date: 08/20/2005 13:29:01
The following reply was made to PR kern/29652; it has been noted by GNATS.

From: David Laight <david@l8s.co.uk>
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/29652
Date: Sat, 20 Aug 2005 14:27:48 +0100

 On Thu, Aug 18, 2005 at 05:29:10PM +0900, YAMAMOTO Takashi wrote:
 > >  panic: kernel diagnostic assertion "p->p_nrlwps == 0" failed: file "/usr/src/sys/kern/kern_exit.c", line 781
 > 
 > as p_nrlwps currently has no locking afaik, the panic is not surprising. :)
 > 
 > proc.h claims it's protected by p_lock, but i think
 > sched_lock is more straightforward lock to use.
 
 Last time I looked (and I've not seen anything that might affect it) the
 locking of the parent/child hierarchy (and the lwp one) wasn't correct.
 
 I thought about correcting it - and may have made a few changes - but
 it is basically a 'lost cause'.  The killer activity is the 're-parenting'
 that goes on when a process is run under gdb.
 
 Even a process exiting when the parent has set SA_NOCLDWAIT causes grief.
 Possibly fixable with a carefully constructed lock hierarchy.
 (Linux leaves a zombie lurking until the parent could take the signal)
 
 Using a single lock (instead of the per-process p_lock) would make it
 possible to avoid lock hierarchy violations, at only a small cost on
 systems with many cpus (where contention for the lock might be a problem,
 or the lock itself become a 'memory hotspot') - neither of which is
 likely to be a problem until after the BIG LOCK is dead and buried.
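
A minimal userland sketch of the single-lock idea, using pthreads rather than kernel primitives. Everything here (struct toy_proc, toy_tree_lock, toy_attach, toy_reparent) is a hypothetical name made up for illustration and is not NetBSD code. With per-process locks, re-parenting would have to take the child's, the old parent's and the new parent's locks in some agreed order; with one lock covering the whole hierarchy there is no order to get wrong.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical toy model of a process; not NetBSD's struct proc. */
struct toy_proc {
	struct toy_proc *parent;	/* current parent, NULL if none */
	int nchildren;			/* number of children attached */
};

/* Assumed: one lock guards every parent/child link in the system. */
static pthread_mutex_t toy_tree_lock = PTHREAD_MUTEX_INITIALIZER;

/* Attach a parentless child to its first parent. */
void
toy_attach(struct toy_proc *child, struct toy_proc *parent)
{
	pthread_mutex_lock(&toy_tree_lock);
	assert(child->parent == NULL);
	child->parent = parent;
	parent->nchildren++;
	pthread_mutex_unlock(&toy_tree_lock);
}

/*
 * Move a child to a new parent, as happens when a debugger attaches.
 * Three objects change, but only one lock is taken, so no lock
 * ordering between the processes involved is needed.
 */
void
toy_reparent(struct toy_proc *child, struct toy_proc *new_parent)
{
	pthread_mutex_lock(&toy_tree_lock);
	if (child->parent != NULL)
		child->parent->nchildren--;
	child->parent = new_parent;
	new_parent->nchildren++;
	pthread_mutex_unlock(&toy_tree_lock);
}
```

The cost is the one described above: every attach, exit and re-parent on every cpu serialises on toy_tree_lock.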
 
 	David
 
 -- 
 David Laight: david@l8s.co.uk