Subject: Re: Strange hang on 4.99.49
To: Paul Goyette <paul@whooppee.com>
From: Andrew Doran <ad@netbsd.org>
List: current-users
Date: 01/24/2008 14:33:49

On Wed, Jan 23, 2008 at 07:28:02PM +0000, Andrew Doran wrote:
> On Tue, Jan 22, 2008 at 12:48:38PM -0800, Paul Goyette wrote:
> > I'm running a kernel + userland from sources as of just a few hours ago.
> > 
> > I started up a build.sh with -j 4 and all of a sudden, after an hour or 
> > more of running, the system just went completely idle even though the 
> > build is nowhere near complete.
> > 
> > Looking around, I found three cc1 processes (only 3, not 4).  One of
> > them was in a 'vnode' state while the other two were in 'vnlock'.  Both
> > my /usr/src and /usr/obj are null-mounts:
> > 
> > 	/dev/wd0a on / type ffs (NFS exported, local)
> > 	/dev/wd0e on /var type ffs (local)
> > 	/dev/wd0g on /home type ffs (NFS exported, local)
> > >>>	/dev/wd0h on /build type ffs (soft dependencies, NFS exported, local)
> > 	/dev/wd1a on /amanda type ffs (local)
> > >>>	/build/src on /usr/src type null (local)
> > >>>	/build/obj on /usr/obj type null (local)
> > 	/build/xsrc on /usr/xsrc type null (local)
> > 	/build/pkgsrc on /usr/pkgsrc type null (local)
> > 	kernfs on /kern type kernfs (local)
> > 	ptyfs on /dev/pts type ptyfs (local)
> > 	tmpfs on /tmp type tmpfs (local)
> > 
> > Is there some sort of locking race condition that I've managed to hit? 
> > If so, is there a way to avoid it?
> 
> I think it's a lock order reversal, and it occurs with nullfs. I can
> reproduce it on a test machine here and am looking into it.
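
For anyone following along who hasn't met the term: a lock order reversal
is when two threads take the same pair of locks in opposite orders, so each
can end up holding one lock while waiting forever for the other. Below is a
minimal user-space sketch in Python, not the actual nullfs code path. The
names `upper` and `lower` are hypothetical stand-ins for two vnode locks,
and the timeout is there only so the demo returns instead of hanging.

```python
import threading


def reversed_order_attempt(first, second, barrier, results, slot):
    """Hold `first`, then try to take `second` while the peer holds it."""
    with first:
        barrier.wait()  # both threads now hold their first lock
        # A real blocking acquire would hang here; the timeout is demo-only.
        got = second.acquire(timeout=0.2)
        if got:
            second.release()
        results[slot] = got
        barrier.wait()  # keep `first` held until both threads have tried


def demo_reversal():
    # Hypothetical stand-ins for an upper-layer and a lower-layer lock.
    upper, lower = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = [None, None]
    threads = [
        # Thread 0 locks upper then lower; thread 1 locks lower then upper.
        threading.Thread(target=reversed_order_attempt,
                         args=(upper, lower, barrier, results, 0)),
        threading.Thread(target=reversed_order_attempt,
                         args=(lower, upper, barrier, results, 1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo_consistent_order():
    upper, lower = threading.Lock(), threading.Lock()
    done = []

    def worker(name):
        # Every thread takes the locks in the same order: upper, then lower.
        with upper:
            with lower:
                done.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(done)


if __name__ == "__main__":
    print(demo_reversal())          # neither thread gets its second lock
    print(demo_consistent_order())  # both threads finish
```

In the reversed case neither thread ever gets its second lock, which
without the timeout is a hang with every participant asleep on a lock.
Taking both locks in one agreed order avoids the problem.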

I think this one should be fixed now. Can you give it a go?

Andrew