Re: 5.99.42/i386 crash (backtrace + core available)

To: current-users%netbsd.org@localhost
Subject: Re: 5.99.42/i386 crash (backtrace + core available)
From: Juergen Hannken-Illjes <hannken%eis.cs.tu-bs.de@localhost>
Date: Sun, 9 Jan 2011 09:06:16 +0100

On Sat, Jan 08, 2011 at 11:16:19PM +0000, David Holland wrote:
> On Tue, Dec 28, 2010 at 05:37:43PM +0100, Dennis den Brok wrote:
>  > rw_abort()
>  > rw_vector_enter(df829668, ...)
>                    ^^^^^^^^
>  > genfs_lock()
>  > layer_bypass()
>  > VOP_LOCK(e15c2170,2)
>             ^^^^^^^^
>  > vclean()
>  > getcleanvnode()
>  > getnewvnode()
>  > ffs_vget()
>  > ufs_lookup()
>  > VOP_LOOKUP(df8295c8,...)
>               ^^^^^^^^
> 
> Unfortunately most of the things visible in the stack trace are vnode
> op argument structures and not pointers to anything interesting.
> However, since rw_vector_enter is passed &vp->v_lock, I think we can
> tentatively conclude that it's trying to lock the same vnode that was
> passed to VOP_LOOKUP, and it's failing because that's quite properly
> already locked.
> 
> It looks like what happened is that ffs went to get a fresh vnode and
> got a not-recently-used nullfs vnode. However, the nullfs vnode turned
> out to be the nullfs vnode sitting on top of the ffs vnode it was
> already working with. Since these share locks now, the vnode was
> locked even though not recently used (and on the list to be cleaned
> and all that), and in fact it turned out to be the same ffs vnode this
> process was already working on, so trying to lock it for cleaning blew
> up.
> 
> So this seems like fallout from Juergen's layer locking cleanup from a
> few months ago. Not sure what the proper solution is, though.

While the analysis looks OK, I don't think the layer locking cleanup is the
reason.  Before the cleanup the locks were already shared, so getcleanvnode()
would have taken the same lock, just without going through layer_bypass().

Dennis, to be sure, you could build a kernel and userland from sources dated
sometime in September 2010; that is after all my changes to vnode locking.
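
To make the scenario from the quoted analysis concrete, here is a minimal
user-space sketch of it in C. It is not NetBSD kernel code: toy_lock,
toy_vnode, toy_lock_enter() and toy_replay_crash() are all hypothetical
names invented for illustration, and the lock is a plain owner field rather
than a real rwlock. It only models the assumption that the upper (nullfs)
vnode and the lower (ffs) vnode end up taking the same non-recursive lock.

```c
#include <stdbool.h>

/* Toy stand-in for a non-recursive kernel lock: owner 0 means unlocked. */
struct toy_lock {
	int owner;
};

/* Toy vnode: lockp names the lock that locking this vnode really takes. */
struct toy_vnode {
	struct toy_lock own_lock;
	struct toy_lock *lockp;
};

enum toy_result { TOY_LOCKED, TOY_BUSY, TOY_SELF_DEADLOCK };

/*
 * Take the lock for thread `self`.  Re-entry by the current owner is
 * reported as an error here; the assumption is that a real kernel would
 * abort at this point instead of returning.
 */
static enum toy_result
toy_lock_enter(struct toy_lock *l, int self)
{
	if (l->owner == self)
		return TOY_SELF_DEADLOCK;
	if (l->owner != 0)
		return TOY_BUSY;
	l->owner = self;
	return TOY_LOCKED;
}

/*
 * Replay the sequence suggested by the backtrace: a lookup holds the
 * lower vnode's lock, then vnode recycling picks the idle upper vnode
 * and tries to lock it for cleaning.
 */
static enum toy_result
toy_replay_crash(bool share_lock)
{
	struct toy_vnode ffs_vp = { .own_lock = { 0 } };
	struct toy_vnode null_vp = { .own_lock = { 0 } };
	const int self = 1;	/* hypothetical thread id */

	ffs_vp.lockp = &ffs_vp.own_lock;
	/* With sharing, the upper vnode borrows the lower vnode's lock. */
	null_vp.lockp = share_lock ? ffs_vp.lockp : &null_vp.own_lock;

	/* The lookup runs with the ffs vnode locked. */
	if (toy_lock_enter(ffs_vp.lockp, self) != TOY_LOCKED)
		return TOY_BUSY;

	/* Recycling locks the nullfs vnode in order to clean it. */
	return toy_lock_enter(null_vp.lockp, self);
}
```

In this model toy_replay_crash(true) reports TOY_SELF_DEADLOCK, because the
second lock attempt hits the lock the same thread already holds, while
toy_replay_crash(false) succeeds, because the upper vnode has a private lock.
Whether the locks are reached via layer_bypass() or directly makes no
difference to the outcome in this sketch.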

-- 
Juergen Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)

