Current-Users archive


Re: 5.99.42/i386 crash (backtrace + core available)



On Tue, Dec 28, 2010 at 05:37:43PM +0100, Dennis den Brok wrote:
 > rw_abort()
 > rw_vector_enter(df829668, ...)
                   ^^^^^^^^
 > genfs_lock()
 > layer_bypass()
 > VOP_LOCK(e15c2170,2)
            ^^^^^^^^
 > vclean()
 > getcleanvnode()
 > getnewvnode()
 > ffs_vget()
 > ufs_lookup()
 > VOP_LOOKUP(df8295c8,...)
              ^^^^^^^^

Unfortunately most of the things visible in the stack trace are vnode
op argument structures and not pointers to anything interesting.
However, rw_vector_enter is passed &vp->v_lock, and the address it got
(df829668) is only 0xa0 bytes past the one passed to VOP_LOOKUP
(df8295c8), which is consistent with a lock embedded in that vnode. So
I think we can tentatively conclude that it's trying to lock the same
vnode that was passed to VOP_LOOKUP, and it's failing because that
vnode is, quite properly, already locked.

It looks like what happened is that ffs went to get a fresh vnode and
got a not-recently-used nullfs vnode. However, that nullfs vnode turned
out to be the one sitting on top of the ffs vnode the process was
already working with. Since layer vnodes now share their lock with the
vnode below, the nullfs vnode was locked even though it was not
recently used (and on the list to be cleaned and all that). The lock in
question belonged to the ffs vnode this process already held, so trying
to lock it again for cleaning blew up.
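
As a toy illustration of the scenario described above, here is a
user-space C sketch. It is not kernel code: every toy_* name is made up
for the example, and the lock is a bare owner field rather than a real
rwlock. It only shows why cleaning a layer vnode fails once that vnode
shares its lock with a lower vnode the same thread already holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration; these are not the NetBSD structures. */
enum toy_result { TOY_OK, TOY_BUSY, TOY_RECURSIVE };

struct toy_lock {
	const void *owner;		/* NULL while unlocked */
};

struct toy_vnode {
	struct toy_lock  v_lock;	/* lock storage embedded in the vnode */
	struct toy_lock *v_lockp;	/* lock actually taken for this vnode */
};

/* A bottom-level (ffs-like) vnode uses its own embedded lock. */
static void
toy_init_lower(struct toy_vnode *vp)
{
	vp->v_lock.owner = NULL;
	vp->v_lockp = &vp->v_lock;
}

/* A layer (nullfs-like) vnode either shares the lower lock or keeps its own. */
static void
toy_init_layer(struct toy_vnode *vp, struct toy_vnode *lower, bool share)
{
	vp->v_lock.owner = NULL;
	vp->v_lockp = share ? lower->v_lockp : &vp->v_lock;
}

/*
 * Take the vnode's lock on behalf of "self". TOY_RECURSIVE marks the
 * case where the caller already owns the lock, which is where the real
 * trace ended up in rw_abort().
 */
static enum toy_result
toy_vop_lock(struct toy_vnode *vp, const void *self)
{
	struct toy_lock *l = vp->v_lockp;

	if (l->owner == self)
		return TOY_RECURSIVE;
	if (l->owner != NULL)
		return TOY_BUSY;
	l->owner = self;
	return TOY_OK;
}

/*
 * Lock the lower vnode (as a lookup would), then try to lock the layer
 * vnode on top of it (as cleaning a recycled vnode would).
 */
static enum toy_result
toy_reproduce(bool share)
{
	struct toy_vnode ffs_vp, null_vp;
	int self = 0;		/* its address serves as the thread identity */
	enum toy_result r;

	toy_init_lower(&ffs_vp);
	toy_init_layer(&null_vp, &ffs_vp, share);

	r = toy_vop_lock(&ffs_vp, &self);
	assert(r == TOY_OK);
	(void)r;
	return toy_vop_lock(&null_vp, &self);
}
```

In this model toy_reproduce(true) returns TOY_RECURSIVE, while
toy_reproduce(false), where the layer vnode keeps a lock of its own,
returns TOY_OK.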

So this seems like fallout from Juergen's layer locking cleanup from a
few months ago. Not sure what the proper solution is, though.

-- 
David A. Holland
dholland%netbsd.org@localhost

