Current-Users archive


Re: 5.99.42/i386 crash (backtrace + core available)



On Sat, Jan 08, 2011 at 11:16:19PM +0000, David Holland wrote:
> On Tue, Dec 28, 2010 at 05:37:43PM +0100, Dennis den Brok wrote:
>  > rw_abort()
>  > rw_vector_enter(df829668, ...)
>                    ^^^^^^^^
>  > genfs_lock()
>  > layer_bypass()
>  > VOP_LOCK(e15c2170,2)
>             ^^^^^^^^
>  > vclean()
>  > getcleanvnode()
>  > getnewvnode()
>  > ffs_vget()
>  > ufs_lookup()
>  > VOP_LOOKUP(df8295c8,...)
>               ^^^^^^^^
> 
> Unfortunately most of the things visible in the stack trace are vnode
> op argument structures and not pointers to anything interesting.
> However, since rw_vector_enter is passed &vp->v_lock, I think we can
> tentatively conclude that it's trying to lock the same vnode that was
> passed to VOP_LOOKUP, and it's failing because that's quite properly
> already locked.
> 
> It looks like what happened is that ffs went to get a fresh vnode and
> got a not-recently-used nullfs vnode. However, the nullfs vnode turned
> out to be the nullfs vnode sitting on top of the ffs vnode it was
> already working with. Since these share locks now, the vnode was
> locked even though not recently used (and on the list to be cleaned
> and all that), and in fact it turned out to be the same ffs vnode this
> process was already working on, so trying to lock it for cleaning blew
> up.
> 
> So this seems like fallout from Juergen's layer locking cleanup from a
> few months ago. Not sure what the proper solution is, though.

Looks like we have

        vnode e15c2170, type nullfs with lower vnode df8295c8, type ffs

so here we look up the lower vnode, need a new vnode, and are handed the
upper (nullfs) vnode to recycle.  Locking the upper vnode always means locking
the lower vnode, which we already hold.  This is the background of ad's
comment in vclean:

        /* XXXAD should not lock vnode under layer */
        mutex_exit(&vp->v_interlock);
        VOP_LOCK(vp, LK_EXCLUSIVE);
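
To make the sequence concrete, here is a rough userland sketch of it.  This
is not kernel code: every toy_* name is made up for illustration, the thread
id is just an int, and returning EDEADLK stands in for the point where the
trace above ends up in rw_abort().  The only thing it models is that locking
the upper vnode is forwarded to the lower one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive exclusive lock that remembers its owner (hypothetical). */
struct toy_lock {
        bool held;
        int owner;              /* made-up thread id, -1 when free */
};

/* Toy vnode: a layered (nullfs-like) vnode has a non-NULL lower vnode. */
struct toy_vnode {
        struct toy_lock lock;
        struct toy_vnode *lower;
};

static void
toy_vnode_init(struct toy_vnode *vp, struct toy_vnode *lower)
{
        vp->lock.held = false;
        vp->lock.owner = -1;
        vp->lower = lower;
}

static int
toy_lock_enter(struct toy_lock *l, int self)
{
        if (l->held && l->owner == self)
                return EDEADLK; /* locking against ourselves */
        if (l->held)
                return EBUSY;   /* a real lock would sleep here */
        l->held = true;
        l->owner = self;
        return 0;
}

static void
toy_lock_exit(struct toy_lock *l)
{
        l->held = false;
        l->owner = -1;
}

/* Locking a layered vnode is passed down to the lower vnode. */
static int
toy_vop_lock(struct toy_vnode *vp, int self)
{
        if (vp->lower != NULL)
                return toy_vop_lock(vp->lower, self);
        return toy_lock_enter(&vp->lock, self);
}

/* Replay: lock the lower vnode for lookup, then try to clean the upper. */
static int
toy_replay(void)
{
        struct toy_vnode lower, upper;
        const int self = 1;
        int error;

        toy_vnode_init(&lower, NULL);
        toy_vnode_init(&upper, &lower);

        error = toy_vop_lock(&lower, self);     /* the lookup holds this */
        assert(error == 0);

        error = toy_vop_lock(&upper, self);     /* cleaning the recycled vnode */

        toy_lock_exit(&lower.lock);
        return error;
}
```

In this model toy_replay() returns EDEADLK, while locking the upper vnode
with nothing else held succeeds and leaves the lower lock owned by the
caller.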

I have no idea how to solve this mess...
-- 
Juergen Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)

