tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: ufs_rename() deadlock

On Fri, Sep 25, 2009 at 04:24:33PM +0200, Manuel Bouyer wrote:
> Hello,
> I think I've found the reason of the deadlock between ufs_rename() and
> cache_lookup() (see "tstile syndrome" thread on this list). The deadlock
> would exist in both ufs_rename() and wapbl_ufs_rename() which would explain
> why some people see it with or without -o log mounts.
> cache_lookup() does (kern/vfs_cache.c around line 400):
>       error = vn_lock(vp, LK_EXCLUSIVE);
>       vn_lock(dvp, LK_EXCLUSIVE | LK_RETRY);
> wapbl_ufs_rename() and ufs_rename() both do:
>     vn_lock(fvp) just after basic checks for permission, cross-device, etc
>     vn_lock(fdvp) in the "unlink the source" case, or if (newparent).
> Which one is right ? Should the parent be locked before the entry, or
> the opposite ?
> Assuming the former, would unlocking fvp before locking fdvp work ?
> We're about to do a relookup() so it looks like it could be safely done,
> provided we check nothing bad happened while fvp was unlocked.
> And now this is the question: how do we check that nothing bad happened ?
> Note that as we hold a reference to the vnode, maybe there's no more
> to check than the return value from vn_lock() ?

Sorry, this is wrong, I didn't see a VOP_UNLOCK(fvp).

Another case looks suspicious, is when fdvp is the parent of tdvp.
In this case, we lock in the reverse order from cache_lookup(), and
in the non-wapbl case this doesn't happen because  the call to vn_lock(fdvp)
happends after unlocking tdvp. A comment says this was moved here to avoid
a deadlock with WAPBL,  does anyone knows the details of this deadlock ?
A XXX comment says "handle case where fdvp is parent of tdvp by unlocking
tdvp and regrabbing it with vget after", what would be the implications of
this ? Would it be better with VOP_UNLOCK()/vn_lock() instead of
vput()/vrele() ?

Another way to handle this would be to split the WAPBL transaction in 2
pieces: first the link to the target; then the unlink from the source.
A rename is not atomic anymore in this case, as in the non-WAPBL case.
Is it a problem as long as the on-disk state is consistent ?

Manuel Bouyer <>
     NetBSD: 26 ans d'experience feront toujours la difference

Home | Main Index | Thread Index | Old Index