Subject: Re: locking bug in coda_lookup?
To: Greg Troxel <>
From: Jaromir Dolecek <>
List: tech-kern
Date: 09/05/2004 12:00:51
Greg Troxel wrote:
> I got a panic in coda_lookup:
>   unlocked parent but couldn't lock child
> while doing
>   find . -type f -print0 | xargs -0 cat > /dev/null
> in coda.  I'm not certain which of two very similarfragments the code
> was in (don't have netbsd.gdb any more), but it was very much like
> this:
> 	    if (*ap->a_vpp) {
> 		if ((error = vn_lock(*ap->a_vpp, LK_EXCLUSIVE))) {
> 		    printf("coda_lookup: ");
> 		    panic("unlocked parent but couldn't lock child");
> 		}
> 	    }
> It seems that this will fail if someone else has the vnode locked, and
> there will be no retry.  From reading vn_lock, it seems that if one
> passes LK_NOWAIT, that on encountering a locked vnode, vn_lock returns
> immediately.  If LK_NOWAIT is not set, and LK_RETRY is also not set,
> it seems that vn_lock will tsleep on the vnode's v_interlock, and then
> return ENOENT instead of retrying.  The ufs code uses LK_RETRY in what
> I think is the analogous case.
> I don't understand why it makes sense to sleep and not retry the lock
> - if one isn't going to retry, what's the point of sleeping?
> So, I think that coda_lookup should pass LK_RETRY.  But I either don't
> quite or just barely understand vnode locking, so I'd appreciate
> advice here.

Your analysis might be right. Does it work fine if you try
changing the source to use LK_EXCLUSIVE|LK_RETRY in
the vn_lock() call?

> Also, it seems that the coda lookup operation doesn't properly handle
> IS_DOTDOT where the locking rules are different.  That could be
> related to my crash instead.

Right. If that is the case, the find will deadlock instead :)

Jaromir Dolecek <>  
-=- We should be mindful of the potential goal, but as the Buddhist -=-
-=- masters say, ``You may notice during meditation that you        -=-
-=- sometimes levitate or glow.   Do not let this distract you.''   -=-