Subject: Re: vnode refcount panic, perhaps due to kern/vfs_lookup.c:lookup()
To: Jaromir Dolecek <firstname.lastname@example.org>
From: Jaromir Dolecek <email@example.com>
Date: 03/16/2003 21:52:45
Jaromir Dolecek wrote:
> Locking rules for the symlink vnode op changed some time ago (rev. 1.26
> of coda/coda_vnops.c), perhaps the change triggered some
> problem in coda?
> I'd probably also check that the lookup() call in coda_symlink()
> succeeds, and that nd.ni_vp is indeed NULL in that case, since
Err, 'is NULL if lookup() fails' was what I meant.
> that appears to be what the code assumes.
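To make that check concrete, here is roughly what I have in mind. This is a userland sketch only, not code from coda_vnops.c: struct vnode and struct nameidata are cut-down stand-ins, and mock_lookup() and failed_lookup_left_vnode() are made-up names used for illustration. It only shows the shape of the test, namely that a failed lookup must not leave a vnode pointer behind in ni_vp.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userland stand-ins, NOT the kernel's definitions. */
struct vnode { int v_usecount; };
struct nameidata { struct vnode *ni_vp; };

/*
 * Mock of lookup(): returns `error` unchanged.  On failure it clears
 * ni_vp only when clear_on_error is set, so the "stale pointer" case
 * can be simulated.  On success it stores `found` in ni_vp.
 */
static int mock_lookup(struct nameidata *ndp, struct vnode *found,
                       int error, int clear_on_error)
{
    if (error != 0) {
        if (clear_on_error)
            ndp->ni_vp = NULL;
        return error;
    }
    ndp->ni_vp = found;
    return 0;
}

/* Returns 1 if a failed lookup left a non-NULL ni_vp behind, else 0. */
static int failed_lookup_left_vnode(int error, const struct nameidata *ndp)
{
    return error != 0 && ndp->ni_vp != NULL;
}
```

In the real code the equivalent would be a diagnostic right after the lookup() call; if it ever fires, the caller's assumption about ni_vp is wrong.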
> Greg Troxel wrote:
> > I found that the double-vput problem in vfs_lookup was due to a vnode
> > with type VBAD. This is passed to vfs_lookup from coda_symlink.
> > Most of the time, the coda_call to symlink in coda_symlink works, and
> > occasionally the call returns without error but the vnode is marked
> > VBAD.
> > I checked for VBAD, and returned -1, but promptly got a panic in
> > nfs_symlink, I think because an mbuf that was free()'d was trashed or
> > just a bad pointer.
> > So, I'm guessing that the coda kernel code occasionally messes up, or
> > there is some locking problem where the vnode gets modified/marked bad
> > by something else. This is all on a 192 MB i386 running
> > cfsd/rpcbind/mountd, venus, bash, emacs, sshd/ntpd/etc. and 3 more
> > gettys. There is basically nothing else going on, and the machine was
> > freshly booted.
> > I am just beginning to grasp the locking rules, and I'd appreciate
> > being set straight if I am confused (and thanks to those who already
> > responded):
> > the interlock in the vnode protects the vnode ref counts and a few
> > other fields in the struct vnode. It is held for short periods only
> > and is not about locking the vnode itself.
> > Having a reference, expressed via the ref count field, protects you
> > against the vnode going away or turning into something completely
> > different. But it does not guarantee anything about operations on
> > the vnode; to serialize those, the vn_lock is used.
> > struct lock v_lock in the vnode protects the vnode in the larger
> > context in terms of fs operations.
> > When the comments say 'the locked vnode', they always mean the
> > struct lock in the vnode (or rather v->v_vnlock, which in the coda
> > case always points to v->v_lock since there is no stackable fs stuff
> > going on).
> > Little mention is made of the interlock in terms of locking
> > discussions, other than in vnode(9), because that's too obvious.
> > vput, for example, expects that the interlock is not held. It
> > unlocks *v->v_vnlock, and then decrements usecount. To do the
> > latter, it has to acquire the interlock, but that's not mentioned.
> > One should in general not hold the interlock when calling VOP_LOCK
> > and VOP_UNLOCK or other vnops. But some operations take the
> > LK_INTERLOCK flag to indicate that the interlock is already held.
> > So, is it reasonable for an unlocked vnode to change to VBAD?
> > Does holding the vn_lock mean that vgone should not be called?
> > Is there any place else I should suspect that is changing the type to
> > VBAD?
> > Greg Troxel <firstname.lastname@example.org>
> Jaromir Dolecek <jdolecek@NetBSD.org> http://www.NetBSD.org/
> -=- We should be mindful of the potential goal, but as the tantric -=-
> -=- Buddhist masters say, ``You may notice during meditation that you -=-
> -=- sometimes levitate or glow. Do not let this distract you.'' -=-
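The two-level scheme Greg describes above can be modelled in userland, which may help when reasoning about who holds what. This is only a sketch under stated assumptions: struct model_vnode and the model_*() functions are invented for illustration, pthread mutexes stand in for both the interlock and the vnode lock, and model_vput() follows the order given in Greg's description (unlock the vnode lock, then take the interlock to drop the use count). It is not the kernel's vput().

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userland model; names mimic the kernel's, nothing more. */
struct model_vnode {
    pthread_mutex_t v_interlock; /* short-term: guards v_usecount */
    pthread_mutex_t v_lock;      /* long-term: serializes fs operations */
    int v_usecount;
};

static void model_init(struct model_vnode *vp)
{
    pthread_mutex_init(&vp->v_interlock, NULL);
    pthread_mutex_init(&vp->v_lock, NULL);
    vp->v_usecount = 0;
}

/* Take a reference; the interlock is held only around the count update. */
static void model_vref(struct model_vnode *vp)
{
    pthread_mutex_lock(&vp->v_interlock);
    vp->v_usecount++;
    pthread_mutex_unlock(&vp->v_interlock);
}

/* Lock the vnode for an operation; the caller already holds a reference. */
static void model_vn_lock(struct model_vnode *vp)
{
    pthread_mutex_lock(&vp->v_lock);
}

/*
 * Model of vput as described above: the caller holds v_lock and a
 * reference, but NOT the interlock.  Returns the remaining use count.
 */
static int model_vput(struct model_vnode *vp)
{
    int remaining;

    pthread_mutex_unlock(&vp->v_lock);     /* release the vnode lock */
    pthread_mutex_lock(&vp->v_interlock);  /* briefly take the interlock */
    remaining = --vp->v_usecount;
    pthread_mutex_unlock(&vp->v_interlock);
    return remaining;
}
```

Note that calling model_vput() with the interlock already held would deadlock in this model, which matches the rule that vput expects the interlock not to be held.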
Jaromir Dolecek <jdolecek@NetBSD.org> http://www.NetBSD.org/