Subject: Re: panic: lockmgr: release of unlocked lock
To: Bill Studenmund <wrstuden@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 01/14/2005 16:55:39
On Mon, Jan 10, 2005 at 01:35:51PM -0800, Bill Studenmund wrote:
> On Sat, Jan 08, 2005 at 01:25:23PM +0100, Manuel Bouyer wrote:
> > Hi,
> > I just got this on a 2.0/alpha system doing a bulk build:
> > panic: lockmgr: release of unlocked lock!
> > db> tr
> > cpu_Debugger() at netbsd:cpu_Debugger+0x4
> > panic() at netbsd:panic+0x1f8
> > lockmgr() at netbsd:lockmgr+0x308
> > layer_unlock() at netbsd:layer_unlock+0x98
> > VOP_UNLOCK() at netbsd:VOP_UNLOCK+0x3c
> > vput() at netbsd:vput+0x5c
> > lookup() at netbsd:lookup+0x44c
>
> Can you get a line number for the above?
Ok, so here is what I can show for this:
(gdb) l *(lockmgr+0x308)
0xfffffc0000569f38 is in lockmgr (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/kern_lock.c:848).
843     }
844     if ((lkp->lk_flags & (LK_WAITDRAIN|LK_SPIN)) == LK_WAITDRAIN &&
845         ((lkp->lk_flags &
846           (LK_HAVE_EXCL | LK_WANT_EXCL | LK_WANT_UPGRADE |
847           LK_SHARE_NONZERO | LK_WAIT_NONZERO)) == 0)) {
848         lkp->lk_flags &= ~LK_WAITDRAIN;
849         wakeup((void *)&lkp->lk_flags);
850     }
851     /*
852      * Note that this panic will be a recursive panic, since
(gdb) l *(layer_unlock+0x98)
0xfffffc00005dd5c8 is in layer_unlock (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/miscfs/genfs/layer_vnops.c:676).
671             if (flags & LK_INTERLOCK) {
672                 simple_unlock(&vp->v_interlock);
673                 flags &= ~LK_INTERLOCK;
674             }
675             VOP_UNLOCK(LAYERVPTOLOWERVP(vp), flags);
676             return (lockmgr(&vp->v_lock, ap->a_flags | LK_RELEASE,
677                 &vp->v_interlock));
678         }
679     }
680
(gdb) l *(VOP_UNLOCK+0x3c)
0xfffffc00005d742c is in VOP_UNLOCK (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/vnode_if.c:1111).
1106    {
1107        struct vop_unlock_args a;
1108        a.a_desc = VDESC(vop_unlock);
1109        a.a_vp = vp;
1110        a.a_flags = flags;
1111        return (VCALL(vp, VOFFSET(vop_unlock), &a));
1112    }
1113    #endif
1114
1115    const int vop_bmap_vp_offsets[] = {
(gdb) l *(vput+0x5c)
0xfffffc00005c92ec is in vput (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/vfs_subr.c:1298).
1293    #endif
1294        simple_lock(&vp->v_interlock);
1295        vp->v_usecount--;
1296        if (vp->v_usecount > 0) {
1297            simple_unlock(&vp->v_interlock);
1298            VOP_UNLOCK(vp, 0);
1299            return;
1300        }
1301    #ifdef DIAGNOSTIC
1302        if (vp->v_usecount < 0 || vp->v_writecount != 0) {
(gdb) l *(lookup+0x44c)
0xfffffc00005c6a4c is in lookup (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/vfs_lookup.c:663).
658         vrele(ndp->ni_dvp);
659     bad:
660         if (dpunlocked)
661             vrele(dp);
662         else
663             vput(dp);
664         ndp->ni_vp = NULL;
665         return (error);
666     }
667
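
To make the failure mode in the trace concrete, here is a minimal user-space
sketch of a lock whose release path checks that something is actually held.
It is only a toy model: struct toy_lock, toy_acquire_excl() and toy_release()
are hypothetical names, not the kernel's struct lock or lockmgr(), and the
"second release from an error path" is an assumption about how such a panic
could arise, not a diagnosis of this one.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for a kernel lock; not the real layout. */
struct toy_lock {
	int excl_count;		/* exclusive holds (may recurse) */
	int share_count;	/* shared holds */
};

enum toy_result { TOY_OK = 0, TOY_RELEASE_UNLOCKED = -1 };

/* Take one exclusive hold. */
static void
toy_acquire_excl(struct toy_lock *lk)
{
	lk->excl_count++;
}

/*
 * Drop one hold.  Where the kernel would panic with
 * "release of unlocked lock", this toy version reports an error instead.
 */
static int
toy_release(struct toy_lock *lk)
{
	if (lk->excl_count > 0) {
		lk->excl_count--;
		return TOY_OK;
	}
	if (lk->share_count > 0) {
		lk->share_count--;
		return TOY_OK;
	}
	fprintf(stderr, "toy_release: release of unlocked lock\n");
	return TOY_RELEASE_UNLOCKED;
}
```

With one toy_acquire_excl() followed by two toy_release() calls, the first
release returns TOY_OK and the second returns TOY_RELEASE_UNLOCKED, which is
the situation a caller like vput() would hit if the vnode had already been
unlocked before it was called.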
>
> > Just before the panic there is:
> > vnode: table is full - increase kern.maxvnodes or NVNODE
> >
> > I don't know if it's related or not.
>
> It might be. Please also put a diagnostic printf() near line 473 of
> layerfs_lookup(), in the if (error) case just after layer_node_create().
> There may be an inconsistency in how we handle an error in case that
> create fails.
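
For reference, the kind of diagnostic suggested above could have roughly the
following shape. This is a user-space sketch only: stub_node_create() is a
hypothetical stand-in for layer_node_create(), the ENFILE return for a full
vnode table is an assumption, and the message text is made up. In the kernel
the message would go through the kernel printf() and would probably also want
to report the lock state of the vnodes involved.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stub for layer_node_create(): fails with ENFILE when the
 * caller says the vnode table is full (an assumption for this sketch).
 */
static int
stub_node_create(int table_full)
{
	return table_full ? ENFILE : 0;
}

/*
 * Same shape as the error case in a lookup routine: call the create
 * function and, if it fails, log the error and what we believe the lock
 * state to be.  The message is also left in buf so a caller can inspect it.
 */
static int
lookup_with_diag(int table_full, int lower_locked, char *buf, size_t len)
{
	int error;

	buf[0] = '\0';
	error = stub_node_create(table_full);
	if (error) {
		snprintf(buf, len,
		    "lookup: node create failed, error %d, lower locked %d",
		    error, lower_locked);
		fprintf(stderr, "%s\n", buf);
	}
	return error;
}
```

Logging the lock state next to the error is the useful part: if the create
fails while the lower vnode is already unlocked, a later vput() in the
caller's error path would try to release a lock that is not held.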
BTW, I also got 2 reboots on *.fr.netbsd.org. I don't know why;
unfortunately the swap partition is too small for a core dump (this is
what you get when you upgrade hardware :). However, in both cases
I had "vnode: table is full" in the logs a few minutes before the crash.
I bumped kern.maxvnodes and the box hasn't panicked since. But at the
same time, the announcement that anoncvs.netbsd.org is usable again has
decreased the load a little bit ... This box is a dual-CPU i386.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference