Subject: Re: panic: lockmgr: release of unlocked lock
To: Bill Studenmund <wrstuden@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 01/14/2005 16:55:39
On Mon, Jan 10, 2005 at 01:35:51PM -0800, Bill Studenmund wrote:
> On Sat, Jan 08, 2005 at 01:25:23PM +0100, Manuel Bouyer wrote:
> > Hi,
> > I just got this on a 2.0/alpha system doing a bulk build:
> > panic: lockmgr: release of unlocked lock!
> > db> tr
> > cpu_Debugger() at netbsd:cpu_Debugger+0x4
> > panic() at netbsd:panic+0x1f8
> > lockmgr() at netbsd:lockmgr+0x308
> > layer_unlock() at netbsd:layer_unlock+0x98
> > VOP_UNLOCK() at netbsd:VOP_UNLOCK+0x3c
> > vput() at netbsd:vput+0x5c
> > lookup() at netbsd:lookup+0x44c
> 
> Can you get a line number for the above?

Ok, so here is what I can show for this:
(gdb) l *(lockmgr+0x308)
0xfffffc0000569f38 is in lockmgr (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/kern_lock.c:848).
843             }
844             if ((lkp->lk_flags & (LK_WAITDRAIN|LK_SPIN)) == LK_WAITDRAIN &&
845                 ((lkp->lk_flags &
846                   (LK_HAVE_EXCL | LK_WANT_EXCL | LK_WANT_UPGRADE |
847                   LK_SHARE_NONZERO | LK_WAIT_NONZERO)) == 0)) {
848                     lkp->lk_flags &= ~LK_WAITDRAIN;
849                     wakeup((void *)&lkp->lk_flags);
850             }
851             /*
852              * Note that this panic will be a recursive panic, since
(gdb) l *(layer_unlock+0x98)
0xfffffc00005dd5c8 is in layer_unlock (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/miscfs/genfs/layer_vnops.c:676).
671                     if (flags & LK_INTERLOCK) {
672                             simple_unlock(&vp->v_interlock);
673                             flags &= ~LK_INTERLOCK;
674                     }
675                     VOP_UNLOCK(LAYERVPTOLOWERVP(vp), flags);
676                     return (lockmgr(&vp->v_lock, ap->a_flags | LK_RELEASE,
677                             &vp->v_interlock));
678             }
679     }
680     
(gdb) l *(VOP_UNLOCK+0x3c)
0xfffffc00005d742c is in VOP_UNLOCK (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/vnode_if.c:1111).
1106    {
1107            struct vop_unlock_args a;
1108            a.a_desc = VDESC(vop_unlock);
1109            a.a_vp = vp;
1110            a.a_flags = flags;
1111            return (VCALL(vp, VOFFSET(vop_unlock), &a));
1112    }
1113    #endif
1114    
1115    const int vop_bmap_vp_offsets[] = {
(gdb) l *(vput+0x5c)
0xfffffc00005c92ec is in vput (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/vfs_subr.c:1298).
1293    #endif
1294            simple_lock(&vp->v_interlock);
1295            vp->v_usecount--;
1296            if (vp->v_usecount > 0) {
1297                    simple_unlock(&vp->v_interlock);
1298                    VOP_UNLOCK(vp, 0);
1299                    return;
1300            }
1301    #ifdef DIAGNOSTIC
1302            if (vp->v_usecount < 0 || vp->v_writecount != 0) {
(gdb) l *(lookup+0x44c)
0xfffffc00005c6a4c is in lookup (/local/pop1/bouyer/netbsd-2-0-RELEASE/src/sys/kern/vfs_lookup.c:663).
658             vrele(ndp->ni_dvp);
659     bad:
660             if (dpunlocked)
661                     vrele(dp);
662             else
663                     vput(dp);
664             ndp->ni_vp = NULL;
665             return (error);
666     }
667     

> 
> > Just before the panic there is:
> > vnode: table is full - increase kern.maxvnodes or NVNODE
> > 
> > I don't know if it's related or not.
> 
> It might be. Please also put a diagnostic printf() near line 473 of 
> layerfs_lookup(), in the if (error) case just after layer_node_create(). 
> There may be an inconsistency in how we handle an error in case that 
> create fails.

BTW, I also got 2 reboots on *.fr.netbsd.org. I don't know why;
unfortunately the swap partition is too small for a core dump (this is
what you get when you update hardware :). However, in both cases
I had "vnode: table is full" in the logs a few minutes before the crash.
I bumped kern.maxvnodes, and the box hasn't panicked since then. But at
the same time, the announcement that anoncvs.netbsd.org is usable again
has decreased the load a little bit ... This box is a dual-CPU i386.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference