Subject: kern/29670: "release of unlocked lock" panic with null fs
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <raeburn@raeburn.org>
List: netbsd-bugs
Date: 03/12/2005 09:28:00
>Number:         29670
>Category:       kern
>Synopsis:       "release of unlocked lock" panic with null fs
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Mar 12 09:28:00 +0000 2005
>Originator:     Ken Raeburn
>Release:        NetBSD 2.0
>Organization:
	MIT
>Environment:
System: NetBSD raeburn.org 2.0 NetBSD 2.0 (THUD) #0: Mon Dec 13 12:23:18 EST 2004 root@thud:/usr/obj/sys/arch/i386/compile/THUD i386
Architecture: i386
Machine: i386
>Description:

Since updating to 2.0, I've seen several kernel panics of this type,
always with a traceback through layer_vnops.c and always soon after a
report of a vnode or inode table being full (I forget which; the
message usually doesn't make it to the log, though it does get
displayed on the console).  The specific operation being done in the
null fs varies -- open or rename, maybe others.

Process listings with "ps -auxg -M netbsd.#.core" show all processes
(except for zombies) in "R" state, so I'm not actually sure which was
running in each case.  I have three "null" mounts, two that are
exported via NFS, and one that's used by an OpenAFS file server, so
they're all likely to be somewhat active.

The kernel also usually hangs "syncing disks" after this; only
occasionally do I get a crash dump to look at.

Here are a couple of tracebacks from crash dumps:

#0  0x1fef0000 in ?? ()
#1  0xc03f33db in cpu_reboot (howto=256, bootstr=0x0)
    at /usr/src/sys/arch/i386/i386/machdep.c:745
#2  0xc0367600 in panic (fmt=0xc0700120 "lockmgr: release of unlocked lock!")
    at /usr/src/sys/kern/subr_prf.c:242
#3  0xc034aae3 in lockmgr (lkp=0xcca97de8, flags=6, interlkp=0xcc879928)
    at /usr/src/sys/kern/kern_lock.c:563
#4  0xc0397d2a in layer_unlock (v=0xcc49bc84)
    at /usr/src/sys/miscfs/genfs/layer_vnops.c:676
#5  0xc03936dc in VOP_UNLOCK (vp=0xcc879928, flags=0)
    at /usr/src/sys/kern/vnode_if.c:1111
#6  0xc0397da9 in layer_inactive (v=0xcc49bcd4)
    at /usr/src/sys/miscfs/genfs/layer_vnops.c:752
#7  0xc0393658 in VOP_INACTIVE (vp=0xcc879928, p=0xcc4594d4)
    at /usr/src/sys/kern/vnode_if.c:1024
#8  0xc038a4d1 in vput (vp=0xcc879928) at /usr/src/sys/kern/vfs_subr.c:1322
#9  0xc038888b in lookup (ndp=0xcc49beb4) at /usr/src/sys/kern/vfs_lookup.c:663
#10 0xc0388288 in namei (ndp=0xcc49beb4) at /usr/src/sys/kern/vfs_lookup.c:172
#11 0xc0392365 in vn_open (ndp=0xcc49beb4, fmode=1, cmode=0)
    at /usr/src/sys/kern/vfs_vnops.c:164
#12 0xc038e09b in sys_open (l=0xcc46c190, v=0xcc49bf64, retval=0xcc49bf5c)
    at /usr/src/sys/kern/vfs_syscalls.c:1128
#13 0xc03fb41a in syscall_plain (frame=0xcc49bfa8)
    at /usr/src/sys/arch/i386/i386/syscall.c:156


#0  0x1fef0000 in ?? ()
#1  0xc03f33db in cpu_reboot (howto=256, bootstr=0x0)
    at /usr/src/sys/arch/i386/i386/machdep.c:745
#2  0xc0367600 in panic (fmt=0xc0700120 "lockmgr: release of unlocked lock!")
    at /usr/src/sys/kern/subr_prf.c:242
#3  0xc034aae3 in lockmgr (lkp=0xcc3e4e54, flags=6, interlkp=0xcc3ed338)
    at /usr/src/sys/kern/kern_lock.c:563
#4  0xc0397d2a in layer_unlock (v=0xcd11bcf4)
    at /usr/src/sys/miscfs/genfs/layer_vnops.c:676
#5  0xc03936dc in VOP_UNLOCK (vp=0xcc3ed338, flags=0)
    at /usr/src/sys/kern/vnode_if.c:1111
#6  0xc0397da9 in layer_inactive (v=0xcd11bd44)
    at /usr/src/sys/miscfs/genfs/layer_vnops.c:752
#7  0xc0393658 in VOP_INACTIVE (vp=0xcc3ed338, p=0xcbe77664)
    at /usr/src/sys/kern/vnode_if.c:1024
#8  0xc038a4d1 in vput (vp=0xcc3ed338) at /usr/src/sys/kern/vfs_subr.c:1322
#9  0xc038888b in lookup (ndp=0xcd11beb4) at /usr/src/sys/kern/vfs_lookup.c:663
#10 0xc0388288 in namei (ndp=0xcd11beb4) at /usr/src/sys/kern/vfs_lookup.c:172
can not access 0xbfbfc9d0, invalid translation (invalid PDE)
can not access 0xbfbfc9d0, invalid translation (invalid PDE)
can not access 0xbfbfc9d0, invalid translation (invalid PDE)
can not access 0xbfbfc9d0, invalid translation (invalid PDE)
can not access 0xbfbfc9d0, invalid translation (invalid PDE)
can not access 0xbfbfc9d0, invalid translation (invalid PDE)
#11 0xc0391408 in rename_files (
can not access 0xbfbfd9e0, invalid translation (invalid PDE)
can not access 0xbfbfd9e0, invalid translation (invalid PDE)
can not access 0xbfbfd9e0, invalid translation (invalid PDE)
can not access 0xbfbfd9e0, invalid translation (invalid PDE)
can not access 0xbfbfd9e0, invalid translation (invalid PDE)
can not access 0xbfbfd9e0, invalid translation (invalid PDE)
    from=0xbfbfc9d0 <Address 0xbfbfc9d0 out of bounds>, 
    to=0xbfbfd9e0 <Address 0xbfbfd9e0 out of bounds>, p=0xcbe77664, retain=0)
    at /usr/src/sys/kern/vfs_syscalls.c:3184
#12 0xc039139b in sys_rename (l=0xcbe7badc, v=0xcd11bf64, retval=0xcd11bf5c)
    at /usr/src/sys/kern/vfs_syscalls.c:3139
#13 0xc03fb41a in syscall_plain (frame=0xcd11bfa8)
    at /usr/src/sys/arch/i386/i386/syscall.c:156
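
Both traces end the same way: lookup() calls vput(), which reaches
layer_unlock() via VOP_INACTIVE and layer_inactive(), and lockmgr()
then finds that the lock it was asked to release (flags=6, which
appears to be LK_RELEASE in sys/lock.h) is not held.  The toy
user-space model below only illustrates that invariant.  It is not
the NetBSD code and not a diagnosis; struct toy_lock, toy_acquire(),
toy_release() and toy_put() are hypothetical names made up for this
sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a kernel lock: it only tracks "held or not". */
struct toy_lock {
	int held;
};

/* Take the lock; returns 0 on success, -1 if it is already held. */
int
toy_acquire(struct toy_lock *lk)
{
	if (lk->held)
		return -1;
	lk->held = 1;
	return 0;
}

/*
 * Release the lock.  Returns -1 at the point where the real
 * lockmgr() panics with "release of unlocked lock!".
 */
int
toy_release(struct toy_lock *lk)
{
	if (!lk->held) {
		fprintf(stderr, "toy: release of unlocked lock!\n");
		return -1;
	}
	lk->held = 0;
	return 0;
}

/*
 * Toy analogue of the vput() path in the traces: the caller is
 * assumed to pass in a locked object, and the put path unlocks it.
 */
int
toy_put(struct toy_lock *lk)
{
	return toy_release(lk);
}
```

In this model, toy_acquire() followed by toy_put() succeeds, while a
second toy_put() on the same object (or a toy_put() on an object that
was never locked) returns -1.  Which of those situations, if either,
corresponds to the real panic is not something the traces show.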

This kernel config is based on GENERIC, with a few tweaks: i486 and
math emulation dropped; diagnostic and debug options added; ipsec
added; OSI, X.25, and AppleTalk dropped; ast0 added and various IDE
controllers dropped; stf and faith pseudo-devices added; and maybe a
couple of other tweaks to the device list.

>How-To-Repeat:

	Use null mounts and run lots of file server stuff out of them,
	and wait?
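
	An untested sketch of a load generator that might shorten the
	wait: it exercises the two operations seen in the tracebacks
	(open and rename) on many files under a directory, which would
	be pointed at a null mount.  The function name churn() and its
	parameters are hypothetical, and there is no evidence yet that
	this triggers the panic.  Since the panics follow a "table
	full" message, a large file count, possibly combined with a
	lowered kern.maxvnodes, may be needed; that is a guess.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `nfiles` files in a scratch directory
 * under `dir`, then rename and reopen each one `rounds` times.
 * Returns the number of failed system calls (0 if all succeeded).
 */
int
churn(const char *dir, int nfiles, int rounds)
{
	char work[1024], a[1100], b[1100];
	int failures = 0;
	int fd, i, r;

	/* Per-process scratch directory so parallel runs don't collide. */
	snprintf(work, sizeof(work), "%s/churn.%ld", dir, (long)getpid());
	if (mkdir(work, 0700) == -1)
		return 1;

	for (i = 0; i < nfiles; i++) {
		snprintf(a, sizeof(a), "%s/f%d.a", work, i);
		snprintf(b, sizeof(b), "%s/f%d.b", work, i);

		fd = open(a, O_WRONLY | O_CREAT | O_TRUNC, 0600);
		if (fd == -1) {
			failures++;
			continue;
		}
		close(fd);

		for (r = 0; r < rounds; r++) {
			/* rename a -> b, open b read-only, rename b -> a */
			if (rename(a, b) == -1)
				failures++;
			fd = open(b, O_RDONLY);
			if (fd == -1)
				failures++;
			else
				close(fd);
			if (rename(b, a) == -1)
				failures++;
		}
		if (unlink(a) == -1)
			failures++;
	}
	if (rmdir(work) == -1)
		failures++;
	return failures;
}
```

	Called from a small main() as, for example, churn(dir, 50000,
	10) with dir naming a directory on one of the null mounts, and
	run as several processes at once, this would approximate a
	busy file server.  The counts are arbitrary.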

>Fix:
	?