Subject: kern/35542: NFS rename(?) panics (panic: lockmgr: release of unlocked lock!)
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <arto@selonen.org>
List: netbsd-bugs
Date: 02/02/2007 08:05:00
>Number: 35542
>Category: kern
>Synopsis: NFS rename(?) panics (panic: lockmgr: release of unlocked lock!)
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Feb 02 08:05:00 +0000 2007
>Originator: Arto Selonen
>Release: NetBSD-current 4.99.9 ~20070201
>Organization:
>Environment:
NetBSD blah 4.99.9 NetBSD 4.99.9 (BLAH) #4: Thu Feb 1 16:13:51 EET 2007 blah@blah:/obj/sys/arch/i386/compile/BLAH i386
>Description:
The system is an NFS server serving two 1 TB partitions from a twelve-disk RAID array (3ware Escalade).
The system was upgraded on January 25th (the previous upgrade was on November 28th) and ran without problems for roughly a week. Then, on February 1st, it panicked ("panic: lockmgr: release of unlocked lock!"). Repeated reboots resulted in similar panics pretty much as soon as the network interface went up. Booting to single user and turning NFS services off made the system stable (and the NFS disks inaccessible).
The system was then upgraded again on February 1st to whatever sources anoncvs provided at the time, and NFS services were turned back on. After a reboot, it panicked again once the network interface came up.
At the moment, I don't have any network traces of possible client traffic, but I have a crash dump of the latest panic (taken with "db> reboot 0x104") and the following function call trace (just to give an idea of what is going on):
panic: lockmgr: release of unlocked lock!
Stopped in pid 542.1 (nfsd) at netbsd:cpu_Debugger
db> tr
cpu_Debugger
panic
lockmgr
nfs_unlock
VOP_UNLOCK
ufs_inactive
VOP_INACTIVE
vput
nfsrv_rename
nfssvc_nfsd
sys_nfssvc
syscall_plain
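For readers unfamiliar with the message: the panic text says that an unlock was attempted on a lock that was not recorded as held. The following is a toy Python model of that kind of consistency check, not NetBSD's lockmgr code. The names `ToyLock`, `LockPanic` and `double_release` are hypothetical, and a double release is only one assumed way to trip such a check, not a diagnosis of this crash.

```python
class LockPanic(Exception):
    """Stands in for a kernel panic in this toy model."""


class ToyLock:
    """Hypothetical exclusive lock that only tracks whether it is held."""

    def __init__(self):
        self.held = False

    def acquire(self):
        if self.held:
            raise LockPanic("lock already held")
        self.held = True

    def release(self):
        # Consistency check of the kind the panic message reports:
        # releasing a lock that is not currently held is a caller bug.
        if not self.held:
            raise LockPanic("release of unlocked lock!")
        self.held = False


def double_release():
    """Release the same lock twice and return the resulting panic text."""
    lock = ToyLock()
    lock.acquire()
    lock.release()      # first release: the lock was held, so this is fine
    try:
        lock.release()  # second release: the lock is no longer held
    except LockPanic as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print("panic: lockmgr:", double_release())
```

In this model the second `release()` fails because the first one already cleared the held state. Whether the real panic comes from a double unlock, an unlock of a vnode that was never locked, or something else entirely cannot be told from the trace alone.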
Purely guessing from the trace and recent source changes, using simple string matching, this might have something to do with e.g. the following commits (of course I could be way off here, as I have no idea of their functional relevance; this comes purely from browsing commit messages from December-January for "relevant" strings):
http://mail-index.netbsd.org/source-changes/2006/12/27/0030.html
http://mail-index.netbsd.org/source-changes/2007/01/01/0030.html
http://mail-index.netbsd.org/source-changes/2007/01/07/0045.html
http://mail-index.netbsd.org/source-changes/2007/01/07/0046.html
I have the following crash dump available:
-rw------- 1 root wheel 10021802 Feb 2 09:13 netbsd.2.core.gz
-rw------- 1 root wheel 1732192 Feb 2 09:13 netbsd.2.gz
Due to privacy issues, I cannot provide those files, but I'm willing to follow instructions on how to access them, if needed.
Kernel config has not been touched in over a year:
include "arch/i386/conf/std.i386"
options INCLUDE_CONFIG_FILE
maxusers 32
options I686_CPU
options VM86
options MTRR
options INSECURE
options RTC_OFFSET=0
options NTP
options KTRACE
options SYSTRACE
options SYSVMSG
options SYSVSEM
options SYSVSHM
options P1003_1B_SEMAPHORE
options NMBCLUSTERS=16384
options LKM
options USERCONF
options BEEP_ONHALT
options DIAGNOSTIC
options DEBUG
options KMEMSTATS
options DDB
options DDB_ONPANIC=1
options DDB_HISTORY_SIZE=512
makeoptions DEBUG="-g"
options COMPAT_16
options COMPAT_BSDPTY
file-system FFS
file-system EXT2FS
file-system LFS
file-system MFS
file-system NFS
file-system CD9660
file-system MSDOSFS
file-system KERNFS
file-system NULLFS
file-system OVERLAY
file-system PORTAL
file-system PROCFS
file-system UMAPFS
file-system UNION
options QUOTA
options SOFTDEP
options NFSSERVER
options GATEWAY
options INET
options IPSEC
options IPSEC_ESP
options PPP_BSDCOMP
options PPP_DEFLATE
options PPP_FILTER
options PFIL_HOOKS
options IPFILTER_LOG
options IPFILTER_DEFAULT_BLOCK
options MIIVERBOSE
options PCIVERBOSE
options USBVERBOSE
options PNPBIOSVERBOSE
options WSEMUL_VT100
options WS_KERNEL_FG=WSCOL_GREEN
options WSDISPLAY_COMPAT_PCVT
options WSDISPLAY_COMPAT_SYSCONS
options WSDISPLAY_COMPAT_USL
options WSDISPLAY_COMPAT_RAWKBD
options PCKBD_LAYOUT="(KB_SV | KB_NODEAD)"
options PCDISPLAY_SOFTCURSOR
<skipped devices>
pseudo-device crypto
pseudo-device md 1
pseudo-device vnd 4
pseudo-device bpfilter 8
pseudo-device ipfilter
pseudo-device loop
pseudo-device ppp 8
pseudo-device tap
pseudo-device tun 2
pseudo-device gif 4
pseudo-device vlan
pseudo-device pty
pseudo-device rnd
pseudo-device clockctl
pseudo-device wsmux
pseudo-device wsfont
pseudo-device ksyms
I can provide dmesg output if needed.
Anything else I could provide or test to help get this problem fixed?
>How-To-Repeat:
At the moment, this is very repeatable, as the system goes down as soon as I bring it up. I have no idea of the cause, so I don't know whether it will stay repeatable (for example, if an NFS client is sending bad data, it might decide to stop sending it).
>Fix:
Turn off NFS services to keep the system up. No known fix for keeping NFS services going, though.