Subject: kern/35728: repeated kernel panics: free: duplicated free (NFS-related)
To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
From: arto@selonen.org
List: netbsd-bugs
Date: 02/20/2007 07:30:01
>Number:         35728
>Category:       kern
>Synopsis:       repeated kernel panics: free: duplicated free (NFS-related)
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 20 07:30:00 +0000 2007
>Originator:     Arto Selonen
>Release:        NetBSD-current 4.99.11 ~20070219
>Organization:
>Environment:
NetBSD blah 4.99.11 NetBSD 4.99.11 (BLAH) #7: Mon Feb 19 14:08:40 EET 2007 blah@blah:/obj/sys/arch/i386/compile/BLAH i386

>Description:
The system is an NFS server serving two 1 TB partitions from a
twelve-disk RAID array (3ware Escalade).

The system was upgraded on February 6th (after kern/35542 was fixed;
earlier history of the system can be found there) and ran without
problems for roughly two weeks. Then on February 18th, it panicked
("panic: free: duplicated free"). Repeated reboots resulted in similar
panics pretty much as soon as the network interface went up.
Booting to single user and turning NFS services off made the system
stable (and the NFS disks inaccessible).

The system was then upgraded on February 18th with whatever sources
anoncvs.fr.NetBSD.org gave (anoncvs.netbsd.org repeatedly timed out
for CVS_RSH=ssh access attempts), and then NFS services were turned
back on. After a reboot, once the network interface came up, it
panicked again.

At the moment, I don't have any network traces for possible client
traffic, but I have a "db> reboot 0x104" crash dump of the latest panic,
and the following function call trace (just to give an idea of what is
going on):

multiply freed item 0xc105c000
panic: free: duplicated free
Stopped in pid 543.1 (nfsd)
db> tr
cpu_Debugger
panic
free
nfssrv_readdir
nfssvc_nfsd
sys_nfssvc
syscall_plain
--- syscall (number 155) ---
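
For readers unfamiliar with this class of panic, the following is a
minimal user-space sketch of how an allocator can notice that the same
item is released twice. It is an illustration only: pool_alloc,
pool_free and buggy_cleanup are hypothetical names, and nothing here is
taken from the kernel's malloc(9) code or from nfssrv_readdir. The
actual cause of the panic above is not known.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size pool; NOT the NetBSD kernel allocator. */
#define POOL_ITEMS 8
#define ITEM_SIZE  64

static unsigned char pool[POOL_ITEMS][ITEM_SIZE];
static int in_use[POOL_ITEMS];          /* 1 while the item is allocated */

/* Hand out the first unused item, or NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_ITEMS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;
}

/*
 * Release an item.
 * Returns 0 on success, -1 if the item was already free (a duplicated
 * free), and -2 if the address does not belong to the pool.
 * A kernel would typically panic instead of returning -1.
 */
int pool_free(void *p)
{
    for (size_t i = 0; i < POOL_ITEMS; i++) {
        if (p == (void *)pool[i]) {
            if (!in_use[i])
                return -1;              /* multiply freed item */
            in_use[i] = 0;
            return 0;
        }
    }
    return -2;
}

/*
 * One common way such a bug arises (assumed pattern, for illustration):
 * a buffer is released on an error path and then released again by the
 * shared cleanup code. Returns the status of the second release.
 */
int buggy_cleanup(void)
{
    void *buf = pool_alloc();
    int error = 1;                      /* pretend an operation failed */

    assert(buf != NULL);
    if (error)
        (void)pool_free(buf);           /* first release, error path */
    return pool_free(buf);              /* second release: returns -1 */
}
```

In this sketch the second release is reported through a return value,
whereas the kernel in the trace above stops with a panic at the same
point.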

I have the following crash dump available:

-rw-------    1 root     wheel    15801101 Feb 20 09:11 netbsd.4.core.gz
-rw-------    1 root     wheel     1726624 Feb 20 09:11 netbsd.4.gz

Due to privacy concerns, I cannot provide those files, but I'm
willing to follow instructions on how to examine them, if needed.

The kernel config can be found in kern/35542, if needed (it is dated
April 2005).

I can provide further details as needed.
>How-To-Repeat:
Not known; possibly requires certain NFS client traffic to trigger.

Currently, I can trigger this simply by enabling NFS services while
connected to the network. Within seconds I get the panic, so at the moment
repeating this is easy for me.

>Fix:
Not known. As a workaround, turning NFS services off stabilizes the
system, but makes the NFS disks inaccessible.