netbsd-bugs: kern/5684: vm_fault in ip

Subject: kern/5684: vm_fault in ip_reass
To: None <gnats-bugs@gnats.netbsd.org>
From: Manuel Bouyer <Manuel.Bouyer@lip6.fr>
List: netbsd-bugs
Date: 07/01/1998 16:10:33

>Number:         5684
>Category:       kern
>Synopsis:       vm_fault in ip_reass
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jul  1 07:20:01 1998
>Last-Modified:
>Originator:     bouyer@rp.lip6.fr (Manuel Bouyer)
>Organization:

LIP6, Universite Paris VI.

>Release:        NetBSD 1.3.2
>Environment:
	
System: NetBSD garfield.lip6.fr 1.3.2 NetBSD 1.3.2 (GARFIELD) #2: Fri Jun 26 09:39:11 MEST 1998 bouyer@garfield.lip6.fr:/usr/src/sys/arch/i386/compile/GARFIELD i386


>Description:
	
	I have a PC here with several NFS-exported partitions on 2 IDE disks:
	/dev/wd0a on / type ffs (local)
	/dev/wd0e on /usr type ffs (local)
	/dev/wd0g on /images type msdos (NFS exported, local)
	/dev/wd0h on /cd1 type ffs (NFS exported, local)
	/dev/wd1e on /cd2 type ffs (NFS exported, local)
	/dev/wd1f on /cd3 type ffs (NFS exported, local)
	/dev/wd1g on /archives type ffs (local)
	pid154@garfield:/auto on /auto type nfs
	hera:/comptes on /a/hera/comptes type nfs

	Today I had 3 NFS transfers running (2 cpio -p and one mkisofs)
	when the machine panicked. Here is the stack trace (copied by hand):

	vm_fault(0xf0756e00, deadb000, 1, 0) -> 1
	kernel: page fault trap, code=0
	Stopped at _ip_reass+0x8: movl 0x8(%ebx), %eax

	_ip_reass(f0785cc0, f0703500, 0, f0101c14,4)+0x8
	_ipintr(1f, 1f, 243592be, 1852ef71,a74ed382)+0x39b
	Bad frame pointer: 0xf20ebfa8

	Kernel image and core dump available on request.

	Before this panic, the machine logged some "mb_map full" messages.
	If I get another panic, I'll try increasing NMBCLUSTERS to see
	whether it helps.


>How-To-Repeat:
	
	Seems hard to reproduce. I have other NetBSD NFS servers here,
	which have never panicked this way. Maybe it's related to the
	NFS-exported msdos filesystem, but that seems hard to believe.
>Fix:
	Unknown, sorry. Sounds like an mbuf chain or the fragment queue got
	corrupted somewhere.
>Audit-Trail:
>Unformatted: