Subject: Re: kern/18013: NFS kernel crash
To: None <tnn@netilium.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 08/25/2002 16:29:02
On Sun, Aug 25, 2002 at 03:59:26PM +0200, Manuel Bouyer wrote:
> On Wed, Aug 21, 2002 at 07:52:05AM -0700, tnn@netilium.org wrote:
> > 
> > >Number:         18013
> > >Category:       kern
> > >Synopsis:       NFS kernel crash
> > >Confidential:   no
> > >Severity:       critical
> > >Priority:       medium
> > >Responsible:    kern-bug-people
> > >State:          open
> > >Class:          sw-bug
> > >Submitter-Id:   net
> > >Arrival-Date:   Wed Aug 21 07:54:00 PDT 2002
> > >Closed-Date:
> > >Last-Modified:
> > >Originator:     Tobias Nygren
> > >Release:        1.6_RC1
> > >Organization:
> > >Environment:
> > NetBSD megami 1.6_RC1 NetBSD 1.6_RC1 (GENERIC) #0: Mon Aug 19 06:28:34 UTC 2002 autobuild@tgm.daemon.org:/autobuild/i386/OBJ/autobuild/src/sys/arch/i386/compile/GENERIC i386
> > >Description:
> > I'm running a diskless Linux(2.4.19) client nfs mounted to a
> > NetBSD server. When applying (heavy) filesystem i/o,
> > such as makedev, installing packages, kernel compile
> > the server crashes with this output.
> > 
> > This particular dump is from NetBSD 1.6_BETA5,
> > but the problem exists in RC1 as well.
> > 
> > uvm_fault(0xcb185468, 0x38643000, 0, 2) -> e
> > kernel: page fault trap, code=0
> > Stopped in pid 171 (nfsd) at    pool_get+0x199: movl    %eax,     0x4(%edx)
> > db> trace
> > pool_get(c067fd40,2,0,0,cb3b9af4) at pool_get+0x199
> > pool_cache_get(c067fd00,2,80000001,d104,cb3b9af4) at pool_cache_get+0x3b
> > nfs_namei(cb3b9d10,cb3b9b00,6,c0972800,c0b18600) at nfs_namei+0x43
> > nfsrv_rename(c0b2cc00,c0972800,cb180c94,cb3b9df8,cb3b9f80) at nfsrv_rename+0x9f3
> > nfssvc_nfsd(cb3b9e50,804b720,cb180c94,c036257f,cb3b9f80) at nfssvc_nfsd+0x504
> > sys_nfssvc(cb180c94,cb3b9f80,cb3b9f78,c0377494) at sys_nfssvc+0x5e2
> > syscall_plain(1f,1f,1f,1f,bfbfdcbc) at syscall_plain+0xa7
> 
> Can you recompile a kernel with 'options DIAGNOSTIC' ?
> My alpha didn't survive more than a few minutes after I updated it to 1.6,
> it died with:
> panic: kernel diagnostic assertion "startoff < endoff || endoff == 0" failed: file "/home/src/sys/arch/alpha/compile/DISCO/../../../../miscfs/genfs/genfs_vnops.c", line 1041

My stack trace is not interesting:
panic: kernel diagnostic assertion "startoff < endoff || endoff == 0" failed: file "/home/src/sys/arch/alpha/compile/DISCO/../../../../miscfs/genfs/genfs_vnops.c", line 1041

Stopped in pid 139 (nfsd) at    cpu_Debugger+0x4:       ret     zero,(ra)
db> tr
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0x164
__assert() at __assert+0x34
genfs_putpages() at genfs_putpages+0x15c
db> 

but nfsd is definitely involved. The machine panics within seconds
after /etc/rc.d/nfsd start.
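
For reference, the failed assertion is a plain range check on the two
offsets. Below is a minimal standalone sketch of that condition only. It
is not the genfs_vnops.c code: the helper name range_ok and the int64_t
types are hypothetical, and reading endoff == 0 as "up to the end of the
object" is my assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper that mirrors only the asserted condition
 * "startoff < endoff || endoff == 0".
 * Assumption: endoff == 0 is a sentinel meaning "no upper bound",
 * so any startoff is accepted in that case.
 */
static bool
range_ok(int64_t startoff, int64_t endoff)
{
	return startoff < endoff || endoff == 0;
}
```

With this reading, range_ok(0, 4096) and range_ok(8192, 0) hold, while an
empty or inverted range such as range_ok(4096, 4096) or
range_ok(8192, 4096) is the kind of input that would trip the assertion.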

My first attempts to get a core dump failed; trying again.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
--