Subject: kern/12077: nfs panic when running du -a remotely
To: None <gnats-bugs@gnats.netbsd.org>
From: None <gendalia@iastate.edu>
List: netbsd-bugs
Date: 01/29/2001 04:40:20
>Number:         12077
>Category:       kern
>Synopsis:       nfs panic when running du -a remotely
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jan 29 04:43:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator:     Tracy Di Marco White
>Release:        NetBSD-1.5.1_ALPHA, January 27, 2001 cvs update
>Organization:
>Environment:
System: NetBSD lyra 1.5.1_ALPHA NetBSD 1.5.1_ALPHA (LYRA) #2: Sun Jan 28 11:27:07 CST 2001
root@lyra:/usr/src/sys/arch/i386/compile/LYRA i386

The exported filesystem is a 90GB RAID5 filesystem using FFS+softdep.

>Description:
The panic happened while running 'du -a' on the exported filesystem from a
Linux NFS client.  At first it ran fine, but after a short time it slowed
down and reported I/O errors on a couple of files, at which point we could
no longer log in to lyra, the NFS server.

Lyra showed:
panic: nfsd: locking botch in op 3
Stopped in nfsd at breakpoint +0x4: leave
db> t
breakpoint(c563dd84,c01dfd67,c563dd90,100,c563dde8) at breakpoint+0x4
cpu_debugger(c563dd90,100,c563dde8,c0286f18,c03a7824) at cpu_debugger+0x8
panic(c03a7824,3) at panic+0x73
nfssvc_nfsd(c563de30,804b3a0,c54aa32c) at nfssvc_nfsd+0x69c
sys_nfssvc(c54aa32c,c563df74,c563df6c) at sys_nfssvc+0x161b
syscall() at syscall+0x22a
--- syscall (number 155) ---
0x4809e91f:
db>

I have the crash dump.  I'm not very familiar with what to do with one, but
here is what gdb shows:
(gdb) target kcore netbsd.1.core
panic: nfsd: locking botch in op %d
#0  0x2 in ?? ()
(gdb) bt
#0  0x2 in ?? ()
#1  0xc0318c0b in cpu_reboot ()
#2  0xc01dfe28 in panic ()
#3  0xc0285f18 in nfssvc_nfsd ()
#4  0xc028541f in sys_nfssvc ()
#5  0xc031edee in syscall ()
#6  0xc0100db1 in syscall1 ()
can not access 0xbfbfdd54, invalid translation (invalid PDE)
can not access 0xbfbfdd54, invalid translation (invalid PDE)
Cannot access memory at address 0xbfbfdd54.

>How-To-Repeat:
I'd really prefer not to; fsck and the parity rebuild take a while.
>Fix:
none yet.
>Release-Note:
>Audit-Trail:
>Unformatted: