Subject: kern/21423: kernel cannot reboot when process locked in NFS I/O
To: gnats-bugs@gnats.netbsd.org
From: eravin@panix.com
List: netbsd-bugs
Date: 05/02/2003 01:59:10
>Number:         21423
>Category:       kern
>Synopsis:       kernel cannot reboot when process locked in NFS I/O
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri May 02 02:00:01 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Ed Ravin
>Release:        1.5.4_ALPHA
>Organization:
Panix
>Environment:
NetBSD panix6.panix.com 1.5.4_ALPHA NetBSD 1.5.4_ALPHA (PANIX-USER) #0: Mon Dec 23 23:34:58 EST 2002     root@juggler.panix.com:/devel/NO-BACKUPS/release-1.5-20020917/src/sys/arch/i386/compile/PANIX-USER i386
>Description:
A procmail process on one of our hosts, which was accessing a file
over NFS, got into a bad state and could not be killed.  We decided
to reboot the machine to recover.  A message along the lines of
"syncing files...  done" appeared on the (serial) console, but the
machine did not reboot.  It was still answering pings on the network.

We issued an interrupt and got into the debugger - the stack trace
is below:

db> trace
cpu_Debugger(c0b6cea0,e6a9ce34,e6a9ce34,c0b6b038,e70b1b98) at cpu_Debugger+0x4
comintr(c0b79c00) at comintr+0xcd
Xintr4() at Xintr4+0x74
--- interrupt ---
idle(e6a9ce34) at idle+0x21
bpendtsleep(c0b90378,18,c0208460,1f4,0) at bpendtsleep
sbwait(c0b90378,0,c0b90334,c0dfac40,0) at sbwait+0x33
soreceive(c0b90334,e70b1d88,e70b1d38,e70b1d8c,0) at soreceive+0x2b3
nfs_receive(c0dfac40,e70b1d88,e70b1d8c,c0f88f00,c0dfac40) at nfs_receive+0x432
nfs_reply(c0dfac40,e749fe24,c0f99648,e70b1f30,3f9) at nfs_reply+0x52
nfs_request(e749fe24,c0f99600,10,0,c1043480) at nfs_request+0x3cf
nfs_readdirrpc(e749fe24,e70b1f30,c1043480,c901b4c4,c0d6a000) at nfs_readdirrpc+0x6eb
nfs_doio(c901b4c4,c1043480,0) at nfs_doio+0x326
nfssvc_iod(e6a9ce34,0,c0100337,e6a9ce34,e6aa6270) at nfssvc_iod+0x157
start_nfsio(e6a9ce34) at start_nfsio+0xe


We have a crash dump if anyone wants to look at it, or wants to send
us commands to query it.

>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted: