Subject: Re: kern/21423
To: Ed Ravin <eravin@panix.com>
From: Chuck Silvers <chuq@chuq.com>
List: netbsd-bugs
Date: 01/08/2006 16:30:54
On Sat, Jan 07, 2006 at 05:17:35PM -0500, Ed Ravin wrote:
> On Sun, Nov 06, 2005 at 11:51:55AM -0800, Chuck Silvers wrote:
> > the first release that had that fix was 2.0.2.
> > 2.1 also had it, as will 3.0.
> 
> 21423 is still with us - we just tried to reboot a NetBSD 2.0.3 box that
> had a bad NFS mount (the server was down) with "shutdown -r", and it got
> stuck in the usual place (after the "Shutting down..." message but before
> "Rebooting...").  I had to get into the kernel debugger and issue "reboot 0x4"
> to get the reboot to proceed.

ok, thanks for letting us know.


> Also, it's unclear to me how the problem in 28971 (a problem in readdir)
> could be related to the inability to unmount a bad NFS mount.

the problem in readdir was an infinite loop, and the stack trace you gave
in this PR showed a worker thread waiting for the reply to a readdir RPC.
the two look pretty similar, going by the information you gave.

you didn't mention in the original description that the server was down.
was that also the case with the original instance of the problem that you
reported?  if so, what mount options are you using for this NFS mount?
if you didn't specify a "soft" mount (with the "-s" option), then the client
will retry forever until the server comes back up, which will cause the
behaviour you described, even when the client host is trying to unmount
for a reboot.
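
to make the hard vs. soft difference concrete, here is a toy model in Python.
it is only a sketch, not NetBSD code: the name nfs_call, the server_answers
list and the retrans parameter are made up for illustration, and the real
client's timeout and retry accounting is more involved.

```python
def nfs_call(server_answers, retrans=None):
    """Toy model of one NFS RPC being retried by the client.

    server_answers: iterable of booleans, one per attempt; True means
                    the server replied to that attempt (hypothetical input).
    retrans:        None models a hard mount (never give up);
                    an integer models a soft mount that gives up after
                    that many retransmissions (hypothetical parameter).
    """
    attempts = 0
    for answered in server_answers:
        attempts += 1
        if answered:
            return ("ok", attempts)
        # soft mount: the first try plus `retrans` retransmissions have failed
        if retrans is not None and attempts > retrans:
            return ("timed out", attempts)
    # the simulation ran out of attempts; a hard mount would keep waiting
    return ("still waiting", attempts)


server_down = [False] * 10               # server never replies
print(nfs_call(server_down))             # hard mount: never gives up
print(nfs_call(server_down, retrans=3))  # soft mount: fails after 4 attempts
```

in this model the hard mount is still waiting when the simulation ends, which
corresponds to an unmount at shutdown that never completes, while the soft
mount returns an error that the caller can act on.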

-Chuck