Subject: port-shark/22355: Diskfull shark without options NFS will not boot.
To: None <gnats-bugs@gnats.netbsd.org>
From: None <steve@mctavish.co.uk>
List: netbsd-bugs
Date: 08/04/2003 11:44:49
>Number:         22355
>Category:       port-shark
>Synopsis:       Diskfull shark without options NFS will not boot.
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    port-shark-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Aug 04 10:45:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Steve Woodford
>Release:        NetBSD 1.6U
>Organization:
>Environment:
System: NetBSD dungeon.mctavish.co.uk 1.6U NetBSD 1.6U (DUNGEON) #0: Mon Aug 4 09:51:32 BST 2003 steve@oor-wullie.mctavish.co.uk:/sys/arch/shark/compile/DUNGEON shark
Architecture: arm
Machine: shark
>Description:
I run a Shark which boots from an internal hard disk. This Shark does
not need NFS support, so I always disable NFS in the kernel config file.

However, a -current NetBSD/shark kernel without "options NFS" is unable
to mount its root filesystem. It gets as far as:

	boot device: wd0
	root on wd0a dumps on wd0b

And then hangs.

Interestingly, hitting RETURN a few times at this point actually helps
things along to the point where it prints "root file system type: ffs".
Keeping the RETURN key pressed (auto-repeat) results in a bit more
progress (albeit very slowly). It seems like the machine is not seeing
a disk interrupt _unless_ it also gets a serial interrupt. Breaking into
ddb(4) at this point shows the current process stuck in biowait. Hitting
RETURN seems to make the read complete, until the next request.

Quite how the lack of "options NFS" could tickle this is anyone's
guess, unless the NFS code does an splnet() which somehow unwedges
interrupts that are stuck blocked at splbio().
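
To make the suspicion concrete, here is a toy model in C of the kind of
failure I have in mind. It is purely illustrative and hypothetical: none
of these names (IPL_BIO, IPL_NET, splraise, splx_buggy, splx_ok,
deliver_pending, other_interrupt, simulate) are taken from the
NetBSD/shark sources, and I am not claiming this is the actual bug. It
only shows how a disk interrupt that arrives while blocked could stay
pending if the priority level is lowered without replaying it, and then
complete only when some unrelated interrupt causes pending work to be
re-examined.

```c
#include <assert.h>

/* Toy model only: NOT NetBSD/shark code. All names are hypothetical. */
enum { IPL_NONE = 0, IPL_BIO = 1, IPL_NET = 2 };

static int cur_ipl;          /* current interrupt priority level */
static unsigned pending;     /* bitmask of blocked interrupts, by level */
static int disk_reads_done;  /* completed disk reads (biowait wakeups) */

/* Run the disk handler if it is pending and no longer blocked. */
static void deliver_pending(void)
{
    if ((pending & (1u << IPL_BIO)) && cur_ipl < IPL_BIO) {
        pending &= ~(1u << IPL_BIO);
        disk_reads_done++;
    }
}

/* Raise the level (never lower it) and return the previous level. */
static int splraise(int ipl)
{
    int old = cur_ipl;
    if (ipl > cur_ipl)
        cur_ipl = ipl;
    return old;
}

/* Assumed-buggy restore: lowers the level but never replays pending work. */
static void splx_buggy(int s) { cur_ipl = s; }

/* Expected restore: lowers the level and replays what was blocked. */
static void splx_ok(int s) { cur_ipl = s; deliver_pending(); }

/* An unrelated interrupt (e.g. serial input) re-examines pending work. */
static void other_interrupt(void) { deliver_pending(); }

/* Returns how many disk reads completed after `keypresses` serial IRQs. */
int simulate(int buggy, int keypresses)
{
    assert(keypresses >= 0);
    cur_ipl = IPL_NONE;
    pending = 0;
    disk_reads_done = 0;

    int s = splraise(IPL_BIO);   /* block disk interrupts */
    pending |= 1u << IPL_BIO;    /* disk interrupt arrives while blocked */
    if (buggy)
        splx_buggy(s);
    else
        splx_ok(s);

    for (int i = 0; i < keypresses; i++)
        other_interrupt();
    return disk_reads_done;
}
```

In this model, simulate(1, 0) leaves the read incomplete, while
simulate(1, 1) completes it after a single unrelated interrupt, which is
at least consistent with RETURN nudging the boot along. With the
non-buggy restore, simulate(0, 0) completes the read with no key presses.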

>How-To-Repeat:
In a diskfull shark, comment out "options NFS" in the kernel config file
(GENERIC will do nicely) and try to boot. Make sure you have an old kernel
lying around first. ;-)
>Fix:
No idea. At a guess, there's some kind of interrupt priority problem.
>Release-Note:
>Audit-Trail:
>Unformatted: