Subject: kern/8641: chs-ubc2: LFS on vnd can hang all processes
To: None <gnats-bugs@gnats.netbsd.org>
From: None <tls@rek.tjls.com>
List: netbsd-bugs
Date: 10/17/1999 21:50:56
>Number:         8641
>Category:       kern
>Synopsis:       LFS on vnd can hang all processes with UBC kernel
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 17 21:00:01 1999
>Last-Modified:
>Originator:     tls
>Organization:
	The NetBSD Project
>Release:        chs-ubc2 branch as of October 17, 1999
>Environment:
System: NetBSD rekusant.sr.tjls.com 1.4I NetBSD 1.4I (REKTORY) #2: Sun Oct 17 19:09:23 EDT 1999 tls@icmp:/home/tls/kernels/REKTORY alpha
>Description:
	With a chs-ubc2 branch kernel on my AS200, I mounted an LFS on
	a vnd backed by a file on the machine's only real filesystem
	(an FFS).  A trivial shell script then caused a hang: first the
	processes accessing the LFS, then the pagedaemon, and eventually
	every process on the machine.  Attempting to reboot from DDB
	triggered a machine check while executing PALcode!
>How-To-Repeat:
	On a machine with UBC and sufficient memory that the entire
	LFS may reside in the buffer cache (I do not know whether this
	is required, but it was the case on my machine), create a
	small vnd (e.g. 50MB) and newlfs it.  Mount the LFS, change
	to its mount point, and create a shell script similar to the
	one below (which assumes the mount point is /lfs and the
	script is saved as /lfs/foo.sh).  Execute the shell script.
	Wait a few minutes (probably until all 50MB have been touched
	and the shell script has gotten ahead of the cleaner) and all
	I/O on the machine will grind to a halt.
	Then you'll notice that other processes that aren't doing I/O
	(to the LFS or otherwise) are sleeping, too.  Eventually it
	all just stops.  I suspect the pagedaemon may be stuck
	holding some lock in the VM system and this is why even
	processes that aren't trying to touch the LFS filesystem
	eventually hang, but that's just conjecture.


#!/bin/sh

mkdir f
cd f
mkdir f
cd f
mkdir f
cd f
mkdir f
cd f
mkdir f
cd f
mkdir f
cd f
mkdir f
cd f
mkdir f
cd f
dd if=/dev/zero of=test bs=911 count=4099 >/dev/null 2>&1
cd /lfs
rm -rf f
exec /lfs/foo.sh	# name of this shell script, obviously!
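
	For reference, the setup steps described above (create a small
	vnd, newlfs it, mount it) might be sketched roughly as follows.
	This is a sketch under assumptions, not the exact commands I
	ran: the image path, the vnd unit, the partition letter and the
	helper names (run, setup_lfs_vnd) are all placeholders, and
	newlfs may additionally want a disklabel on the vnd.  By
	default it only prints the commands; set DOIT=1 to run them.

```shell
#!/bin/sh
# Sketch only.  IMG, VND, PART and MNT are placeholder values that do
# not come from the report; adjust them for the machine at hand.
IMG=${IMG:-/var/tmp/lfs.img}	# backing file on the FFS (assumed path)
VND=${VND:-vnd0}		# vnd unit to configure (assumed)
PART=${PART:-c}			# partition letter to use (assumed)
MNT=${MNT:-/lfs}		# mount point used by the script above

# Print the command unless DOIT=1, in which case execute it.
run() {
	if [ "${DOIT:-0}" = 1 ]; then
		"$@"
	else
		echo "$@"
	fi
}

setup_lfs_vnd() {
	run dd if=/dev/zero of="$IMG" bs=1024k count=50	# 50MB backing file
	run vnconfig -c "$VND" "$IMG"			# attach file to the vnd
	run newlfs "/dev/r${VND}${PART}"		# may need a disklabel first
	run mkdir -p "$MNT"
	run mount -t lfs "/dev/${VND}${PART}" "$MNT"
}

setup_lfs_vnd
```

	Once the LFS is mounted, copy the looping script above to
	/lfs/foo.sh and start it from /lfs.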


>Fix:
	Unknown.
>Audit-Trail:
>Unformatted: