Subject: kern/14640: kernel hangs in syncing disks
To: None <>
From: None <>
List: netbsd-bugs
Date: 11/19/2001 13:07:46
>Number:         14640
>Category:       kern
>Synopsis:       kernel hangs in syncing disks
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Nov 19 05:08:00 PST 2001
>Originator:     Michael Rauch
>Release:        1.5Y (2001/11/18)
NetBSD i386, syssrc cvs update'd at 2001/11/18 about 12:00 GMT,
custom kernel (mainly GENERIC with unneeded drivers commented out)

	After heavy disk i/o the kernel enters a loop that it never exits.
	Invoking ddb (and switching virtual consoles) is still possible.
	The hang is in the function sched_sync (sys/miscfs/syncfs/sync_subr.c),
	in the first while loop (starting at line 185 in rev. 1.10), which
	executes the following functions over and over:

	| `-> vn_lock
	      `-> VOP_LOCK
	          `-> genfs_lock
		      `-> lockmgr
	  `-> VOP_FSYNC
	      `-> genfs_fsync
	          `-> vflushbuf
	          `-> VOP_UPDATE
		      `-> ext2fs_update
	      `-> genfs_unlock
	          `-> lockmgr
	^	   <--'
	|      <--'
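
	The cycle above can be modelled in user space. The following is only
	a sketch, not the code from sync_subr.c: the names model_vnode and
	drain_slot are hypothetical, and it assumes that the loop exits only
	once the worklist slot is empty and that the fsync step is what is
	supposed to take the vnode off the slot. Under those assumptions, a
	vnode whose fsync never finishes keeps the loop spinning through
	lock, fsync and unlock, which matches the pattern seen in ddb.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode queued on one syncer worklist slot. */
struct model_vnode {
	struct model_vnode *next;
	bool locked;
	bool fsync_completes;	/* assumption: a finished fsync dequeues it */
};

/*
 * Model of the lock -> fsync -> unlock cycle.  Returns the number of
 * passes made.  max_passes exists only so the demonstration terminates;
 * the loop being modelled has no such limit.
 */
static int
drain_slot(struct model_vnode **slot, int max_passes)
{
	struct model_vnode *vp;
	int passes = 0;

	while ((vp = *slot) != NULL && passes < max_passes) {
		passes++;
		vp->locked = true;		/* stands in for vn_lock */
		if (vp->fsync_completes)	/* stands in for VOP_FSYNC */
			*slot = vp->next;
		vp->locked = false;		/* stands in for the unlock */
		/* If vp is still at the head, the next pass picks it again. */
	}
	return passes;
}
```

	With two vnodes that both complete, drain_slot makes two passes and
	leaves the slot empty. With a single vnode that never completes, it
	runs until max_passes is reached and the vnode is still at the head.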

        /dev/wd0a on / type ffs (local)
        /dev/wd0e on /usr type ffs (local)
        /dev/wd0f on /windows type msdos (local)
        /dev/wd0g on /usr/src type ext2fs (local)
	mfs:118 on /tmp type mfs (asynchronous, local)

	The heavy disk i/o was on the /usr/src partition (ext2fs filesystem). 

	Trying to `sync` from within ddb, I get
	    panic: lockmgr: locking against myself
	and drop back into ddb; a second `sync` reboots the machine.
	Slight disk corruption can occur, although fsck mostly reports
	no errors on the disk.

	This problem was also found by others, see
	for the start of the thread.

	Do operations which require a lot of disk i/o. See the system suddenly