netbsd-bugs: kern/8506: panic: lfs_write

Subject: kern/8506: panic: lfs_write_inode: negative bytes
To: None <gnats-bugs@gnats.netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 09/28/1999 05:50:58
>Number:         8506
>Category:       kern
>Synopsis:       panic: lfs_write_inode: negative bytes
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Sep 28 05:50:01 1999
>Last-Modified:
>Originator:     Manuel Bouyer
>Organization:
	myself
>Release:        -current as of a week ago
>Environment:
	
System: NetBSD chassiron.ensta.fr 1.4K NetBSD 1.4K (GENERIC) #0: Tue Sep 21 13:40:16 MEST 1999 bouyer@antifer.ipv6.lip6.fr:/share/cvs.netbsd.org/src/sys/arch/i386/compile/GENERIC i386


>Description:
	
	My /home is an LFS filesystem, NFS exported. The NFS client was
	running a 'make build' when a power failure occurred (but I've
	seen this on a standalone client too).
	When power came back, both machines came up properly. But after
	a few minutes I got a
	lfs_write_inode: negative bytes (segment 256 short by 128)
	panic: lfs_write_inode: negative bytes
	and the machine stayed hung at 'syncing disks'.
	Both server and client were idle at this time.

	Then I rebooted single-user and ran a 'fsck_lfs -n /home'. Here's
	the result:
	** /dev/rsd0g (NO WRITE)
	** Last Mounted on /home
	** Phase 0 - Check Segment Summaries
	** Phase 1 - Check Blocks and Sizes
	! INO 36090: daddr 0x80173 is in clean segment 256
	! INO 36111: daddr 0x80173 is in clean segment 256
	! INO 36122: daddr 0x80173 is in clean segment 256
	! INO 36090: daddr 0x80173 is in clean segment 256
	! INO 36111: daddr 0x80173 is in clean segment 256
	! INO 36122: daddr 0x80173 is in clean segment 256
	** Phase 2 - Check Pathnames
	** Phase 3 - Check Connectivity
	** Phase 4 - Check Reference Counts
	UNREF FILE I=33860  OWNER=bouyer MODE=100600
	SIZE=0 MTIME=Sep 21 17:47 1999 
	CLEAR? no

	UNREF FILE I=43249  OWNER=bouyer MODE=100600
	SIZE=49 MTIME=Sep 24 18:27 1999 
	CLEAR? no

	UNREF FILE I=43346  OWNER=bouyer MODE=100600
	SIZE=9321 MTIME=Sep 24 18:27 1999 
	CLEAR? no

	UNREF FILE I=43677  OWNER=bouyer MODE=100600
	SIZE=9321 MTIME=Sep 24 18:28 1999 
	CLEAR? no

	UNREF FILE I=43756  OWNER=bouyer MODE=100600
	SIZE=9321 MTIME=Sep 24 18:24 1999 
	CLEAR? no

	41271 files, 362085 used, 0 free 
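
The Phase 1 warnings above are printed twice and all point at one disk address in segment 256, the same segment named in the panic message. As a reading aid, here is a minimal sketch that deduplicates such warnings and groups the inode numbers by segment and address. The regular expression is inferred only from the lines quoted in this report, not from any documented fsck_lfs output format, and the names `WARNING`, `SAMPLE` and `inodes_in_clean_segments` are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the Phase 1 lines quoted in this report
# (an assumption, not a documented fsck_lfs format).
WARNING = re.compile(
    r"! INO (\d+): daddr (0x[0-9a-fA-F]+) is in clean segment (\d+)"
)


def inodes_in_clean_segments(fsck_output):
    """Map (segment, daddr) to the sorted distinct inode numbers reported there."""
    found = defaultdict(set)
    for line in fsck_output.splitlines():
        match = WARNING.search(line)
        if match:
            ino, daddr, segment = match.groups()
            found[(int(segment), daddr)].add(int(ino))
    return {key: sorted(inos) for key, inos in found.items()}


# The six warning lines exactly as they appear in the fsck_lfs run above.
SAMPLE = """\
! INO 36090: daddr 0x80173 is in clean segment 256
! INO 36111: daddr 0x80173 is in clean segment 256
! INO 36122: daddr 0x80173 is in clean segment 256
! INO 36090: daddr 0x80173 is in clean segment 256
! INO 36111: daddr 0x80173 is in clean segment 256
! INO 36122: daddr 0x80173 is in clean segment 256
"""

if __name__ == "__main__":
    for (segment, daddr), inos in inodes_in_clean_segments(SAMPLE).items():
        print(f"segment {segment}, daddr {daddr}: inodes {inos}")
```

On the output quoted here this reduces the six warnings to three distinct inodes (36090, 36111, 36122) sharing a single address in segment 256.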

	Then I booted multi-user and got the panic again. This time I had
	ddb.onpanic=1, so I got a stack trace:

	lfs_writeinode + 0x465
	lfs_segwrite+0xf2
	sys_lfs_markv + 0x66d
	syscall(185)

	The kernel is in such a state that I've been unable to get a core dump
	('call cpu_reboot' ends with a page fault trap, and 'call dumpsys'
	doesn't do anything good either) :(

	Rebooting and killing lfs_cleanerd finally avoids the panic.
	
>How-To-Repeat:
	
	I guess: "Work on an LFS filesystem, hit the reset button.
	Repeat until your LFS is corrupted enough."

>Fix:
	
	Please :)
	Workaround: killing lfs_cleanerd seems to avoid the panic.
	This way it's possible to back up the LFS and newfs_lfs it again.
>Audit-Trail:
>Unformatted: