Subject: port-sparc/10148: panic from filesystem corruption for MFS /tmp
To: None <gnats-bugs@gnats.netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: netbsd-bugs
Date: 05/18/2000 07:11:10
>Number:         10148
>Category:       port-sparc
>Synopsis:       panic from filesystem corruption for MFS /tmp
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-sparc-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu May 18 07:12:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Bill Sommerfeld
>Release:        20000517
>Organization:
	none
>Environment:

NetBSD limekiller 1.4Y NetBSD 1.4Y (SIPBV6) #4: Wed May 17 20:17:59 EDT 2000     root@limekiller:/u1/src/sys/arch/sparc/compile/SIPBV6 sparc

>Description:

panic on a sparc-5 running -current as of yesterday including the new
no-context-switch MFS.

the system had been running for about 5-6 hours and was in the middle
of a "make build".  before the panic, the system emitted:

May 18 01:40:29 limekiller /netbsd: free inode /tmp/7936 had 1853059942 blocks


I did not have anything to capture serial console output at the time;
when I connected to the serial console, it was in ddb.

db> x/s *panicstr
deflate_window_in+0x5b78:       blkfree: bad size

The traceback is not all that interesting, except that it's one of the
dozens of "this isn't the block we're looking for" panics in ffs:

ffs_blkfree(0xf50b0528, 0x1, 0x2000, 0xf047f600, 0x0, 0x7ff) at ffs_blkfree+0x7c
ffs_indirtrunc(0x0, 0xa01401f, 0x1, 0x0, 0x0, 0x10) at ffs_indirtrunc+0x228
ffs_indirtrunc(0x0, 0xa0147f4, 0x1, 0x0, 0x1, 0x10) at ffs_indirtrunc+0x200
ffs_indirtrunc(0x0, 0xa3ff7f3, 0x1, 0x0, 0x2, 0x10) at ffs_indirtrunc+0x200
ffs_truncate(0xf5035da8, 0xffffffff, 0x0, 0xf50b05d8, 0xf50e5d48, 0xf50e5d38) at ffs_truncate+0x7d8
ufs_rmdir(0x0, 0x60, 0xf047f600, 0xf00f9ac4, 0xf01a3000, 0xf50a0cc8) at ufs_rmdir+0x218
sys_rmdir(0x0, 0xf5035da8, 0xf50e5f20, 0xf006b5d4, 0x3, 0x0) at sys_rmdir+0x128
syscall(0x89, 0xf50e5fb0, 0x0, 0x1, 0x3ec, 0x0) at syscall+0x1f4
_syscall(0x62000, 0x14, 0x10954, 0x59400, 0x2, 0xf506ff08) at _syscall+0xb8

This looks like a filesystem corruption panic, suggesting that
something in uvm_io didn't quite do the right thing.  I have a crash
dump but my guess is that it's probably not going to be all that
interesting.

As the system is pseudo-production (a 6bone router), it's currently
running without MFS /tmp.

miscellaneous numerology:
(gdb) print/x 1853059942
$1 = 0x6e737366

looks like lower-case ASCII, namely, 'nssf'
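
The same decoding can be reproduced outside gdb.  The following is a
minimal sketch, not part of the original analysis: it assumes the bogus
block count is a 32-bit value read in big-endian byte order (the sparc
is big-endian), and the helper name decode_block_count is made up for
illustration.

```python
import struct


def decode_block_count(value: int) -> str:
    """Interpret a 32-bit value as four ASCII bytes in big-endian order.

    Assumption: the value was read on a big-endian machine, so the most
    significant byte is the first byte in memory.
    """
    raw = struct.pack(">I", value)  # 4 bytes, most significant first
    return raw.decode("ascii")      # raises UnicodeDecodeError if not ASCII


# The block count reported in the "free inode" kernel message.
blocks = 1853059942
print(hex(blocks))
print(decode_block_count(blocks))
```

If the value had been stored little-endian, the same bytes would read
in the reverse order, so the byte-order assumption matters when trying
to match the string against file contents.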

>How-To-Repeat:
	?? run "make build" with MFS /tmp, possibly while someone else
	is attempting a remote backup with gtar.

>Fix:
	unknown.  
>Release-Note:
>Audit-Trail:
>Unformatted: