Subject: Re: softdep-related panic two days in a row, 2.0_BETA/i386
To: None <current-users@NetBSD.org>
From: Jeff Rizzo <riz@redcrowgroup.com>
List: current-users
Date: 07/16/2004 11:28:47

Last night I was bored and decided to see if I could trigger this
panic again and get a crash dump this time.  I'm happy to report
success!

The crash dump is quite large, since this machine has 2G of physical
RAM, but it is available, along with the debugging kernel, if anyone
wants to look at it:

http://lychee.tastylime.net/netbsd/PR26274/

Please only download it if you're going to look at it!  :)

I'm poking around in it with gdb right now, but since I've never
touched the softdep code and am not really familiar with FFS, I'm
unlikely to find the root cause on my own, so help is appreciated!

+j

On Tue, Jul 13, 2004 at 03:06:28PM -0700, Jeff Rizzo wrote:
> I have filed a PR:  kern/26274 .

[snip]

> On Tue, Jul 13, 2004 at 11:39:39PM +0200, Thomas Klausner wrote:
> > On Tue, Jul 13, 2004 at 10:08:21AM -0700, Jeff Rizzo wrote:
> > > After a number of months running smoothly (in relation to softdeps, anyway),
> > > I just got a softdep-related panic for the second day in a row with
> > > a similar workload on the machine (./build.sh for -current with -j4).

[snip]

> > The backtrace looks the same, once it started with:
> > panic: allocdirect_merge ob 0 != 407264 || lbn 1 >= 12 || osize 0 != nsize 16384
> > and once with
> > panic: allocdirect_merge ob 0 != 448416 || lbn 1 >= 12 || osize 0 != nsize 16384
> > 
> > The rest is identical.
> > 
> > At the time the machine was doing a ./build.sh without -j
> > and handling consistent network-to-disk traffic of ~100KB/s.
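
Both panic strings quoted above print three comparisons joined by
"||". As a rough aid for reading them, here is a minimal sketch that
parses a line in that format and reports which of the printed
comparisons are true for the printed values. The helper name
tripped_clauses and the labels "ob", "lbn" and "size" are made up for
illustration. The script only reads the message text literally; it
assumes nothing about which kernel structure fields the numbers come
from.

```python
import re

# Format of the two panic lines quoted in this thread. The regex is
# written against those two samples only (an assumption, not taken
# from the kernel source).
PANIC_RE = re.compile(
    r"allocdirect_merge ob (\d+) != (\d+) \|\| lbn (\d+) >= (\d+) "
    r"\|\| osize (\d+) != nsize (\d+)"
)


def tripped_clauses(message):
    """Return labels of the printed comparisons that evaluate true."""
    m = PANIC_RE.search(message)
    if m is None:
        raise ValueError("not an allocdirect_merge panic line")
    ob_left, ob_right, lbn, lbn_limit, osize, nsize = map(int, m.groups())
    clauses = []
    if ob_left != ob_right:      # "ob A != B"
        clauses.append("ob")
    if lbn >= lbn_limit:         # "lbn N >= LIMIT"
        clauses.append("lbn")
    if osize != nsize:           # "osize X != nsize Y"
        clauses.append("size")
    return clauses


first = ("panic: allocdirect_merge ob 0 != 407264 || lbn 1 >= 12 "
         "|| osize 0 != nsize 16384")
print(tripped_clauses(first))  # ['ob', 'size']
```

For both quoted panics the "lbn 1 >= 12" comparison is false, so read
literally it is the block-number and size comparisons that tripped.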

-- 
Jeff Rizzo                                         http://www.redcrowgroup.com/