Subject: Re: Recent fs instability
To: Chris G. Demetriou <cgd@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 10/27/1999 09:52:35
On Mon, Oct 25, 1999 at 06:03:39PM -0700, Chris G. Demetriou wrote:
> Manuel Bouyer <bouyer@antioche.lip6.fr> writes:
> > For the record, I've got problems with my ncr board recently. Hardware was stable
> > before and I didn't change anything. The problem is an 'assertion failed'
> > in the ncr driver, which turns out to be a null pointer. I didn't look
> > closely yet but it seems that this null pointer can't be caused by hardware
> > problem.
>
> what line was it from?
Sorry for the delay, it was at home.
It's line 6731 in ncr.c * (seems to be 6733 now):
    default:
        /*
        **  lookup the ccb
        */
        dsa = INL (nc_dsa);
        cp = np->ccb;
        while (cp && (CCB_PHYS (cp, phys) != dsa))
            cp = cp->link_ccb;
        assert (cp);
        if (!cp)
            goto out;
        assert (cp == np->ncb_dma->header.cp);
        if (cp != np->ncb_dma->header.cp)
            goto out;
    }
This seems to happen for multiple transfers only: I can dd from the raw device
or work a bit on a mounted partition without much trouble. But once there are
some dirty buffers, a 'sync' will reliably trigger the problem.
The odd part is that this failed transfer is not noticed by the upper level:
if I keep working on this fs after this message, I'm sure to get a panic from
the filesystem after a few minutes at best, and fsck shows duplicate blocks.
If I unmount the fs just after the 'assertion failed' message, fsck finds
errors in the allocated inode/block maps.
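
To make the first assertion concrete, here is a minimal stand-alone sketch of
the same list walk. It is not the driver code: struct ccb is a cut-down,
hypothetical stand-in, its phys field stands in for CCB_PHYS (cp, phys), and
lookup_ccb() and lookup_demo() are names made up for this illustration. It only
shows that the walk ends with cp == NULL whenever the value read from the DSA
register matches none of the ccbs on the list, which is the case 'assert (cp)'
reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the driver's ccb structure. */
struct ccb {
    uint32_t    phys;       /* stands in for CCB_PHYS (cp, phys) */
    struct ccb  *link_ccb;  /* next ccb on the list, NULL at the end */
};

/*
 * Same walk as the quoted loop: follow link_ccb until a ccb whose
 * address equals dsa is found.  Returns NULL if none matches.
 */
static struct ccb *
lookup_ccb(struct ccb *head, uint32_t dsa)
{
    struct ccb *cp = head;

    while (cp && cp->phys != dsa)
        cp = cp->link_ccb;
    return cp;
}

/* Small demonstration with two made-up ccbs and made-up addresses. */
void
lookup_demo(void)
{
    struct ccb second = { 0x2000, NULL };
    struct ccb first  = { 0x1000, &second };

    /* A dsa value that belongs to a ccb on the list is found. */
    assert(lookup_ccb(&first, 0x2000) == &second);

    /* A dsa value that belongs to no ccb runs off the end of the list. */
    assert(lookup_ccb(&first, 0x3000) == NULL);
}
```

The sketch says nothing about why the register would hold such a value; it
only isolates the condition the assertion checks.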
--
Manuel Bouyer, LIP6, Universite Paris VI. Manuel.Bouyer@lip6.fr
--