Subject: RE: isp0 and raidctl problems, another crash
To: Chris Rupnik <crupnik@videotron.ca>
From: Matthew Jacob <mjacob@feral.com>
List: port-alpha
Date: 07/03/2002 10:02:39
It seems to me that something is jamming up the loop. This is a 2100 in
private loop topology, but at some point a command never completes and all
attempts to ask the f/w to abort it fail (that is, the f/w hangs when trying
to abort what should still be, as far as the f/w is concerned, an open
exchange).

NetBSD-1.5.1 is pretty old. Lemme see if I can't build you a 1.5 branch
kernel. Oh, actually, even better, fetch

	http://people.freebsd.org/~mjacob/netbsd.gz

which is a -CURRENT kernel but should work for you anyway, and see if that
helps any.
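
If it helps to check that reading against a console log, here is a rough
sketch (not part of any NetBSD tool) that counts how many 'ABORT' timeouts
from isp precede each component that RAIDframe marks as failed. The function
names and the SAMPLE excerpt are made up for illustration; the message
patterns are copied from the output quoted below. A component that is only
marked failed right after abort timeouts is more likely a victim of the hung
loop than a dead disk, though the script itself cannot prove that.

```python
import re

# Message patterns copied from the console output quoted in this thread.
ABORT_TIMEOUT = re.compile(r"isp\d+: Mailbox Command 'ABORT' failed \(TIMEOUT\)")
MARKED_FAILED = re.compile(r"raid\d+: IO Error\. Marking (\S+) as failed")

# Illustrative excerpt only; feed the real console log in practice.
SAMPLE = """\
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> isp0: Polled Mailbox Command (0x15) Timeout
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> raid0: IO Error. Marking /dev/sd3a as failed.
> isp0: Polled Mailbox Command (0x15) Timeout
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> raid0: IO Error. Marking /dev/sd1a as failed.
"""


def strip_quote(line):
    """Drop leading mail quote markers ('>') and surrounding whitespace."""
    return line.lstrip("> \t").rstrip()


def summarize(log_text):
    """Return (component, abort_timeouts_seen_since_last_failure) pairs."""
    aborts_pending = 0
    events = []
    for raw in log_text.splitlines():
        line = strip_quote(raw)
        if ABORT_TIMEOUT.search(line):
            aborts_pending += 1
            continue
        match = MARKED_FAILED.search(line)
        if match:
            events.append((match.group(1), aborts_pending))
            aborts_pending = 0
    return events


if __name__ == "__main__":
    for component, aborts in summarize(SAMPLE):
        print(f"{component}: marked failed after {aborts} ABORT timeout(s)")
```

On the excerpt above this reports /dev/sd3a after 2 abort timeouts and
/dev/sd1a after 1.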



On Wed, 3 Jul 2002, Chris Rupnik wrote:

>  Oh well,
>   Happened again last night. The first time it happened around 7:20am, this
>  time at 1:10am. Nothing running at the time.
>  Any ideas? I'm planning a 1.6 upgrade anyhow. So, if that is the fix, so be
>  it :)
> 
>  Chris
> 
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> isp0: Polled Mailbox Command (0x15) Timeout
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> raid0: IO Error. Marking /dev/sd3a as failed.
> raid0: node (Rod) returned fail, rolling backward
> raid0: DAG failure: w addr 0x17b0a0 (1552544) nblk 0x10 (16) buf 0xfffffe00011f2000
> isp0: Polled Mailbox Command (0x15) Timeout
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> raid0: IO Error. Marking /dev/sd1a as failed.
> raid0: node (Rod) returned fail, rolling backward
> isp0: Polled Mailbox Command (0x15) Timeout
> isp0: Mailbox Command 'ABORT' failed (TIMEOUT)
> raid0: node (Rod) returned fail, rolling backward
> raid0: DAG failure: w addr 0x17b090 (1552528) nblk 0x2 (2) buf 0xfffffe0001812000
> Multiple disks failed in a single group! Aborting I/O operation.
> Multiple disks failed in a single group! Aborting I/O operation.
> Multiple disks failed in a single group! Aborting I/O operation.
> Multiple disks failed in a single group! Aborting I/O operation.
> [Failed to create a DAG]
> panic: raidframe error at line 455 file ../../../../dev/raidframe/rf_states.c
> Stopped in raid at cpu_Debugger+0x4: ret zero,(ra)
> db> trace
> cpu_Debugger() at cpu_Debugger+0x4
> panic() at panic+0xfc
> rf_State_CreateDAG() at rf_State_CreateDAG+0x224
> rf_ContinueRaidAccess() at rf_ContinueRaidAccess+0xdc
> rf_ContinueDagAccess() at rf_ContinueDagAccess+0x22c
> DAGExecutionThread() at DAGExecutionThread+0x1cc
> esigcode() at esigcode
> --- root of call graph ---
> db>