Subject: Re: kern/35071: panic: mpt_get_request: corrupted request free list (xfer)
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: netbsd-bugs
Date: 12/03/2006 11:10:03
The following reply was made to PR kern/35071; it has been noted by GNATS.

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Tracy Di Marco White <tjd-nb-pr@menelos.com>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org,
	gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/35071: panic: mpt_get_request: corrupted request free list (xfer)
Date: Sun, 3 Dec 2006 12:07:34 +0100

 On Sun, Dec 03, 2006 at 04:34:18AM -0600, Tracy Di Marco White wrote:
 > 
 > In message <20061202185501.GA16429@antioche.eu.org>, Manuel Bouyer writes:
 > >OK, the command resets, and later the chip says it's complete while
 > >we've already freed it. I think we should just issue a bus reset
 > >(or bus_device_reset but it's harder to do) in case of timeout, and
 > >let the controller complete the commands.
 > >
 > >Attached is a patch that attempts to implement a bus_reset function for
 > >mpt(4). You can easily test by starting some I/O (e.g. dd if=/dev/rsdxd
 > >of=/dev/null bs=1m) and while it's running issue several scsictl scsibusx reset
 > >
 > >I expect to see "IOC Bus Reset Port %d" or "External Bus Reset" on console
 > 
 > I occasionally get this:
 > probe(mpt2:0:0:0): command timeout
 > mpt2: timeout on request index = 0xfe, seq = 0x00000068
 > mpt2: Status 0x80000000, Mask 0x00000001, Doorbell 0x24000000
 > mpt2: request state: On Chip
 > 
 > over and over at boot, on different controllers.
 > Now, instead, it seems to hang here instead of repeating.
 > When I get this I need to reboot anyway until I don't get it,
 > as usually whatever is on the scsi chain complaining will not
 > be found.
 
 So when we issue a bus reset, the IOC doesn't abort the pending commands that
 it has in its queue. It's hard to understand how such a rarely-used feature
 works by reverse-engineering other drivers; I'm not even sure it works
 properly in other drivers ...
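 
 For illustration only, here is a toy model of the hazard discussed in this
 thread: a request that times out is freed, and the chip later completes it,
 so it ends up on the free list twice. This is not the mpt(4) code; every name
 below (struct request, on_timeout, on_completion, ...) is made up for the
 sketch. It assumes the approach proposed above: on timeout, keep ownership
 with the controller and only return the request to the free list when a
 completion actually arrives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request states; not the driver's real state names. */
enum req_state { REQ_FREE, REQ_ON_CHIP, REQ_TIMEDOUT };

struct request {
    enum req_state state;
    struct request *next;          /* free-list link */
};

#define NREQ 4
static struct request pool[NREQ];
static struct request *free_list;

/* Put every request on the free list. */
static void pool_init(void)
{
    free_list = NULL;
    for (int i = 0; i < NREQ; i++) {
        pool[i].state = REQ_FREE;
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Take a request off the free list and hand it to the (simulated) chip. */
static struct request *get_request(void)
{
    struct request *r = free_list;

    if (r == NULL)
        return NULL;
    /* Stand-in for a "corrupted request free list" sanity check. */
    assert(r->state == REQ_FREE);
    free_list = r->next;
    r->next = NULL;
    r->state = REQ_ON_CHIP;
    return r;
}

/*
 * Timeout: do NOT free the request, the chip may still complete it.
 * A real driver would ask for a bus reset here (assumption).
 */
static void on_timeout(struct request *r)
{
    if (r->state == REQ_ON_CHIP)
        r->state = REQ_TIMEDOUT;
}

/*
 * Completion from the chip: the only place a request is freed.
 * Returns 1 if the request was freed, 0 if the completion was for a
 * request that is already free (ignored instead of freed twice).
 */
static int on_completion(struct request *r)
{
    if (r->state == REQ_FREE)
        return 0;
    r->state = REQ_FREE;
    r->next = free_list;
    free_list = r;
    return 1;
}

/* Count free-list entries; a double free would make this list cyclic. */
static int free_count(void)
{
    int n = 0;

    for (struct request *r = free_list; r != NULL; r = r->next)
        n++;
    return n;
}
```

 The weak point of this scheme is exactly the one observed above: if the
 controller never completes the timed-out command after the reset, the
 request stays in the timed-out state forever.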
 
 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --