Subject: Re: kern/35071: panic: mpt_get_request: corrupted request free list (xfer)
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Tracy Di Marco White <tjd-nb-pr@menelos.com>
List: netbsd-bugs
Date: 12/03/2006 04:34:18

In message <20061202185501.GA16429@antioche.eu.org>, Manuel Bouyer writes:
>OK, the command resets, and later the chip says it's complete while
>we've already freed it. I think we should just issue a bus reset
>(or bus_device_reset but it's harder to do) in case of timeout, and
>let the controller complete the commands.
>
>Attached is a patch that attempts to implement a bus_reset function for
>mpt(4). You can easily test by starting some I/O (e.g. dd if=/dev/rsdxd
>of=/dev/null bs=1m) and while it's running issue several scsictl scsibusx reset
>
>I expect to see "IOC Bus Reset Port %d" or "External Bus Reset" on console

I occasionally get this:
probe(mpt2:0:0:0): command timeout
mpt2: timeout on request index = 0xfe, seq = 0x00000068
mpt2: Status 0x80000000, Mask 0x00000001, Doorbell 0x24000000
mpt2: request state: On Chip

over and over at boot, on different controllers.
Now it seems to hang here rather than repeating.
Either way, when I get this I have to keep rebooting until it
goes away, since whatever on the SCSI chain is complaining
usually will not be found.

-Tracy