port-sparc64: Re: crash dump failing on machine with 4GB

Subject: Re: crash dump failing on machine with 4GB
To: Greg Oster <oster@cs.usask.ca>
From: Chris Ross <cross+netbsd@distal.com>
List: port-sparc64
Date: 09/29/2007 23:17:34

On Sep 29, 2007, at 22:04, Greg Oster wrote:
>> On Sep 29, 2007, at 15:19, Martin Husemann wrote:
>>> On Sat, Sep 29, 2007 at 02:53:23PM -0400, Chris Ross wrote:
>>>>   Any idea where the scsipi_xfer gets allocated or "hand-crafted"
>>>> in the cmd_c before esiop_cmd_end() is called?
>>>
>>> sys/dev/scsipi/sd.c:1560
>>>
>>> It has XS_CTL_NOSLEEP|XS_CTL_POLL set in xs_control and I think in
>>> this case the callout should not be touched.
>>
>>   Okay.  Cool.  So, in the case that XS_CTL_POLL is set, it would
>> make sense that there isn't (or "shouldn't be" ?) a callout, right?
>> I'll whack that into my kernel, which will be the "more correct" way
>> to stop that problem, and let me get back to the original problem of
>> figuring out why the crash-dump doesn't work.  :-)
>
> Try adding the line:
>
> 		callout_init(&xs->xs_callout, 0);
>
> after the line:
>
> 		xs->datalen = nwrt * sectorsize;
>
> in sd.c:sddump().
>
> That got rid of the panic for me...

   So, I guess the question here is: since you're seeing this on an ahc,
on an i386, is this an [eo]siop bug or a scsipi bug?  Your solution
assumes it's a bug that affects all (or at least most) things.  It makes
sense to me to fix it the way Martin and Manuel suggested, where you
presume based on the control bits whether the callout is "valid" or
likely to be in use.  However, if that means changing the device
drivers for most or all controllers, maybe what you suggest here would
be better.  I'll leave that decision to people who know the scsi
subsystem better than I do.  But it was something I wanted to mention.
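
   To make the two options concrete, here is a small userland sketch of
the trade-off.  It is not kernel code: the mock_* types and functions,
the two cmd_end_* helpers, and the flag values are all stand-ins I made
up for illustration, and a "false" return only models the place where a
real kernel might trip over an uninitialized callout.

```c
#include <stdbool.h>
#include <string.h>

/* Placeholder flag values for this sketch only; the real
 * definitions live in the scsipi headers. */
#define XS_CTL_NOSLEEP 0x1
#define XS_CTL_POLL    0x2

/* Hypothetical stand-in for a kernel callout. */
struct mock_callout {
    bool initialized;
    bool pending;
};

/* Hypothetical stand-in for a scsipi_xfer. */
struct mock_xfer {
    int xs_control;
    struct mock_callout xs_callout;
};

static void mock_callout_init(struct mock_callout *c)
{
    c->initialized = true;
    c->pending = false;
}

/* Returns false where touching an uninitialized callout would be a bug. */
static bool mock_callout_stop(struct mock_callout *c)
{
    if (!c->initialized)
        return false;
    c->pending = false;
    return true;
}

/* Option 1 (per-driver fix): skip the callout for polled transfers. */
static bool cmd_end_check_poll(struct mock_xfer *xs)
{
    if (xs->xs_control & XS_CTL_POLL)
        return true;            /* polled: no callout was ever armed */
    return mock_callout_stop(&xs->xs_callout);
}

/* Option 2 (caller-side fix): the driver always stops the callout and
 * relies on whoever built the xfer to have initialized it. */
static bool cmd_end_always_stop(struct mock_xfer *xs)
{
    return mock_callout_stop(&xs->xs_callout);
}

/* Models a hand-crafted dump transfer, with or without the extra
 * callout initialization in the code that builds it. */
static struct mock_xfer make_dump_xfer(bool init_callout)
{
    struct mock_xfer xs;

    memset(&xs, 0, sizeof(xs));
    xs.xs_control = XS_CTL_NOSLEEP | XS_CTL_POLL;
    if (init_callout)
        mock_callout_init(&xs.xs_callout);
    return xs;
}
```

   In this model, a dump xfer built without the init call is only safe
with a driver that checks XS_CTL_POLL, while one built with the init
call is safe with either kind of driver, which is why the one-line
change in the caller covers every controller at once.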

   Thanks....

                             - Chris