Subject: Re: crash dump failing on machine with 4GB
To: Chris Ross <cross+netbsd@distal.com>
From: Greg Oster <oster@cs.usask.ca>
List: port-sparc64
Date: 09/30/2007 09:12:26
Chris Ross writes:
> 
> On Sep 30, 2007, at 05:39, Manuel Bouyer wrote:
> >>   So, I guess the question here is, since you're seeing this on an ahc,
> >> on an i386, is this a [eo]siop bug, or a scsipi bug?  Your solution assumes
> >
> > sddump() uses a static scsipi xfer; it's true that the callout isn't
> > initialized here. But see below.
> >
> >> ....  I'll leave that  decision to
> >> people who know the scsi subsystem better than I.  But, something I
> >> wanted to mention.
> >
> > I don't think it's a good thing for the HBA drivers to use the callout when
> > using polled mode. Interrupts are blocked at this point, and they should deal
> > with the timeout in the poll loop. Otherwise, if the device is not responsive,
> > the driver will hang on the command forever.
> 
>    Ahh, okay.  So, Greg's solution, while it will often work, has that risk,
> so it should likely not be used longer term.  Then, I guess the
> recommendation is what you suggested for [e]siop (should osiop have it too?
> oosiop?): do not call callout_stop() from the device driver when
> (XS_CTL_POLL & xs_control).
> 
>    Thanks.  Greg?  Make sense?  :-)

My thoughts are:

1) The xs_callout field of the xs structure is not being initialized 
with the required bits before it might be used.  I'd call that a bug.

2) If it's not a good idea for other drivers to be using callouts in 
polled mode, that's a different problem, but should be fixed as well. 
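
To make the two points concrete, here is a rough user-space model in C.
It is only a sketch, not kernel code: every mock_* name below is made up
for illustration and is not the real scsipi or callout(9) interface, and
the "device" is just a counter. It assumes a driver completion path that
either polls with a bounded loop (when its stand-in for XS_CTL_POLL is
set) or stops a callout that must have been initialized beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real NetBSD definitions. */
#define MOCK_CTL_POLL 0x01          /* plays the role of XS_CTL_POLL */

struct mock_callout {
    bool initialized;               /* set only by mock_callout_init() */
    bool armed;
};

struct mock_xfer {
    int control;                    /* plays the role of xs_control */
    struct mock_callout callout;    /* plays the role of xs_callout */
};

/* Hypothetical device: completes after N polls, never if N is negative. */
struct mock_device {
    int polls_until_done;
};

enum mock_result { MOCK_OK, MOCK_TIMEOUT, MOCK_BAD_CALLOUT };

static void mock_callout_init(struct mock_callout *c)
{
    c->initialized = true;
    c->armed = false;
}

/*
 * Point 1: stopping a callout that was never initialized is a bug.
 * This model only reports it; a real kernel may misbehave instead.
 */
static bool mock_callout_stop(struct mock_callout *c)
{
    if (!c->initialized)
        return false;
    c->armed = false;
    return true;
}

static bool mock_device_done(struct mock_device *d)
{
    if (d->polls_until_done < 0)
        return false;               /* unresponsive device */
    if (d->polls_until_done == 0)
        return true;
    d->polls_until_done--;
    return false;
}

/*
 * Point 2: in polled mode the timeout is handled by bounding the poll
 * loop, and the callout is never touched.
 */
static enum mock_result mock_finish(struct mock_xfer *xs,
    struct mock_device *dev, int max_polls)
{
    assert(xs != NULL && dev != NULL);

    if (xs->control & MOCK_CTL_POLL) {
        for (int i = 0; i < max_polls; i++) {
            if (mock_device_done(dev))
                return MOCK_OK;
        }
        return MOCK_TIMEOUT;        /* give up rather than hang forever */
    }

    /* Non-polled path: the callout must have been initialized. */
    if (!mock_callout_stop(&xs->callout))
        return MOCK_BAD_CALLOUT;
    return MOCK_OK;
}
```

In this model a zero-initialized mock_xfer with MOCK_CTL_POLL set (loosely
like the static xfer in the dump path) completes or times out without ever
touching its callout, while the same xfer without the poll flag trips the
uninitialized-callout check until mock_callout_init() has been called.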

I'm not an expert on this, and will leave the proper fix up to those 
who know better :)

Later...

Greg Oster