Subject: 3100 SCSI disk write hang problem found and fixed!
To: None <port-pmax@NetBSD.ORG>
From: Michael L. Hitch <mhitch@lightning.oscs.montana.edu>
List: port-pmax
Date: 05/21/1997 21:52:58
  I was able to replicate the disk write hang on the DS3100 and found
the bug that was causing the driver to loop in the interrupt routine.
The fix has been committed to -current and should be available after
the next SUP update.

  The driver was not saving the DMA state when the drive disconnected
during a data out operation.  If the incoming message was a SAVE DATA
POINTER message, the driver would try to fetch the next message byte,
which was probably the DISCONNECT message.  Because the DMA state had
not been saved, the SII was not properly sequenced and the driver would
loop trying to get that next message byte.  Some disk drives apparently
disconnected without sending SAVE DATA POINTER, or never disconnected
in the middle of the DMA output, and so had no problems.  Only drives
that disconnected and sent the SAVE DATA POINTER message would cause
the hang.
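
  As a rough illustration of the sequence described above, here is a
toy model in C.  It is not the actual sii driver code: the structure,
the function names, and the idea of returning -1 when the DMA state was
never saved are all made up for this sketch.  Only the message codes
are real (they come from the SCSI standard).

```c
#include <assert.h>
#include <stddef.h>

/* Message codes as defined by the SCSI-2 standard. */
enum scsi_msg {
    MSG_COMMAND_COMPLETE  = 0x00,
    MSG_SAVE_DATA_POINTER = 0x02,
    MSG_RESTORE_POINTERS  = 0x03,
    MSG_DISCONNECT        = 0x04
};

/* Hypothetical transfer state; NOT the real sii driver's structures. */
struct xfer_state {
    size_t total;       /* bytes requested by the command */
    size_t active;      /* active data pointer: bytes moved so far */
    size_t saved;       /* saved data pointer */
    int dma_running;    /* nonzero while a DMA transfer is in flight */
    int disconnected;   /* nonzero while the target is off the bus */
};

/* Begin a data out transfer of `total` bytes. */
static void xfer_start(struct xfer_state *st, size_t total)
{
    st->total = total;
    st->active = 0;
    st->saved = 0;
    st->dma_running = 1;
    st->disconnected = 0;
}

/*
 * The target left the data phase after `moved` bytes.  Halt the DMA and
 * fold its progress into the active pointer.  This models the "save the
 * DMA state" step that the post says was being skipped.
 */
static void xfer_save_dma(struct xfer_state *st, size_t moved)
{
    assert(moved <= st->total - st->active);
    st->active += moved;
    st->dma_running = 0;
}

/*
 * Handle one message-in byte.  Returns 0 on success, or -1 if the DMA
 * state was never saved (this model's stand-in for the chip being out
 * of sequence) or the message is not one this sketch knows about.
 */
static int xfer_message_in(struct xfer_state *st, enum scsi_msg msg)
{
    if (st->dma_running)
        return -1;
    switch (msg) {
    case MSG_SAVE_DATA_POINTER:
        st->saved = st->active;
        return 0;
    case MSG_RESTORE_POINTERS:
        st->active = st->saved;
        return 0;
    case MSG_DISCONNECT:
        st->disconnected = 1;
        return 0;
    case MSG_COMMAND_COMPLETE:
        return 0;
    }
    return -1;
}

/*
 * The target reselected us.  Reconnection implies a restore of the
 * saved pointers; return the number of bytes still to send.
 */
static size_t xfer_reselect(struct xfer_state *st)
{
    st->disconnected = 0;
    st->active = st->saved;
    st->dma_running = 1;
    return st->total - st->active;
}
```

  For example, after xfer_start() with 65536 bytes, calling
xfer_save_dma() with 16384 bytes moved, then passing SAVE DATA POINTER
and DISCONNECT to xfer_message_in(), leaves 49152 bytes to send when
xfer_reselect() is called.  The 16384 figure is arbitrary and chosen
only for illustration.  If xfer_save_dma() is skipped, the first
message is rejected, which is the model's analogue of the hang.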

  I also think that the MAXBSIZE change a few months ago was the
critical change that made this bug show up.  The increase of MAXBSIZE
from 16K to MAXPHYS (64K) resulted in larger DMA transfers, which caused
disconnects from certain drives (like the RZ56).  The drives apparently
could accept the smaller transfers without disconnecting, but could not
take a full 64K write in one piece.

Michael

-- 
Michael L. Hitch			mhitch@montana.edu
Computer Consultant
Information Technology Center
Montana State University	Bozeman, MT	USA