Subject: kern/5879: sd doesn't have enough retries or timeout between them for busy
To: None <gnats-bugs@gnats.netbsd.org>
From: Matthew Jacob <mjacob@feral.com>
List: netbsd-bugs
Date: 07/30/1998 16:34:44
>Number:         5879
>Category:       kern
>Synopsis:       sd doesn't have enough retries or timeout between them for busy
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jul 30 16:35:00 1998
>Last-Modified:
>Originator:     
>Organization:
	Feral Software
>Release:        7/22/98
>Environment:
	
System: NetBSD nobble.feral.com 1.3F NetBSD 1.3F (ALPHA) #0: Thu Jul 30 15:46:43 PDT 1998 mjacob@nobble.feral.com:/space/isp_test/arch/alpha/compile/ALPHA alpha


>Description:
(from NAStore MSS3 bug NetBSD/32):

        Something caused some of the megadrives to return BUSY status.
The ahc driver marked the transfer as being in error, but neither the
main SCSI default error handler nor the sd driver (if it intercepts
this error) copes with BUSY; the error is simply passed back up, which
caused user applications to die.

Jul 28 09:23:01 carl /netbsd: sd15(ahc5:4:1): Target Busy
Jul 28 09:23:01 carl /netbsd: ccd0: error 5 on component 5
Jul 28 09:23:01 carl /netbsd: sd15(ahc5:4:1): Target Busy
Jul 28 09:23:01 carl /netbsd: ccd0: error 5 on component 5
Jul 28 09:23:01 carl /netbsd: sd15(ahc5:4:1): Target Busy
Jul 28 09:23:01 carl /netbsd: ccd0: error 5 on component 5
Jul 28 09:23:01 carl /netbsd: sd15(ahc5:4:1): Target Busy
Jul 28 09:23:01 carl /netbsd: ccd0: error 5 on component 5

Note that there was no time interval between the retries (all of the
messages above carry the same timestamp), and there were only 4.

>How-To-Repeat:

Cause a disk device to go busy and stay busy for a couple of seconds
while doing I/O to it. The I/O fails with EIO.

>Fix:
        Fix either scsipi_base.c:sc_err1 or the sd driver so that BUSY
        conditions are retried more times, with a delay between attempts.
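
A rough illustration of the kind of retry policy meant here, written as
a user-space sketch rather than as a patch. Nothing below is taken from
scsipi_base.c or sd.c: the status enum, the retry count, the delay
values, and the callback types are all hypothetical, and a fake target
stands in for the hardware. The only point is the shape of the loop:
BUSY gets its own, larger retry budget and a growing pause between
attempts instead of an immediate resend.

```c
#include <errno.h>

/* Hypothetical status codes; the kernel has its own constants. */
enum xfer_status { XFER_OK, XFER_BUSY, XFER_ERROR };

/* Hypothetical policy knobs, chosen only for illustration. */
#define BUSY_RETRIES    10      /* retries allowed while the target is BUSY */
#define BUSY_DELAY_MS   100     /* pause before the first retry */
#define BUSY_DELAY_MAX  2000    /* upper bound on any single pause */

typedef enum xfer_status (*issue_fn)(void *arg);
typedef void (*delay_fn)(unsigned ms, void *arg);

/*
 * Issue a command, retrying with a doubling delay for as long as the
 * target answers BUSY. Returns 0 on success, or EIO if the target is
 * still BUSY after BUSY_RETRIES retries or reports any other failure.
 */
int
issue_with_busy_retry(issue_fn issue, delay_fn delay, void *arg)
{
	unsigned wait_ms = BUSY_DELAY_MS;
	int retry;

	for (retry = 0; ; retry++) {
		enum xfer_status st = issue(arg);

		if (st == XFER_OK)
			return 0;
		if (st != XFER_BUSY || retry == BUSY_RETRIES)
			return EIO;

		/* Give the target time to recover before resending. */
		delay(wait_ms, arg);
		wait_ms *= 2;
		if (wait_ms > BUSY_DELAY_MAX)
			wait_ms = BUSY_DELAY_MAX;
	}
}

/* Fake target: answers BUSY busy_left times, then succeeds. */
struct fake_target {
	int busy_left;          /* BUSY answers still to be given */
	unsigned slept_ms;      /* total delay requested so far */
	int attempts;           /* commands issued so far */
};

enum xfer_status
fake_issue(void *arg)
{
	struct fake_target *t = arg;

	t->attempts++;
	if (t->busy_left > 0) {
		t->busy_left--;
		return XFER_BUSY;
	}
	return XFER_OK;
}

/* Records the requested delay instead of actually sleeping. */
void
fake_delay(unsigned ms, void *arg)
{
	struct fake_target *t = arg;

	t->slept_ms += ms;
}
```

With these made-up numbers, a target that stays busy for a couple of
seconds is ridden out instead of producing EIO, while a target that
never recovers still fails after a bounded number of attempts. Whether
the delay belongs in the generic error handler or in sd is the open
question above.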
>Audit-Trail:
>Unformatted: