Subject: kern/5525: timeout too short for Exabyte 820s
To: None <gnats-bugs@gnats.netbsd.org>
From: Chris Jones <cjones@clydesdale.math.montana.edu>
List: netbsd-bugs
Date: 06/01/1998 14:42:31
>Number:         5525
>Category:       kern
>Synopsis:       timeout too short for Exabyte 820s
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jun  1 13:50:01 1998
>Last-Modified:
>Originator:     Chris Jones
>Organization:
	
>Release:        1.3.1
>Environment:
	
System: NetBSD clydesdale.math.montana.edu 1.3.2 NetBSD 1.3.2 (CLYDESDALE) #1: Mon Jun 1 13:17:12 MDT 1998 cjones@rupert.honors.montana.edu:/usr/src/sys/arch/i386/compile/CLYDESDALE i386


>Description:
(Please see my other PR, kern/5514.  It's probably not relevant, but there might
be a connection between the two problems.)

I've got an Exabyte Eliant 820s tape drive.  It is identified as:
<EXABYTE, EXB-85058HE-000, 0096>
When I try to read a tape, it takes approximately two and a half minutes for
the drive to stop flashing its lights and making ominous tape drive noises.  By
that time, of course, the st(4) driver has timed out.  I get:

st0(aic0:2:0): timed out
...and a few minutes later:
aic0: reselect from target 2 lun 0 with no nexus; sending ABORT

The first problem, of course, is that the command times out before the drive
is ready.  This may be a hardware problem, or it may be a quirk of this drive;
I don't know which.

The second problem is that, once the drive reaches this state, it's impossible
to access the device without rebooting the machine it's attached to.  The
process which is accessing the device ends up in state DL+, wchan physio, and
it (the process) won't go away.  Since it's got the device open, I can't access
the device from another process, either.  Based on my feeble understanding of
the SCSI system, it looks like the timeout isn't being handled properly.

>How-To-Repeat:
Not precisely sure.  I can reproduce the problem by rebooting and trying to
read the first tape in my tape changer.  I haven't yet tested whether other
tapes have the same effect in the same drive.

>Fix:
I've compiled a kernel with the three scsipi_command() calls in st.c altered to
use a timeout of 300 seconds.  I haven't tested that kernel yet.
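
For reference, here is a rough sketch of the arithmetic behind the 300 second
figure.  It is not the actual change to st.c.  It assumes the timeout argument
to scsipi_command() is given in milliseconds, and the helper names
(seconds_to_ms, timeout_covers) are made up for illustration only.

```c
#include <stdbool.h>

/* Assumption: the driver's timeout argument is expressed in milliseconds. */
#define MS_PER_SECOND 1000L

/* Hypothetical helper: convert a timeout in seconds to milliseconds. */
long seconds_to_ms(long seconds)
{
	return seconds * MS_PER_SECOND;
}

/*
 * Hypothetical helper: true if a timeout (in ms) is at least as long as
 * the delay observed on the drive (in seconds).
 */
bool timeout_covers(long timeout_ms, long observed_seconds)
{
	return timeout_ms >= seconds_to_ms(observed_seconds);
}
```

The drive needs about 150 seconds (two and a half minutes) to settle, so a
300 second timeout leaves roughly a factor of two in hand, while anything
under 150 seconds would still expire too early.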
>Audit-Trail:
>Unformatted: