Subject: port-sun3/2727: Writing to tape crashes system with 'done < 0; strategy broken' msg
To: None <gnats-bugs@NetBSD.ORG>
From: None <jari@pilvi.fi>
List: netbsd-bugs
Date: 09/01/1996 20:27:12
>Number:         2727
>Category:       port-sun3
>Synopsis:       Writing to SCSI tape crashes system with 'done < 0; strategy broken' message
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Sep  1 13:35:01 1996
>Last-Modified:
>Originator:     Jari Kokko
>Organization:
	
>Release:        1.2_BETA
>Environment:
	
System: NetBSD pilvi 1.2_BETA NetBSD 1.2_BETA (PILVI) #12: Sun Aug 25 21:52:36 EET DST 1996 root@pilvi:/usr/src/sys/arch/sun3/compile/PILVI sun3

	The last sup of the entire source tree was done around Aug 24, 1996.

>Description:
	When writing to the tape drive with dump, tar, etc., the system
	crashes very frequently. The symptoms are plain: the system drops
	to the kernel debugger, printing:

	Sep  1 19:27:01 pilvi /netbsd: st0(si0:4:0): soft error (corrected), data = 00 00 00
	Sep  1 19:27:01 pilvi /netbsd: panic: done < 0; strategy broken

	I ran savecore, and ps on the resulting core dump shows:

	pilvi crash # ps -l -M netbsd.1.core  
	  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT TT       TIME COMMAND
	    0   282     0   0  10  0   448    0 wait   Is   kd    0:01.01 (sh)
	    0   289     0   6  10  0   392    0 wait   I+   kd    0:01.44 (dump)
	    0   290     0   1   2  0   400    0 netio  S+   kd    0:00.99 (dump)
	    0   291     0 3921  -5  0   392    0 -      R+   kd    0:01.68 (dump)
	    0   292     0   7  18  0   392    0 pause  S+   kd    0:01.65 (dump)
	    0   293     0   6  18  0   392    0 pause  S+   kd    0:01.43 (dump)

	My system is a Sun3/60M. It has 24MB of memory, the big mono
	monitor and, perhaps most relevantly, two Sun hatbox 'mass
	storage units', both with Micropolis 1558 disks, and two other
	disks (a 200MB Seagate and a 310MB one whose make I am not
	sure of (CDC?)). These two disks are a recent addition, and
	the tape problem existed before they were added, so I don't
	think they matter. The tape drive is in the first hatbox. I
	think it is an Archive; I am sure it is a QIC-24 60MB drive,
	and it was shipped by Sun with the 3/60. The tape drive works
	(worked) fine under SunOS 4.1.1.

>How-To-Repeat:
	For instance:
	shutdown to single user (optional)
	dump 0ucf /dev/rst0 /dev/sd0a	and change tapes
	dump 0ucf /dev/rst0 /dev/sd0f	and change tapes
	etc.
	I have never managed to complete a dump of both root and /var
	before hitting the problem.
>Fix:
	No idea. I have looked in the sources
	(/src/sys/kern/kern_physio.c) and I can see that not defining
	DIAGNOSTIC would prevent the panic :-)
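
	For illustration only, here is a small user-space model of the
	kind of sanity check that seems to be firing. It is not the
	kernel source. It assumes the check computes the bytes done as
	the requested count minus the residual reported by the driver,
	so a residual larger than the request would give 'done < 0'.
	The names struct xfer, classify_transfer and the XFER_* values
	are made up for this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for the two transfer-accounting fields involved. */
struct xfer {
	long bcount;	/* bytes requested for this transfer */
	long resid;	/* bytes the driver reports as NOT transferred */
};

/* Possible outcomes of the sanity check (names are illustrative). */
enum xfer_status {
	XFER_OK,
	XFER_DONE_NEGATIVE,	/* would correspond to "done < 0; strategy broken" */
	XFER_DONE_EXCEEDS_TODO	/* more bytes reported done than were outstanding */
};

/*
 * Classify one completed transfer.  'todo' is the number of bytes that
 * were still outstanding before this transfer was issued.
 */
enum xfer_status
classify_transfer(const struct xfer *x, long todo)
{
	long done;

	assert(x != 0 && todo >= 0);

	/* Assumed accounting: bytes done = requested - residual. */
	done = x->bcount - x->resid;

	if (done < 0)
		return XFER_DONE_NEGATIVE;
	if (done > todo)
		return XFER_DONE_EXCEEDS_TODO;
	return XFER_OK;
}
```

	In this model a 512-byte request that comes back with a
	residual of 1024 is classified as XFER_DONE_NEGATIVE. If the
	real check works the same way, the thing to look at would be
	the residual the driver reports after the 'soft error
	(corrected)' message, rather than the check itself.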


>Audit-Trail:
>Unformatted: