Subject: dual disks failing
To: None <port-sparc@NetBSD.ORG>
From: James Graham - Systems Mangler <greywolf@defender.VAS.viewlogic.com>
List: port-sparc
Date: 01/18/1996 18:08:42
Hardware:
        SS1+    28MB physmem

Kernel [config at end]:
	NetBSD strikeforce 1.1A NetBSD 1.1A (STRIKEFORCE) #1:\
	Tue Jan 16 16:57:22 PST 1996 \
	root@strikeforce:/usr/src/sys/arch/sparc/compile/STRIKEFORCE sparc


disks:
	sd1 (esp0(0:1:0)):	SUN0424	cyl 1151 alt 2 hd 9 sec 80
	sd3 (esp0(0:3:0)):	SUN0424	cyl 1151 alt 2 hd 9 sec 80
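
As a sanity check on the geometry above, the capacity works out as cylinders x heads x sectors x bytes per sector. This is only a rough sketch: the 512-byte sector size is an assumption (it is not stated in the probe output), the 2 alternate cylinders are left out, and `capacity_bytes` is a made-up helper name.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors, not stated in the probe output


def capacity_bytes(cyl, heads, sectors, sector_bytes=SECTOR_BYTES):
    """Raw capacity implied by a cyl/hd/sec geometry (alternate cylinders excluded)."""
    return cyl * heads * sectors * sector_bytes


if __name__ == "__main__":
    size = capacity_bytes(cyl=1151, heads=9, sectors=80)
    # Decimal megabytes, which is what the "0424" in the drive name appears to refer to.
    print(size, "bytes, about", round(size / 1e6, 1), "MB")
```

With those assumptions both drives come out at roughly 424 MB each, which agrees with the SUN0424 label.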

Symptoms:
	When I tar any large volume, I get the following messages on
	the console (sorry this is so long -- these are the relevant
	parts of the messages file):

Jan 18 16:43:01 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:43:59 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:00 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:01 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x2, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:01 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:02 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x12, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:02 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:44:04 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:04 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x2, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:05 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x2, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:05 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x10, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:06 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x10, dleft 400), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:07 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x10, dleft 1000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:07 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:44:23 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x2, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:38 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:38 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x2, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:39 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft c00), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:40 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x12, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:40 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:44:41 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x0, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:41 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x2, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:42 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:42 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft c00), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:43 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x10, dleft 400), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:43 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x10, dleft 1000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:44 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x10, dleft 2000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:44 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x10, dleft c00), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:45 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:44:45 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:46 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x2, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:46 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft 1400), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:47 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a691c (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:48 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a68c4 (flags 0x10, dleft 1000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:48 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x10, dleft 400), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:49 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x10, dleft 1400), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:49 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a691c (flags 0x10, dleft 2000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:50 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:44:50 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:51 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:44:51 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x10, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:44:52 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x10, dleft 2000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:47:08 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:47:24 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x0, dleft 800), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:47:24 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x2, dleft 800), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:47:25 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:47:25 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:47:26 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a691c (flags 0x10, dleft 800), state 3, phase 257, msgpriq 0, msgout 0) AGAIN
Jan 18 16:47:26 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68f0 (flags 0x10, dleft 800), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:47:27 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x10, dleft 2000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:47:28 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x10, dleft 2000), state 3, phase 0, msgpriq 0, msgout 0) AGAIN
Jan 18 16:48:08 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)
Jan 18 16:48:22 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:48:22 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:48:23 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a691c (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:48:24 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a68f0 (flags 0x2, dleft 800), state 3, phase 257, msgpriq 0, msgout 0)
Jan 18 16:48:24 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x12, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN

	This happened again after the machine panicked and ran its
	fsck -p on the way back up.

The kernel appears to be having difficulty splitting the I/O for some reason.
Unfortunately I missed what the panic message was, but fsck came back
with lots of "<block-number> DUP I=<i-number>" messages.
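
For anyone who wants a per-disk tally rather than the raw messages, something along these lines should do it. It is a rough sketch only: the regular expression is written against the line format shown above and nothing else, `summarize` and `SAMPLE` are made-up names, and the sample holds just four of the lines from the excerpt.

```python
import re
from collections import Counter

# Matches the "timed out" lines in the format shown in the excerpt above.
TIMEOUT_RE = re.compile(
    r"(?P<dev>sd\d+)\(esp\d+:\d+:\d+\): timed out \(ecb (?P<ecb>0x[0-9a-f]+) "
)


def summarize(lines):
    """Count timeouts and 'AGAIN' repeats per disk, plus RESELECT errors."""
    timeouts = Counter()
    repeats = Counter()
    reselects = 0
    for line in lines:
        if "RESELECT:" in line:
            reselects += 1
            continue
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue
        dev = match.group("dev")
        timeouts[dev] += 1
        if line.rstrip().endswith("AGAIN"):
            repeats[dev] += 1
    return timeouts, repeats, reselects


# Four lines copied from the excerpt above.
SAMPLE = [
    "Jan 18 16:43:01 strikeforce /netbsd: RESELECT: 9 bytes in FIFO>esp0: illegal command: 0x12 (state 5, phase 7, prevphase 101)",
    "Jan 18 16:43:59 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x2, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0)",
    "Jan 18 16:44:00 strikeforce /netbsd: sd3(esp0:3:0): timed out (ecb 0xf85a68c4 (flags 0x0, dleft 400), state 3, phase 257, msgpriq 0, msgout 0)",
    "Jan 18 16:44:02 strikeforce /netbsd: sd1(esp0:1:0): timed out (ecb 0xf85a6898 (flags 0x12, dleft 2000), state 3, phase 257, msgpriq 0, msgout 0) AGAIN",
]

if __name__ == "__main__":
    timeouts, repeats, reselects = summarize(SAMPLE)
    print("timeouts:", dict(timeouts))
    print("repeats (AGAIN):", dict(repeats))
    print("RESELECT errors:", reselects)
```

Feeding it the whole messages file instead of `SAMPLE` (for example `summarize(open("/var/log/messages"))`, assuming that is where syslog writes on the machine in question) would show whether one disk is timing out noticeably more than the other.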


Kernel Config File:
# 	$NetBSD: GENERIC,v 1.19 1995/10/08 11:45:39 pk Exp $

machine		sparc

# Number of users
maxusers	16

# CPU options
options		"SUN4C"
#options	MMU_3L
#options	DDB,DEBUG,DIAGNOSTIC

# obsolete timezone spec
options		TIMEZONE=0, DST=0

# Standard system options
options		SWAPPAGER, VNODEPAGER, DEVPAGER	# paging
#options	DEBUG, DIAGNOSTIC	# extra kernel debugging
options		KTRACE			# system call tracing support
#options	KGDB			# support for kernel gdb
#options	KGDBDEV=0xc01, KGDBRATE=38400	# device & baud rate
#options		RASTERCONSOLE		# fast rasterop console
options		SYSVMSG,SYSVSEM,SYSVSHM
# Compatibility
options		"COMPAT_09"
options		"COMPAT_10"

# Filesystem options
options		FFS
options		NFSSERVER	# Sun NFS-compatible filesystem
options		NFSCLIENT	# Sun NFS-compatible filesystem
options		KERNFS		# kernel data-structure filesystem
options		FIFO		# POSIX fifo support (in all filesystems)
options		QUOTA		# fast filesystem with user and group quotas
options		MFS		# memory-based filesystem
#options		LOFS		# Loop-back filesystem
#options		FDESC		# user file descriptor filesystem
#options		UMAPFS		# uid/gid remapping filesystem
#options		LFS		# Log-based filesystem (still experimental)
#options		PORTAL		# portal filesystem (still experimental)
options		PROCFS		# /proc
options		CD9660		# ISO 9660 + Rock Ridge file system
options		UNION		# union file system

# Networking options
options		INET
options		TCP_COMPAT_42	# compatibility with 4.2BSD TCP/IP
#options	GATEWAY		# IP packet forwarding
#options	ISO		# OSI networking
#options	TPIP
#options	EON
options		COMPAT_43

options		LKM

# Options for SPARCstation hardware
options		COMPAT_SUNOS		# compatibility with SunOS binaries
#options		COMPAT_SVR4		# compatibility with SVR4 binaries

config		netbsd root on sd3 swap on sd3 and sd1 dumps on sd1

mainbus0 at root
cpu0	at mainbus0

sbus0	at mainbus0

audio0	at mainbus0
auxreg0	at mainbus0
clock0	at mainbus0
memreg0	at mainbus0
timer0	at mainbus0

zs0	at mainbus0
zs1	at mainbus0

# SBUS storage device interface
dma0	at sbus0 slot ? offset ?
esp0	at sbus0 slot 0 offset ?

# Ethernet interfaces
le0	at sbus? slot 0 offset ?

# Frame Buffers
cgsix0	at sbus0 slot ? offset ?

# Establish SCSI
scsibus0 at esp0

# GENERIC drives discarded, using standard SUN configuration
sd0	at scsibus0 target 0 lun 0
sd1	at scsibus0 target 1 lun 0
sd2	at scsibus0 target 2 lun 0
sd3	at scsibus0 target 3 lun 0
st0	at scsibus0 target 4 lun 0
st1	at scsibus0 target 5 lun 0
cd0	at scsibus0 target 6 lun 0

fdc0	at mainbus0
fd0	at fdc0

pseudo-device	loop
pseudo-device	pty	64
pseudo-device	sl	2
pseudo-device	kbd
pseudo-device	ppp	2
pseudo-device	tun	4
pseudo-device	vnd	8
pseudo-device	bpfilter 16


				--*greywolf;
--
"Yikes!" said Wile E. Coyote.
"Beep! Beep!" said the Road Runner.
"" said the SPARC, not encountering the same errors while compiling X11R5.
[ the Sun 386i was also known as the "Road Runner". ]