Subject: Re: SLC woes (and tape woes)
To: None <greywolf@defender.vas.viewlogic.com>
From: Greg Earle <earle@isolar.Tujunga.CA.US>
List: port-sparc
Date: 03/28/1996 01:48:16
> SPARCstation SLC
> 16 MB Memory
> 2 disks:
> 	sd1 at esp(0:1:0): SUN0327
> 	sd3 at esp(0:3:0): SUN0669
> CD-ROM	cd0 at esp(0:6:0)
> tape	st0 at esp(0:4:0): rogue, variable density
> ["actually it's an EXB-8200, but I'm not telling HIM that!"]

I *will* get around to fixing this ... if people don't mind having density
sensing used to determine what type a tape drive is ... (but hey, anything's
better than hard-wiring "set type to 0x7 which doesn't mean anything", eh?)

> Made all the filesystems under StunOS (4.1.3_U1 FWIW).

My canonical policy is to only make the root filesystem an "old" SunOS-created
one.  Once you've got the bare-bones on, you can use the NetBSD "newfs" to do
the rest.  Whether that helps or not, of course ...

> 1)	NetBSD-current960309 sees all the devices just fine unless I make
> 	an attempt to fsck anything on sd3, at which point it decides it
> 	can't write any of the superblocks.  sd1 works just fine.
> 	NetBSD refuses to mkfs any of the filesystems on sd3.  It comes
> 	back with 'wtfs: blk [sizeof(fs) - 1]: write failed: error 0'
> 	or something similar.
> 
> 2)	esp0: RESELECT: 9 bytes in FIFO!
> 	'Nuff said.  I evidently have a buggy SCSI chip or something
> 	for which there is currently no workaround that I know of.

Can I make a suggestion?  Just for fun?  Try installing NetBSD/SPARC 1.0 on
it and see what happens.  The SCSI driver on it may not be MI, but it seems
to be damn stable on my SS2+Weitek box here (modulo the extremely rare
"stuck in D state" disk wait hangs that seem to afflict both 1.0 and 1.1/1.1B).

Maybe I'm wrong, but anecdotal evidence suggests that the 1.0 sd driver is more
stable (if not "better") than the one in 1.1/current.  As such, it seems like
a good baseline to work against, insofar as it might help isolate whether the
problems are in the disk driver or in the hardware ("buggy SCSI chip") ...

> 3)	dd if=/dev/nrst0 bs=20b fails with a complaint about the
> 	block size being too big for the tape drive(r).

On my freshly-built-yesterday P120, I get

scipio# dd if=/dev/nrst0 bs=20b of=/tmp/tapefile
dd: /dev/nrst0: Input/output error

(Always get this on first access?)

scipio# dd if=/dev/nrst0 bs=20b of=/tmp/tapefile
dd: /dev/nrst0: Input/output error
0+0 records in
0+0 records out
0 bytes transferred in 12 secs (0 bytes/sec)
scipio# Mar 28 01:16:30 scipio /netbsd: st0: 32768-byte record too big
Mar 28 01:16:30 scipio /netbsd: st0: 32768-byte record too big

I made the tape (level 0 dumps) with a 63k blocksize ... what is the "right"
size to use?  In SunOS, it's 63k because that's the minphys limit.  The
tape documentation in NetBSD is, uh, rather scant ...
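
For reference, here is the arithmetic on the two sizes, as a rough sketch.
dd's "b" suffix means 512-byte blocks and "k" means 1024 bytes, so bs=20b
asks for 10240 bytes per read, while the dumps were written as 64512-byte
records.  As far as I know, a variable-block drive wants each read to be at
least as large as the record on tape, so a matching read would use bs=63k.
That last command is an untested assumption (hence commented out); I don't
know whether the st driver will accept a request that large.

```shell
# Sizes in bytes for the two block sizes discussed above.
# dd's "b" suffix is a 512-byte block; "k" is 1024 bytes.
dd_20b=$((20 * 512))      # what bs=20b requests per read
dump_63k=$((63 * 1024))   # record size the level 0 dumps were written with

echo "bs=20b is $dd_20b bytes"
echo "bs=63k is $dump_63k bytes"

# Hypothetical retry with the read size matched to the write size.
# Untested: it needs the drive and the tape, so it stays commented out.
# dd if=/dev/nrst0 bs=63k of=/tmp/tapefile
```

If the larger read is refused, that would at least show the limit is in the
driver rather than on the tape.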

Then I got:

scipio# mt -f /dev/rst0 rew
Mar 28 01:25:29 scipio /netbsd: st0(ahc0:5:0): illegal request, data = 02 00 00
00 00 22 fb 02 00 00 1e 00 40 00 20 01 05 00
Mar 28 01:25:29 scipio /netbsd: st0(ahc0:5:0): illegal request, data = 02 00 00
00 00 22 fb 02 00 00 1e 00 40 00 20 01 05 00
mt: /dev/rst0: rewind: Invalid argument
Mar 28 01:26:56 scipio /netbsd: st0(ahc0:5:0): Target Busy
Mar 28 01:26:56 scipio /netbsd: st0(ahc0:5:0): Target Busy

Things then begin to look even grimmer:

scipio# restore -ivf /dev/rst0
Verify tape and initialize maps
tape read error: Undefined error: 0
Mar 28 01:42:36 scipio /netbsd: st0(ahc0:5:0): no data found, requested size: 
32768 (decimal), data = 00 00 00 00 00 22 f3 b1 90 90 1e 00 40 00 20 01 05 00
Mar 28 01:42:36 scipio /netbsd: st0(ahc0:5:0): no data found, requested size: 
32768 (decimal), data = 00 00 00 00 00 22 f3 b1 90 90 1e 00 40 00 20 01 05 00

I hope I can get this to work; it's got my full system backups on it  :-)

	- Greg