Subject: SCSI trouble w/ IBM WDS-3100 disk
To: None <port-sparc@NetBSD.ORG>
From: Bert Driehuis <driehuis@indy.knoware.nl>
List: port-sparc
Date: 07/10/1995 00:08:04
Anyone seen this?

I have the weirdest of troubles with NetBSD-current on Sparc (around July
7th), with esp.c dated June 3rd (sorry, forgot to look for the RCS id...).

First, some background.

I have four disks connected to my Tatung Compstation 40 (SPARCstation II
clone) system:

Digital RZ23 (100MB) on target 0 (contains NetBSD root + /usr)
Fujitsu 1GB on ID 1
IBM WDS-3100 (100MB) recently added on ID 2
built-in Seagate 425MB on ID 3 (contains SunOS 4.1.3 root + /usr)

Termination etcetera seem ok. The idea behind the 100MB IBM is to set DESTDIR
to it, build an entirely new system, and when that's done, reboot the
system from disk2 and test it, and only then install over the 100MB RZ23.

SunOS sees the IBM disk just fine. I format'ed it (410 cyls, 2 alt, 4
heads, 128 sectors/track -- it all adds up), newfs'd it, used SunOS's dump
and restore to make a copy of sd0 to sd2 (that works, my boot disk is still
a level 1 FFS disk). Bootblocks with SunOS installboot and I'm all set.
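
For anyone who wants to check the "it all adds up" claim, here is a rough sketch of the arithmetic. It assumes 512-byte sectors and that the 2 alternate cylinders come on top of the 410 data cylinders; neither is stated above, and the function name is made up for illustration.

```python
# Sanity check of the geometry handed to SunOS format.
# Assumptions (not from the drive's data sheet): 512-byte sectors,
# and the 2 alternate cylinders are in addition to the 410 data cylinders.
SECTOR_BYTES = 512


def capacity_bytes(cylinders, heads, sectors_per_track, sector_bytes=SECTOR_BYTES):
    """Return the raw capacity in bytes for a cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track * sector_bytes


data_bytes = capacity_bytes(410, 4, 128)       # usable data cylinders only
total_bytes = capacity_bytes(410 + 2, 4, 128)  # including the alternates

print("data cylinders:  %d bytes (%.1f MiB)" % (data_bytes, data_bytes / 2.0**20))
print("with alternates: %d bytes (%.1f MiB)" % (total_bytes, total_bytes / 2.0**20))
```

Under those assumptions the data area comes out a little over 100MB, which is in line with what the drive is sold as.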

Or so I think.

Next I reboot from disk2. It comes up, loads the kernel, and then hangs
when probing esp0 (can't be too specific here; I'm away from my machine and
didn't write this one down properly). Okay, I say to myself, you screwed up
the copy. So I reboot NetBSD from sd0. It fails unless I switch off the IBM
disk:

esp0(2,0): MSGIN failed, trying alt selection
esp: resetting esp0 SCSI bus
esp0: stray interrupt

Next come le0, cgthree0, fdc0 and fd0, and then it hangs hard. L1-A does
nothing; I just have to power down.

I traced the MSGIN message to esp.c. It is preceded by an ominous comment
("hack alert"), which mentions that apparently when reaching this stage,
"the chip" (I take it to mean the esp) didn't grok a multibyte command. So
the question is: which multibyte command? And why? I fear that enabling
debugging printfs in espvar.h may cause heaps and heaps of output, so I'm
trying to avoid jumping into that pit.

Unfortunately, I don't have a working NetBSD-i386 up and running, so I
can't check out the IBM disk to see whether or not the problem is in esp.c,
or in the machine-independent SCSI drivers (after all, the drive can load an
entire kernel before going limp). It also does work under BSD/OS 2.0.

I can live without the disk, but it bugs me that the sucker works on SunOS
and BSDI.

Ideas, anyone?

                                        -- Bert Driehuis

------
Bert Driehuis                 God, grant me the serenity to accept the things
driehuis@utrecht.knoware.nl   I can't change, courage to change the things I
                              can, and the wisdom to know the difference.