Subject: Some good news and some scsi news.
To: None <>
From: Ian Dall <dall@HFRD.DSTO.GOV.AU>
List: port-sun3
Date: 01/15/1996 23:01:40
Trent McNair <> writes:

  > -However- 
  > Now when I boot the miniroot I have problems.  I get alot -o- garbage 
  > that looks alot like this:

  > asTdsLKJkhlHkl :not found (failed grep stuff?)

Yes, this is like what I saw. Try pressing L1-A before responding to
the prompt asking which shell to use. Then set si_options to 0 and
continue. I found this is slow but works.

I have just spent the evening investigating why DMA doesn't work
for me. Here is what I did. I have a couple of utilities I wrote
called gen-seq and check-seq which I have previously found useful
for debugging scsi drivers.
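
For anyone without such tools, the general idea can be sketched as
below. This is not the actual gen-seq/check-seq (their details are not
given here); the pattern of consecutive 32-bit counters and the names
gen_seq and check_seq are assumptions made purely for illustration.

```python
import struct

# Assumed pattern: consecutive 4-byte big-endian counters, so the
# expected contents at any offset can be recomputed independently.
WORD = struct.Struct(">I")


def gen_seq(nbytes, start=0):
    """Return nbytes of test pattern beginning with counter value `start`."""
    if nbytes % WORD.size:
        raise ValueError("nbytes must be a multiple of 4")
    return b"".join(
        WORD.pack((start + i) & 0xFFFFFFFF) for i in range(nbytes // WORD.size)
    )


def check_seq(data, start=0):
    """Return the byte offset of the first word that breaks the
    sequence, or None if the whole buffer matches the pattern."""
    for i in range(len(data) // WORD.size):
        (value,) = WORD.unpack_from(data, i * WORD.size)
        if value != (start + i) & 0xFFFFFFFF:
            return i * WORD.size
    return None


if __name__ == "__main__":
    pattern = gen_seq(1024)
    damaged = bytearray(pattern)
    damaged[512] ^= 0xFF  # simulate one corrupted byte
    print(check_seq(pattern), check_seq(bytes(damaged)))
```

In practice the generated pattern would be written to the scratch
partition or tape and the data read back would be passed to the
checker; the offset of the first mismatch hints at where a transfer
went wrong.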

I repartitioned so there is a small free partition I can use for testing.
Then I booted single user using the unpatched netbsd (si_options = 3).
Extensive testing reading and writing to /dev/nrst0 and /dev/rsd2h
showed no problem. However, on a hunch, I tried writing to the
block device, /dev/sd2h. And bingo, when I tried to read the
data back, it told me that /bin/dd is not an executable.
Rebooting and reading /dev/sd2h or /dev/rsd2h shows that the
disk has not in fact been corrupted (although it certainly
does get corrupted if I mount the disk rw when using DMA).

Trent, I'd be interested to see if you can duplicate this
experiment since your problem looks similar.

My guess is that the write to the block device using DMA is
resulting in corrupt data structures in the kernel, maybe
the block io cache. I'd say the DMA IO as such is working fine
as evidenced by the DMA to the raw device working. There is
no evidence of any problem doing reads. One possibility is that
the DMA is writing past the end of some structure which happens
to be harmless for raw IO but not for block IO. (But you'd expect
such corruption to be more likely on reads than on writes, since
on a read it is the DMA that stores into memory.)
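
One generic way to test the "writing past the end" idea is to put a
known guard pattern on either side of a buffer and check it after the
transfer. The sketch below only simulates this in user space; the guard
value and the names make_guarded, transfer and guards_intact are
hypothetical and have nothing to do with the actual si driver code.

```python
# Assumed guard pattern placed before and after the data area.
GUARD = b"\xde\xad\xbe\xef" * 4


def make_guarded(size):
    """Return a buffer of `size` data bytes fenced by guard patterns."""
    return bytearray(GUARD) + bytearray(size) + bytearray(GUARD)


def transfer(buf, payload, overrun=0):
    """Stand-in for a DMA transfer: copy payload into the data area,
    followed by `overrun` stray bytes to mimic a transfer that runs long."""
    start = len(GUARD)
    data = payload + b"\xff" * overrun
    if start + len(data) > len(buf):
        raise ValueError("simulated transfer exceeds the whole buffer")
    buf[start:start + len(data)] = data


def guards_intact(buf):
    """True if both guard regions still hold the original pattern."""
    return (bytes(buf[:len(GUARD)]) == GUARD
            and bytes(buf[-len(GUARD):]) == GUARD)


if __name__ == "__main__":
    buf = make_guarded(512)
    transfer(buf, b"\x00" * 512)
    print("exact transfer, guards intact:", guards_intact(buf))
    transfer(buf, b"\x00" * 512, overrun=2)
    print("2-byte overrun, guards intact:", guards_intact(buf))
```

A damaged trailing guard after an otherwise correct transfer would
point to an overrun rather than to bad data on the bus.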

Why others don't see this (except perhaps for Trent McNair's problem,
which sounds similar) I don't know. Here are the things which are
in any way unusual about my configuration:

 o My 3/50 has 12 MB via a 3rd party extender.

 o My disk is at scsi ID 1, not 0.

 o I have a rather full scsi bus with three disks, two tapes and three
   hosts on it. The results are the same with the other hosts off. While
   somewhat long (but within spec) and complex, this scsi
   configuration has been very reliable, both with SunOS on the 3/50
   and still is via the other hosts, one of which has a dp8490 which
   is a ncr5380 superset. Indeed, that is how I got the miniroot

   If there are per-unit scsi data structures, it is conceivable that
   the misbehaviour depends on the scsi ID.