Subject: Re: UV3100 SCSI
To: None <port-vax@NetBSD.ORG>
From: Michael Sokolov <msokolov@blackwidow.soml.cwru.edu>
List: port-vax
Date: 06/15/1998 22:36:37
   Daniel A. Seagraves <DSEAGRAV@toad.xkl.com> wrote earlier:
> Nice.  init died with signal 11.
   
   He later told me that what was actually happening was that "DMA
incomplete" messages were pouring in, that different programs were
occasionally getting SIGSEGV, and that init(8) just happened to be one of
them. This pretty much agrees with my own experience, which I will relate
next.
   
   The way I produced this kernel was quite thorny. It seems to me that
the people who designed NetBSD's netbooting mechanism assumed that everyone
is working on a "fake" network with 10.x.x.x-style
IPs, no connection to the Internet, and no one besides themselves working
on it. Being one of the few remaining people for whom VAXen are an actual
job, rather than a hobby, my network is a real live one. (That's CWRUnet,
our campus fiber-optic network.) NetBSD's netbooting mechanism turned
out to be completely unprepared for this. Also, the only machines I could
use as boot servers were my production VAX running Ultrix (that's
blackwidow.cwru.edu, my mail server, FTP server, and everything else) and
my faculty friend's production SPARCs running SunOS. Since all of these are
production machines, disconnecting any of them from CWRUnet and dedicating
it as a NetBSD boot server is not an option. To make a long story short,
netbooting was not an option for me.
   
   What I had to do would probably sound like a nightmare to everyone else
here. First I prepared a TK50 tape with the NetBSD miniroot, just like one
would for a Q-bus system. Then I put a spare RZ23 in a KA42 system (with 16
MB of RAM), connected my TK50Z to it, and booted from my Ultrix tape. This
loaded an Ultrix kernel with a memory-based root filesystem inside it.
Since it was running in memory, I was able to pull the Ultrix tape out and
stick the NetBSD one in. Then I used dd(1) in the memory-based Ultrix root
filesystem to copy the NetBSD miniroot to partition c of the RZ23. (Since
Ultrix and NetBSD use totally different disk label formats, partition c was
my only choice.) Then finally I booted from the RZ23 and watched the NetBSD
kernel (stock v1.3) barf at me due to the lack of working SCSI support for
KA42.
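
   For anyone who wants to picture the dd(1) step, here is a minimal Python
sketch of a block-by-block raw copy. It is only an illustration: the
function name raw_copy, the default block size, and the paths in the usage
note are assumptions made for the example, not what was typed on the Ultrix
side (the exact dd(1) command line is not given in this message).

```python
import os


def raw_copy(src_path, dst_path, block_size=10240):
    """Copy src to dst one block at a time, dd-style.

    Returns the number of bytes copied. An existing destination (for
    example a raw disk partition) is opened without truncation, so any
    bytes beyond the end of the copied image are left untouched.
    """
    # "r+b" overwrites in place; "wb" is only used to create a new file.
    mode = "r+b" if os.path.exists(dst_path) else "wb"
    total = 0
    with open(src_path, "rb") as src, open(dst_path, mode) as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of the source image
                break
            dst.write(block)
            total += len(block)
    return total
```

   A hypothetical call would look like raw_copy("miniroot.img",
"/dev/rsd0c"), where both names are placeholders. Writing to the whole-disk
partition overwrites whatever label or data was there, which is exactly
what the procedure above relies on.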
   
   Then I managed to borrow a KA43 system (with 32 MB of RAM) for a week
from another department. I moved the RZ23 from the KA42 system to the KA43
one and voila! The kernel booted and mounted the miniroot filesystem. Then
I thought that I could just disklabel(8) the RZ23, dd(1) the miniroot from
partition a or c to partition b, reboot it from there, and build my real
root, /usr, and /var filesystems. Not so simple! disklabel(8) refused to
work, causing a kernel panic! Being determined to win, I found a way out. I
borrowed a SCSI ZIP drive from a faculty friend of mine, hooked it up to
the box (KA43), dd(1)'ed the miniroot from the RZ23 to it (partition c on
both), and rebooted it from there. Then I newfs(8)'ed the RZ23 (again
partition c) and put my root, /usr, and /var there. While Ultrix fits
happily on a single RZ23 (with 32 MB of swap space), NetBSD is hungrier. I
had to devote some space to it on blackwidow (my production KA42 system
with 24 MB of RAM and two RZ23s running Ultrix v4.00) and NFS-mount it. I
did that and was ready to compile my kernel.
   
   Upon fixing the (very obvious and silly) bug that prevented NetBSD from
doing SCSI on KA42 right (well, as right as it does on KA43), I went ahead
with compilation. It took two hours. The "DMA incomplete" messages kept
pouring in (indicating that NetBSD doesn't do SCSI completely right even on
KA43 and reducing my goal from "right" to "as right as on KA43"), as did
the "le0: device timeout" messages. No one got a SIGSEGV, though, allowing
my compilation to complete successfully and reassuring me that these
warnings wouldn't make my kernel completely useless.
   
   By sticking my new kernel on the root and rebooting (still on the KA43)
I made sure that it was (at least basically) working. This was expected,
since all of my changes were KA42/41-specific. The real breath-holding test
was to pull the RZ23 out, connect it to the KA42 system, and watch it
there. The kernel started booting, but as soon as I told it what I wanted
my root device to be (sd0c, just like on the KA43 system), it went into the
halt->restart->reboot cycle. Needless to say, I was very disappointed. I
was preparing to give the KA43 system back when, like lightning, the
realization struck me that I had simply forgotten to make one of the
necessary changes. I put the RZ23 back into the KA43 system, booted NetBSD,
made the change I had forgotten originally, and recompiled the kernel
(this time it went quickly as I was changing only one file).
   
   You can probably imagine my joy when I put the RZ23 in the KA42 system
again and the kernel booted just like on the KA43! I played with it for a
little while, and it behaved exactly like on the KA43. The same
all-too-familiar warning messages were pouring in, but they seemed to be
nothing more than warnings. I
did get SIGSEGV a couple of times (on umount), but that was it.
   
   Coming back from this lengthy detour, I see that Daniel A. Seagraves had
pretty much the same experience as I did, making me conclude that my kernel
does work. Of course, all of you are welcome to try it for yourselves.
   
   Have fun!
   
   Sincerely,
   Michael Sokolov
   Phone: 440-449-0299
   ARPA Internet SMTP mail: msokolov@blackwidow.cwru.edu
   
   P.S. In case you have forgotten already, KA42 is the system board
for VS3100 M30/38/40/48, KA41 is a software-indistinguishable KA42
derivative for MV3100 M10/10e/20/20e, and KA43 is the system board for
VS3100 M76. Finally, in the language spoken by VMS and Ultrix developers,
"KA420" means the same thing that "KA42" means in the language spoken by VAX
hardware developers. This is the kind of thing that sometimes makes me
wonder if special translation dictionaries have ever been written to help
novices cross the border between DEC hardware and software. (Think "8600"
vs. "790", "DSSI" vs. "MSI", etc.)
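
   The board-to-model mapping in this P.S. can be written down as a small
lookup table. The Python sketch below covers only the names mentioned here;
the identifiers BOARD_MODELS, SOFTWARE_ALIASES, and models_for are made up
for the illustration, and the table is not meant to be a complete list of
DEC board names.

```python
# System board -> models it is used in, as listed in the P.S. above.
BOARD_MODELS = {
    "KA42": ["VS3100 M30", "VS3100 M38", "VS3100 M40", "VS3100 M48"],
    "KA41": ["MV3100 M10", "MV3100 M10e", "MV3100 M20", "MV3100 M20e"],
    "KA43": ["VS3100 M76"],
}

# Software-side (VMS/Ultrix) name -> hardware-side name.
SOFTWARE_ALIASES = {"KA420": "KA42"}


def models_for(board):
    """Return the models for a board name, accepting either naming style."""
    name = board.upper()
    name = SOFTWARE_ALIASES.get(name, name)
    # Unknown names give an empty list rather than raising an error.
    return BOARD_MODELS.get(name, [])
```

   With this table, models_for("KA420") and models_for("KA42") return the
same list, which is the whole point of the alias.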