Subject: Michael Sokolov Cooking: The Ultimate VAX
To: None <classiccmp@classiccmp.org, port-vax@netbsd.org>
From: Gunther Schadow <gunther@aurora.regenstrief.org>
List: port-vax
Date: 02/26/2002 18:13:23
Hello, for those of you who are not (or no longer :-) subscribed to the
Quasijarus list, you still want to read this. This is Michael Sokolov
as we all love him: Passionate and knowledgeable about VAXen, and just
that. The project sounds kinda cool, if he delivers I'll want to get it.

regards
-Gunther



Hi there,

It's me again, your faithful 4.3BSD-Quasijarus maintainer. Sorry that I've been
silent for a while... but guess what, I've got something cool, something that
may raise your hair...

So, what is it? Well, after whining and whining to myself about being deprived
of the past 10 years worth of technological advances (NVAX, the last thing from
DEC as far as classical VAXen go, came out in 1992), I've decided it's time to
stop whining and start doing something about it. We've got to bite the bullet
and design and build a new VAX that would use today's technology, be comparable
in performance to today's CPUs, and still be a fully compliant VAX. Design the
CPU, the system bus, the console and other support facilities, etc. and build
it all on a sparkling shiny board.

Since for the past 10 months I've worked for a company building single-board
computers with today's technology (PowerPC CPUs, system bus at 133 MHz, memory
SDRAM at system bus speed, I/O the fastest PCI chips they could find) and was
involved in most phases of board design and bring-up, I know now what's
involved in building such a beast and I've decided I'll give it a try. The
major difference of course will be having to design the VAX CPU instead of
having a ready-made PowerPC or MIPS or Sexium or whatever chip.

Not having six to seven digit cash to build an ASIC, I'll have to settle for
the CPU on an FPGA. As those are field-programmable, there are no big bucks to
pay someone to fab it, and there's plenty of room for debugging and
experimentation. I'll just have to take the VAX Architecture Reference Manual
and implement it, in VHDL or Verilog or somesuch. Scary, I know. But what the
heck, I'll give it a try.

Part of designing a CPU is designing the bus coming out of it. This brings us
to the next part of our VAX design: what should its system bus be? As you may
remember from discussions on this list a few years ago and from my writings
about VAX hardware on the Quasijarus WWW pages and elsewhere, in order to have
a true VAX, one must have a real VAXist system bus. It has to follow all the
VARM rules as to memory and cache coherency, VAX native memory and I/O address
spaces (512 MB each usually), interlocks, interrupt messaging, transaction
ordering, nice and orderly machine check exceptions when accessing a non-
existent address, etc.
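
As a rough illustration of the address-space rule above, here is a minimal
Python sketch of the classic 30-bit VAX physical address split, with memory
space in the lower 512 MB and I/O space in the upper 512 MB (bit 29 set). The
names classify, populated, and MachineCheck are made up for this example, and
the machine check is only modelled as a Python exception; this is not any real
bus logic.

```python
PA_BITS = 30                      # classic VAX physical address width
IO_BIT = 1 << 29                  # bit 29 set selects I/O space
SPACE_SIZE = 1 << 29              # 512 MB per space


class MachineCheck(Exception):
    """Stand-in for the machine check taken on a non-existent address."""


def classify(pa, populated):
    """Return ('memory' or 'io', offset within that space) for address pa.

    populated is a hypothetical list of (start, end) ranges, end exclusive,
    that some memory or device actually responds to.
    """
    if not 0 <= pa < (1 << PA_BITS):
        raise ValueError("not a 30-bit physical address")
    if not any(start <= pa < end for start, end in populated):
        raise MachineCheck(hex(pa))   # nobody answers: orderly machine check
    space = "io" if pa & IO_BIT else "memory"
    return space, pa & (SPACE_SIZE - 1)
```

With 64 MB of memory and one small device window populated, an address inside
either range is classified by bit 29 alone, and anything else raises
MachineCheck instead of silently returning garbage.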

So, what bus are we going to use? I believe that in order to be a true VAX, we
have to natively support all traditional DEC buses and peripherals with the
top-level bus meeting the above requirements. Having just PCI or somesuch would
make a very poor man's VAX. Our new VAX should of course be top of the line:
completely supersede all earlier VAXen and support everything that DEC ever
supported. This, my friends, means XMI. That's what DEC's last top-of-the-line
VAXen (6600 and 9000) had. XMI gives you a 14-slot backplane which you can
further expand with an XMI-to-VAXBI adapter and a VAXBI-to-UNIBUS adapter, and
between UNIBUS, BI, and XMI you've got the whole DEC universe covered. This
plus how nicely these buses work together plus the closeness of this design to
the original VAX pillars of 780 and 8600 makes me believe that XMI is the way
to go.

So, should we use XMI as our top-level system bus? Not necessarily. While
architecturally beautiful, XMI by its age alone will be a performance
bottleneck in our SuperVAX. I think even on the 6600 (NVAX-based, the last VAX
from DEC with XMI as its top bus) they were already feeling the squeeze, which
may be why they went from 6600 to 7000 (Alpha bus, very ugly) so quickly. (The
7000's Alpha bus makes it so ugly because the software can't directly access
XMI or any other I/O bus, just memory, as the Alpha bus doesn't support I/O
transactions, just memory.)

What we need then is our own top-level bus of new design with year 2002
performance characteristics with XMI and other buses (see below) attached to
it. The top-level bus will have CPU(s) and memory on it. The memory should of
course be modern SDRAM. To make this kosher by VAX standards this top-level bus
will have to meet all VARM requirements with all downstream buses interfaced to
it as proper VAX nexi. If it's done right, it'll be the same design as the last
Ultimate VAX from DEC had. The last Ultimate VAX from DEC was the 9000, folks.
We'll have basically the same design but on one board. Cool!

I also mentioned having other buses besides XMI and its subordinates. I'm
thinking about PCI (ideally 64-bit 66 MHz, the top) and VME. Yes, PCI is
decidedly not a DEC bus and it's very substandard compared to real VAX buses
like VAXBI or XMI, but I think we need to support it to be competitive and to
be able to take advantage of all the devices that come in the form of PCI
chips. Now, supporting it doesn't mean making it our main bus. If it sits by the
side, doesn't stand in the way of data flowing through real DEC buses, and its
use is entirely optional, it should be OK I think. As for VME, while we should
faithfully reproduce everything that DEC made that's holy and pure, we don't
need to reproduce their screw-ups. In particular, their deliberate blocking of
third parties doesn't need to be reproduced. There's nothing wrong with
supporting an open standard multivendor bus.

The last point applies not only to VME. It applies even more so to SCSI. DEC's
stubborn refusal to support it except in low-end systems definitely doesn't
need to be reproduced in our new VAX. DEC only supported SCSI on low-end
BabyVAXen, but not on anything higher-end: not on Q-bus, not on BI, not on XMI.
As there seems to be no place for SCSI on the XMI side of our new VAX, we can
go for the obvious alternative: if we are going to have PCI in there, just
stick your favorite PCI UltraSCSI chip on the board.

So we are going to have our new fast system bus with XMI and PCI hanging off of
it, and eventually VME too. So what should this system bus be? Well, I've been
thinking about it for the past few days and I've come up with a design that I
think is sound and I think I can implement. After working for 10 months with
screaming fast PowerPCs, I've been spoiled by their 133 MHz system bus. I've
been even more spoiled by some of the chips available for this bus.

I've been particularly impressed by Galileo GT-64240 and GT-64260 system
controllers for MIPS and PowerPC respectively. So far I've only worked with
PowerPCs and GT-64260 and haven't looked at the 240 yet, but as I understand it
the MIPS and PowerPC system buses are very similar and the 240 and the 260 are
essentially the same chip with minor mods.

These system controllers are not PeeCee stuff. They are specifically designed
for high-end systems with heavy I/O requirements, which is exactly what we
would want in our VAX. Each GT-642x0 system controller includes the CPU
interface running at up to 133 MHz, an SDRAM controller running at full CPU bus
speed, 2 64-bit PCI interfaces running at up to 66 MHz, 3 10/100 Mbps Ethernet
ports, 2 high-speed serial ports running at up to I believe 55 Mbps (!), and
support for all system glue logic one can think of: ROMs, random logic devices,
I2C, you name it. All interfaces run simultaneously in parallel and are
interconnected by a "pizza arbiter", a crossbar switch with a 64-bit data path
running at 133 MHz. This is the kind of stuff I have in mind when I talk about
using year 2002 technology. I believe the GT-642x0 is the most powerful chip of
this kind available today. (The cost is a bit high by PeeCee standards but
should be fine for a high-end SuperVAX: GT-64260A costs $170 IIRC.)
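
To put the quoted widths and clock rates into perspective, the following small
Python calculation works out theoretical peak transfer rates. It assumes one
full-width data transfer on every clock with no arbitration or protocol
overhead, so real sustained figures would be lower; the function name
peak_mb_per_s is invented for this example.

```python
def peak_mb_per_s(width_bits, clock_mhz):
    """Theoretical peak in MB/s (10**6 bytes/s), one transfer per clock."""
    return width_bits // 8 * clock_mhz


crossbar = peak_mb_per_s(64, 133)   # 64-bit data path at 133 MHz
pci_each = peak_mb_per_s(64, 66)    # one 64-bit PCI interface at 66 MHz

print(crossbar, pci_each, 2 * pci_each)
```

Under these idealised assumptions the crossbar peaks at 1064 MB/s and each PCI
interface at 528 MB/s, so even both PCI interfaces running flat out (1056 MB/s)
stay just within what the 64-bit 133 MHz data path can carry.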

After thinking about it for a while, I have come up with my VAX design based on
either GT-64240 or GT-64260 (haven't yet decided which). It took some hard
thinking, but I think I've got it right. The most challenging thing of course
is to gerrymander the MIPS/PowerPC system bus into a VAX bus, i.e., make it
meet all VARM requirements. OTOH, the benefits from using those chips are
significant: using the same system bus used in today's fast CPUs would assure
comparable performance for our VAX at least as far as the bus goes, the time to
market is reduced as some important components (the SDRAM controller, the PCI
interface, and some of the support logic) will be ready-made, the chip is among
the best one can pick for this purpose, and we'll have the Ethernet and fast
serial interfaces for free.

To understand how I'm going to use the MIPS/PowerPC system bus as the VAX main
bus and still call it a VAX, one needs to consider the board design. It'll have
this system bus interconnecting the CPU, the GT-642x0, the XMI interface chip,
and other stuff later. One nice thing about the GT-642x0 is that it's
specifically designed to support being not the only system controller, but one
of many. It doesn't mind seeing transactions on its interfaces that it doesn't
understand and that are meant for someone else. As it turns out, the MIPS/
PowerPC system bus is not that alien to the VAX, it's basically a subset of
what the VAX needs. Since it's a subset, it needs to be extended. As I'm going
to implement the CPU and the XMI interface on FPGAs, I'll have the freedom to
tweak it as I want. Of course the GT-642x0 won't understand my extensions, but
it won't need to: we need strict VARM compliance when talking to XMI, but not
necessarily to PCI, as PCI is alien to VAX anyway. Now the GT-642x0 will be the
main memory controller, so it better be VARM-compliant as far as memory goes.
At first I thought this was going to be a show-stopper, as I couldn't figure
out how I would communicate the special bus transactions for the BBCCI, BBSSI,
ADAWI, INSQHI, INSQTI, REMQHI, and REMQTI instructions. But then a solution
struck me. It's hard to explain without going too much into the PowerPC cache
coherency model, but basically the CPU can hoard a little chunk of memory (32-
byte granularity) and not let anyone else touch it until it's done with
whatever it wants to do with it. In my case the whatever will be implementing
the above 7 VAX instructions as prescribed by VARM. Voila!
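
The idea of holding a 32-byte chunk while an interlocked instruction runs can
be modelled in a few lines of Python. This is only a toy software model under
stated assumptions: the Bus class, its acquire/release methods, and the adawi
function are invented here, ownership is tracked in a dictionary rather than by
any real cache coherency protocol, and only an ADAWI-like 16-bit add is shown.

```python
LINE = 32  # bytes; the hoarding granularity mentioned above


class Bus:
    """Toy memory with per-line ownership; not a real system bus model."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.owner = {}                 # line number -> owning agent name

    def acquire(self, agent, addr):
        line = addr // LINE
        if self.owner.get(line, agent) != agent:
            return False                # someone else holds it: retry later
        self.owner[line] = agent
        return True

    def release(self, agent, addr):
        line = addr // LINE
        if self.owner.get(line) == agent:
            del self.owner[line]


def adawi(bus, agent, addr, addend):
    """ADAWI-like interlocked 16-bit add; new value, or None if line busy."""
    if addr % 2:
        raise ValueError("operand must be word-aligned")
    if not bus.acquire(agent, addr):
        return None
    try:
        old = int.from_bytes(bus.mem[addr:addr + 2], "little")
        new = (old + addend) & 0xFFFF   # 16-bit wraparound
        bus.mem[addr:addr + 2] = new.to_bytes(2, "little")
        return new
    finally:
        bus.release(agent, addr)        # let others touch the line again
```

Because the operand is word-aligned it can never straddle two 32-byte lines, so
holding a single line is enough to make the read-modify-write indivisible in
this model. A second agent that finds the line held simply gets None back and
would retry.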

(If you are reading this babbling about using a MIPS/PPC system bus in a VAX
  and wondering how it is different from the VAX 7000 using an Alpha bus that I
  denounced so loudly, you're right, it isn't, except for one detail. That Alpha
  bus doesn't support transactions smaller than a full word (don't remember if
  that was 32 or 64 bits, but you get the idea), so if you wanted to dink an 8-
  bit or 16-bit register on an I/O bus, you would be out of luck. That's one of
  the main reasons they didn't support direct CPU access to XMI and other I/O on
  the 7000. The PowerPC system bus doesn't have this problem, it supports all
  transfer sizes, and the GT-642x0 chips move data of different sizes between
  their interfaces all the time.)
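
The point about transfer sizes can be pictured as byte lanes on a 64-bit data
path. The Python sketch below is a generic model, not the actual PowerPC or
MIPS bus signalling (which is not described here): it assumes naturally aligned
transfers of 1, 2, 4, or 8 bytes and numbers the lanes by byte offset within
the 8-byte beat. The name byte_lanes is invented for this example.

```python
BUS_BYTES = 8  # 64-bit data path


def byte_lanes(addr, size):
    """Bit mask of active lanes; bit i is the lane at byte offset i."""
    if size not in (1, 2, 4, 8):
        raise ValueError("unsupported transfer size")
    if addr % size:
        raise ValueError("transfer must be naturally aligned")
    offset = addr % BUS_BYTES
    return ((1 << size) - 1) << offset
```

For example, byte_lanes(0x1003, 1) enables only lane 3 and byte_lanes(0x1006, 2)
enables lanes 6 and 7, which is what dinking an 8-bit or 16-bit device register
requires. A bus that can only move full words would have every lane active on
every transfer.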

Since we are going to have our own top-level bus with the CPU(s), memory, and
bus adapters sitting above XMI, PCI, and VME, it'll need its own backplane of
our own design to hold the CPUs, memory modules, and bus adapters. There are
many possible designs, but here is the one I've chosen from considerations of
ease of implementation, time to market, and cost, as well as some marketing
considerations.

The backplane will be active, and will have the GT-642x0 on it. The slots on it
will be PCI, nexi, and SDRAM DIMMs. It will be mechanically compatible with an
ATX motherboard, but past the same mechanics it'll have enough differences to
assure anyone looking at it that it's not a PeeCee. Of the 7 slots allowed for
by ATX mechanics, 3 will be PCI and 4 will be nexi. The nexus slot connectors
will be of my own design. The first nexus slot will always have to have a CPU
card in it, which will be the master CPU, and the other 3 nexus slots will
support identical CPU cards (slave CPUs) and bus adapters such as XMI and VME.
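
The slot rules above are simple enough to write down as a check. This is a
hypothetical Python sketch only: the card type names and check_nexus_slots are
invented for illustration, and nothing here reflects an actual connector or
configuration mechanism.

```python
PCI_SLOTS, NEXUS_SLOTS = 3, 4               # 7 ATX slots in total
NEXUS_CARDS = {"cpu", "xmi", "vme", None}   # None marks an empty slot


def check_nexus_slots(cards):
    """Validate a nexus slot population; returns the number of slave CPUs."""
    if len(cards) != NEXUS_SLOTS:
        raise ValueError("expected exactly 4 nexus slots")
    if any(card not in NEXUS_CARDS for card in cards):
        raise ValueError("unknown card type")
    if cards[0] != "cpu":
        raise ValueError("first nexus slot must hold the master CPU")
    return cards[1:].count("cpu")
```

A population such as ["cpu", "cpu", "xmi", None] passes with one slave CPU,
while putting a bus adapter in the first slot is rejected.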

The active backplane will also provide most of the system support logic,
including the console FEP (front end processor, a small cheap microprocessor
which I haven't yet selected) and the console port. Each CPU card will have
just the VAX CPU with minimum support logic on it, which will reduce the
incremental cost of adding CPUs. The FPGA-based CPU will only implement the VAX
architecture as a running processor, and will truly halt like a 780 rather than
implement halt microcode. This will certainly make it cool in a VAX hacker's
eyes and reduce the time to market by moving the halt complexity from the
harder to design FPGA into the easy to write FEP software. The CPU FPGA will
communicate halts and architected console IPR accesses to the FEP via the
Inter-IC bus (I2C) or a similar bus.

The active backplane will also provide Ethernet (GT-based) and SCSI (PCI-based)
interfaces. (The high-speed serial interfaces of the GT-642x0 may also be
routed out to external 50-pin connectors of the type DEC has used on sync
serial options.) Thus in the minimal configuration with only one CPU card and
some SDRAM one will have a fully functional system. Since such a system will
fit entirely in an ATX enclosure and offer ready PCI expansion, it can be used
to introduce the VAX architecture and this high-end product into new markets
not previously exposed to VAXen. We'll be able to build and sell these systems
in any required quantity entirely from current parts, with no dependence on
exhaustible stockpiles of old DEC parts. OTOH, since when equipped with an XMI
adapter the system will be a fully compliant VAX and will be highly compatible
with top of the line systems from DEC, this system should be acceptable even to
the most ardent DEC purist.

In the future, systems can be built in other form factors if ATX is unacceptable
or if more than 4 nexus slots are needed. The backplane will still have to be
active most probably, though, because such a high-speed bus cannot have more
than a few slots, and if we want to offer something like the 14-slot XMI
backplane for our new system bus, the only way I can think of to implement it
is to have several independent bus segments each with no more than 4 slots and
interconnect them with intelligent bridges. These bridges would be active and
quite sophisticated. But that's for the future, I think for a start a single
bus segment with 4 slots in the cheapest form factor (ATX) will be fine.

Proper design of the system support logic on the active backplane will ensure
that the memory and interrupt mappings are as close as possible to those of
next of kin DEC systems. The GT-642x0 will play nicely with this as its memory
map is completely programmable, and it'll have no problem with the physical
address space being only 30 bits instead of 32 (VAX architecture limitation).

That's what I have on my mind, folks. As you can see I've put a good deal of
thought into it and I'm serious about it. The next step is to better explore
FPGA technology to see if I can match or beat the NVAX in performance. On our
side we have 10 more years of technological advances since the NVAX, but OTOH
DEC had built the NVAX as a fully custom IC rather than an FPGA. We'll see.

Now here comes the part where I'll need your help, folks. While I have
everything I'll need to design the active backplane and the CPU card (for the
CPU card I'll just need the VARM which I have and the backplane is all current
technology, no DEC specs needed) and thus I think I'll be able to deliver the
base system in the ATX box, in my mind to make the project worth doing and thus
to maximise the chances of me actually doing it, we'll need an assurance that
the XMI adapter would follow shortly. I am prepared to design the FPGA and the
board, but it'll require the XMI spec. I know it conceptually which allowed me
to come up with the conceptual design outlined here, but to actually build it I
will of course need the full spec detailing each signal and all.

So, to make this project worth doing and thus see to it that I actually do it,
you folks with spare time on your hands may want to start looking into prying
the XMI specs out of DECompaq.

-- 
Michael Sokolov					786 E MISSION AVE APT F
Programletarian Freedom Fighter			ESCONDIDO CA 92025-2154 USA
International Free Computing Task Force		Phone: +1-760-480-4575
						msokolov@ivan.Harhan.ORG (ARPA)
Let the Source be with you
Programletarians of the world, unite!

P.S. To the folks I've Cc'ed this to besides the Quasijarus list: your input
could be extremely helpful to this project and other great projects being
discussed on our list, and I would very much like to have you on the Quasijarus
list. To subscribe send a message to quasijarus-request@ivan.Harhan.ORG. TIA.