Subject: VAX 6420: KA64A EEPROM and serial number mismatch
To: None <port-vax@netbsd.org>
From: Gunther Schadow <gunther@aurora.regenstrief.org>
List: port-vax
Date: 05/27/2001 06:18:32
Hi,

I put the other machine online and that turned out to be a major 
effort. Since I still have my power cord screwed directly to my 
circuit breaker panel, I figured it would be safer and quicker to
swap the power input box between the two machines. Did that
and power came right up. However, the blowers wouldn't move, not
one of them! Heck, since both blowers were stuck I thought it
had to be a defect upstream. So I swapped the power and logic 
box too. Still no blower would move. So I unscrewed the blowers,
which is a major job in VAX mechanics (I am a pro now in
*screwing* with VAX 6000s :-) And indeed, the other pair of
blowers worked fine.

Question: can one repair blowers? 

I am just lucky that they haven't yet disposed of the empty corpse
at my workplace. So I'll go get the blowers out of there
too. But it's annoying to think that your VAX goes down because
the blowers fail! (And it does go down, because the heat
and airflow sensors will shut down power if the blowers don't
work.)


Then I started swapping cards around. I found out that the bus
backplanes are a little picky about getting good contact to
the cards. Hard to say whether a card is not working well or 
the slot in which it sits isn't quite holding it right.

I wanted to build a 6-processor machine, but the problem I
ran into is that the processors are quite picky about whom
they will play with. Even a minor difference in EEPROM
revisions (2.03/4.00 vs. 2.03/4.02) and, most of all, the
serial number mismatch kept me from swapping them.

Question: how can one update EEPROM revisions and the 
"branding" of the serial numbers on the KA64A processor?

I figure if people trade the processor boards in used parts
dealerships, buying those would never be useful because
you can't mix and match them! But there has to be a trick
other than to "call your field service representative",
I hope!!


I also found the matching of memory boards tricky. In
particular, the selftest does not just indicate an
error when something is wrong with memory; it throws
an exception and enters a really weird state where
the whole machine can hang outright, requiring a reset
or power cycle. What's the magic with the memory?
Can one use SET MEMORY/INTERLEAVE ... to make it take
the memory it gets? I would really like to try maxing
out one machine with all the spare boards I have, but
it doesn't let me do that.

Here is the memory error I get:

#123456789 0123456789 0123456789 01234567#

F   E   D   C   B   A   9   8   7   6   5   4   3   2   1   0   NODE #
    A   A   .   M   M   M   M   M   M   .   P   P   P   P       TYP
    o   o   .   +   +   +   +   +   +   .   +   +   +   +       STF
    .   .   .   .   .   .   .   .   .   .   E   E   E   B       BPD
    .   .   .   .   .   .   .   .   .   .   +   +   +   +       ETF
    .   .   .   .   .   .   .   .   .   .   E   E   E   B       BPD

.   .   .   .   .   .   .   .   .   +   +   .   -   +   +   .   XBI D +
.   .   .   .   .   .   .   .   .   +   .   +   .   .   +   .   XBI E +
Machine Check Stack Frame:
80000011
00020014
2004A618
00000000
061AF00E
0000001F
2004A602
041F0009

PCSTS:          000008C8
PCERR:          3C0020F0
PCTAG:          40000000
BCSTS:          01800000
BCERR:          2004DE90
BCBTS:          2000003C
BCP1TS:         2001F804
BCP2TS:         2001F804
IPORT:          800003F1
XBER:           90021041
XFADR:          C0020000
RCSR:           09200011

?29 Machine check accessing memory 80000011 00020014 2004A602
Console halting after unexpected machine check or exception.
?06 Halt instruction executed in kernel mode.
    PC     = 200C0E68
    SAVPSL = 041F0600
    ISP    = 20140408
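
For anyone who wants to compare this dump against one from a working
configuration, here is a minimal Python sketch that reads the
"NAME: value" register lines above and lists which bit positions are
set in each register. The function names (parse_registers, set_bits)
are made up for illustration, and the sketch does not attempt to say
what any bit means; that would have to come from the KA64A
documentation.

```python
import re

# Register lines copied verbatim from the console output above.
DUMP = """\
PCSTS:          000008C8
PCERR:          3C0020F0
PCTAG:          40000000
BCSTS:          01800000
BCERR:          2004DE90
BCBTS:          2000003C
BCP1TS:         2001F804
BCP2TS:         2001F804
IPORT:          800003F1
XBER:           90021041
XFADR:          C0020000
RCSR:           09200011
"""


def parse_registers(text):
    """Return {register name: integer value} for lines like 'XBER: 90021041'."""
    regs = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([A-Z0-9]+):\s+([0-9A-Fa-f]{8})\s*$", line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def set_bits(value, width=32):
    """Return the bit positions (0 = least significant) that are set."""
    return [b for b in range(width) if (value >> b) & 1]


if __name__ == "__main__":
    for name, value in parse_registers(DUMP).items():
        print(f"{name:8s} {value:08X} bits set: {set_bits(value)}")
```

Running the same script on a dump from a configuration that boots
would show which registers differ, which may narrow down where to look.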


Any ideas where this comes from? The thing is, I *know* that the 
boards are good and that the bus slots are good, since a different
configuration involving those boards and slots will work. But
my memory is of two different types (128 MB and 32 MB). 
Adding 2 or 3 x 32 MB to 4 x 128 MB has never worked; why?
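
To make the configuration explicit, here is a small Python sketch that
reads the NODE # and TYP rows of the selftest display by column and
adds up the memory totals for the mixes mentioned above. It assumes
the columns are spaced four characters apart as in the paste, and that
A, M and P stand for adapter, memory and processor; that is my reading
of the display, not something taken from the manual. The helper names
are hypothetical.

```python
from collections import Counter

# The NODE # and TYP rows, copied verbatim from the selftest display.
NODE_ROW = "F   E   D   C   B   A   9   8   7   6   5   4   3   2   1   0   NODE #"
TYP_ROW = "    A   A   .   M   M   M   M   M   M   .   P   P   P   P       TYP"


def slot_types(node_row, typ_row, step=4, slots=16):
    """Map each node number (as a hex digit) to the TYP character below it."""
    mapping = {}
    for i in range(slots):
        col = i * step
        node = node_row[col]
        typ = typ_row[col] if col < len(typ_row) else " "
        mapping[node] = typ
    return mapping


def total_mb(boards):
    """Total memory in MB for a {board size in MB: number of boards} mix."""
    return sum(size * count for size, count in boards.items())


if __name__ == "__main__":
    types = slot_types(NODE_ROW, TYP_ROW)
    print(Counter(t for t in types.values() if t.strip() and t != "."))
    print(total_mb({128: 4, 32: 2}), "MB with two 32 MB boards")
    print(total_mb({128: 4, 32: 3}), "MB with three 32 MB boards")
```

The display above shows six M nodes, so the totals in question are
576 MB (4 x 128 + 2 x 32) and 608 MB (4 x 128 + 3 x 32).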

Thanks much,
-Gunther


-- 
Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
Medical Information Scientist      Regenstrief Institute for Health Care
Adjunct Assistant Professor        Indiana University School of Medicine
tel:1(317)630-7960                         http://aurora.regenstrief.org