Subject: Re: VAXstation 4000/90 Success!
To: Hugh Graham <hugh@openbsd.org>
From: Michael L. Hitch <mhitch@lightning.msu.montana.edu>
List: port-vax
Date: 07/19/2002 23:23:30
On Mon, 15 Jul 2002, Hugh Graham wrote:

> I did a little reading about FEPROM though, and I get the impression
> that some may allow bits to be cleared at any time, but set to 1 only
> in complete blocks.
>
> If the Cougar works this way then a fix will likely require blanking
> and rewriting the whole part, but without specific knowledge of the
> specifics (such as timing) of this operation I'm hesitant to write a
> driver to do it. If someone who actually owns one of these machines
> wants to check if the proms are socketed, and has access to a burner
> for restoring when things go wrong, then they will have a safety net
> for debugging a mopable prom rewriter.

  I checked on my 4000/90, and the flash memory is 4 PLCC chips soldered
to the board.  Unless a DEC utility to reprogram the flash can be found,
somebody's going to have to be very brave and take the risk of losing the
entire flash testing a new program.

  I was looking at the documentation for the AMD Am28F010 flash memory
(it's an older 128 KB, i.e. 1 Mbit, flash memory), and if that's anything
like what the 4000/90 is using, the entire memory has to be erased and
re-written to correct the corruption.
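
  To illustrate why a single zeroed byte can't just be written back, here
is a toy model of the behaviour described above.  It assumes programming
can only clear bits (so the result is the old value ANDed with the new
one) and that only an erase of the whole part sets bits back to 1.  The
byte value 0x5A is made up for the example; none of this is taken from
the actual 4000/90 part.

```python
# Toy model of flash behaviour as described above; not a driver.
# ASSUMPTION: programming can only clear bits, erase sets all bits to 1.
ERASED = 0xFF


def program_byte(old, new):
    """Programming can only clear bits, so the result is old AND new."""
    return old & new


def erase(size):
    """Erasing resets every byte of the part to all ones."""
    return bytearray([ERASED] * size)


cell = 0x5A                          # hypothetical original PROM byte
cell = program_byte(cell, 0x00)      # stray write programs a zero
restored = program_byte(cell, 0x5A)  # programming it back changes nothing

fresh = erase(4)                     # only after an erase ...
fresh[0] = program_byte(fresh[0], 0x5A)  # ... can the old value be written

print(hex(cell), hex(restored), hex(fresh[0]))
```

  In this model the only way back to 0x5A is through erase(), which is
why losing the whole flash is the risk.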

> I'm still interested in receiving more prom images, to confirm once
> and for all that dz's misprobing was at fault, and to spot any further
> corruption in non obvious places. I'll continue to provide the images
> should anyone with more familiarity with FEPROM want to work on the
> problem, and it's still possible that some magic write / read / write
> combo remains to be discovered.

  I was trying to be very careful not to boot an unpatched 1.5.x kernel,
but goofed once and clobbered the prom on mine.  It's a 1.4 version, and
it shows only the one byte zeroed in the same location as the other
corrupted image you have.

  Also, comparing the dz probe routine with the AMD Am28F010 programming
sequence, the probe matches up with programming the one byte that gets
modified.  The probe does a write of 0x4020 to the dz csr (0x200a0000),
followed by a write of 0x0001 to the dz tcr (0x200a0008).  Based on the
Am28F010 information, the 0x40 byte in the first write is the Program
Setup code, and the second write with 0x0001 would program the 0x00 byte
into the byte at location 0x200a0009.  Normally there's a 10 microsecond
delay after the program write command, followed by a program verify command
to stop the programming operation.  I'm not sure what happens if that
program verify command doesn't get issued.
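
  The byte arithmetic above can be checked with a small sketch.  It splits
the two 16-bit probe writes into little-endian byte writes (the VAX is
little-endian) and looks at which byte lane each one lands on.  The idea
that the four flash chips are byte-interleaved, with the lane selected by
the low two address bits, is my assumption and not something I've
confirmed on the board.

```python
# Illustrative only: which bytes would a byte-wide flash part see?
PROGRAM_SETUP = 0x40  # Am28F010 Program Setup command, per the text above


def byte_writes(addr, value16):
    """Split a little-endian 16-bit write into (address, byte) pairs."""
    return [(addr, value16 & 0xFF), (addr + 1, (value16 >> 8) & 0xFF)]


def lane(addr):
    """ASSUMPTION: four byte-interleaved chips, lane = low two address bits."""
    return addr & 3


csr_writes = byte_writes(0x200A0000, 0x4020)  # dz csr write from the probe
tcr_writes = byte_writes(0x200A0008, 0x0001)  # dz tcr write from the probe

# The high byte of each 16-bit write lands on the odd address.
command_addr, command = csr_writes[1]
target_addr, data = tcr_writes[1]

print("command", hex(command), "at", hex(command_addr), "lane", lane(command_addr))
print("data   ", hex(data), "at", hex(target_addr), "lane", lane(target_addr))
print("looks like program setup:", command == PROGRAM_SETUP)
print("same lane:", lane(command_addr) == lane(target_addr))
```

  Under that assumption, the 0x40 and the following 0x00 both go to the
same chip, which is consistent with the byte at 0x200a0009 being the one
that gets zeroed.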

  When I had the machine apart, I tried to take a look at what the flash
memory was.  They all had stickers marked 1.4, which I was able to pull
up, but underneath them was another sticker with some other number printed
(which I would guess is probably a DEC part number).  That sticker looked
like it was more securely attached to the chip, so I didn't try pulling it
up.

  So it appears to me that we might be able to write a program that could
reprogram the flash, but it would be risky.  I don't know of any way to
recover from erasing the prom and then failing to program it afterwards.

  Hmm, if I could figure out how the checksum works, perhaps I could
modify other data bytes to make the checksum come out correct.  That might
at least stop the errors.
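
  As a rough sketch of that idea: I don't know the real checksum
algorithm, so the code below simply assumes an 8-bit sum of all bytes,
purely for illustration.  It also assumes there are unused locations
(the "spare" indices, which are hypothetical) whose bits may be cleared
without harm.  Since programming can only clear bits, it looks for a
spare byte in which every bit that needs clearing is still set.

```python
def checksum(data):
    """HYPOTHETICAL checksum: 8-bit sum of all bytes (real one unknown)."""
    return sum(data) & 0xFF


def patch_for_checksum(image, target, spare):
    """Clear bits in one spare byte so checksum(image) equals target.

    Returns a patched copy, or None if no single spare byte will do.
    """
    # Amount the byte sum must still drop, modulo 256.
    drop = (checksum(image) - target) & 0xFF
    for i in spare:
        if image[i] & drop == drop:   # all bits to clear are currently set
            patched = bytearray(image)
            patched[i] &= ~drop & 0xFF  # clearing those bits subtracts drop
            return patched
    return None


good = bytearray([0x12, 0x34, 0x56, 0xFF])  # made-up "original" image
bad = bytearray(good)
bad[1] = 0x00                               # the one zeroed byte

fixed = patch_for_checksum(bad, checksum(good), spare=[3])
print(fixed is not None and checksum(fixed) == checksum(good))
```

  Whether anything like this works on the real prom depends entirely on
what the checksum actually is and on whether any suitable bytes exist.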

--
Michael L. Hitch			mhitch@montana.edu
Computer Consultant
Information Technology Center
Montana State University	Bozeman, MT	USA