Port-mips archive


Re: DECstation 2100: power trouble?



On Sat, 18 Apr 2026, Mouse wrote:

> So I finally gave up on the 2100 and put the other 10 sticks of RAM
> back in.  Just on the chance, I then turned it on.  It promptly passed
> POST and netbooted!  Perhaps I was wrong about which slots were bank 0;
> I figured they were the two closest to the front of the machine, given
> the direction the RAM has to lean to be put into or taken out of the
> sockets (since those two slots are the last ones to be removed and thus
> first to be installed).

 Slot numbers are given on the PCB, on the PSU side, and I reckon they are 
not in order.  Slots 1 & 2 always have to be occupied and modules have to 
be installed in pairs.

> So I started a (diskless) rebuild of the 1.4T world.  A little over
> thirteen minutes in, though, while still doing the `make cleandir'
> step, I got (the leading "# " is the shell prompt the console was
> sitting at)
> 
> # trap: bus error (load or store) in kernel mode
> status=0x8fc34, cause=0x9000001c, epc=0x8003055c, vaddr=0xc0034800
> pid=1590 cmd=sh usp=0x7ffff9f8 ksp=0xc251b778
> Stopped in sh at        0x8003055c:     beq     a2,zero,0x8003057c
>                 bdslot: 0x80030560:     lbu     v0,0(a0)
> db> reboot
> syncing disks... done
> trap: bus error (load or store) in kernel mode
> status=0x8ff14, cause=0x1c, epc=0x80199390, vaddr=0xc0c173c4
> pid=1590 cmd=sh usp=0x7ffff9f8 ksp=0xc251b2b0
> Stopped in sh at        0x80199390:     lw      v1,0(s2)

 It would help if the kernel actually provided the bus addresses causing 
the failure, i.e. the contents of $a0 and $s2 respectively, translated to 
the corresponding physical address.  That's straightforward to do (but 
someone has to write such code of course).
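
 In the meantime part of the trap output can be decoded by hand.  Below is 
a minimal sketch in Python, meant to be run on another machine, not in the 
kernel.  The function names are made up for illustration, and the Cause 
register layout (BD in bit 31, CE in bits 29:28, ExcCode starting at bit 2) 
and the kseg0/kseg1 mask are the standard MIPS ones as I recall them, so do 
check them against the CPU manual.  It only handles the unmapped segments; 
a kseg2 address would need a page table or TLB lookup, which this does not 
attempt.

```python
# Exception codes relevant here (MIPS Cause.ExcCode values).
EXC_NAMES = {4: "AdEL", 5: "AdES", 6: "IBE", 7: "DBE"}


def decode_cause(cause):
    """Split a Cause register value into the fields of interest."""
    return {
        "bd": bool((cause >> 31) & 1),      # exception taken in a branch delay slot
        "ce": (cause >> 28) & 0x3,          # coprocessor number, only meaningful for CpU
        "exc_code": (cause >> 2) & 0x1F,    # the R3000 uses the low 4 bits of this field
    }


def faulting_pc(epc, cause):
    """Address of the instruction that trapped.

    With BD set, EPC points at the branch and the trapping
    instruction is the one in its delay slot, at EPC + 4.
    """
    return epc + 4 if (cause >> 31) & 1 else epc


def unmapped_to_phys(vaddr):
    """Physical address for a kseg0/kseg1 address, or None if mapped.

    kseg0 (0x80000000..0x9fffffff) and kseg1 (0xa0000000..0xbfffffff)
    both map to physical memory by dropping the top three bits.
    kuseg and kseg2 go through the TLB and cannot be translated here.
    """
    if 0x80000000 <= vaddr < 0xC0000000:
        return vaddr & 0x1FFFFFFF
    return None


if __name__ == "__main__":
    # The two traps quoted above: (cause, epc).
    for cause, epc in ((0x9000001C, 0x8003055C), (0x0000001C, 0x80199390)):
        fields = decode_cause(cause)
        name = EXC_NAMES.get(fields["exc_code"], "other")
        print("cause=%#x: %s, BD=%d, faulting pc=%#x"
              % (cause, name, fields["bd"], faulting_pc(epc, cause)))
```

 For the first trap this gives a data bus error in the delay slot at 
0x80030560, i.e. the `lbu v0,0(a0)', and for the second one a data bus 
error at 0x80199390 itself, i.e. the `lw v1,0(s2)', which is why $a0 and 
$s2 are the registers of interest.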

 Have the built-in console memory tests (e.g. the `t a' I mentioned 
previously) never shown any failures?

> So I put the machine back together (I was running it with the cover and
> disk-mounting plate off, for better cooling) and have stopped fussing
> with it for now; I'm finding I don't have the frustration tolerance
> these days to handle dealing with broken hardware, especially subtly
> broken hardware.

 I think a marginal PSU might cause memory errors at load spikes.  At 
least replacing output filter capacitors ought to cause no regression.

 NB I've now located a local copy of the maintenance guide too, obtained
from: 
<https://manx-docs.org/collections/antonio/dec/MDS-1997-10/cd1/VOL001/0397.PDF>, 
should you wish to continue experimenting with your little system.

  Maciej

