Subject: Re: Randomly crashing DECstation 5000/125 with NetBSD 1.5
To: None <port-pmax@netbsd.org>
From: Alexander Schreiber <alexander.schreiber@informatik.tu-chemnitz.de>
List: port-pmax
Date: 02/12/2001 06:40:44
On Mon, 12 Feb 2001, Chris Tribo wrote:

> on 2/11/01 7:54 PM, Alexander Schreiber at
> alexander.schreiber@informatik.tu-chemnitz.de wrote something like:
> 
> > I strongly suspect this to be a hardware problem and I hope to find some
> > people here with some knowledge of these machines - finding good documentation
> > for such a machine seems to be quite a challenge unfortunately :-(
> 
>     Yes, there isn't any online AFAICT for any machine newer than the
> 5000/200.

Sometimes one can gather little bits of information from random usenet and
mailing list articles ... but even googling around doesn't turn up much :-(
But I dutifully collect and store anything relevant that I find.

> > Point 3 leads me to believe that maybe this is a heat problem. But so far
> > the only really hot part in the system seems to be CPU module.
> 
>     How hot? Hot enough that you can't leave your finger on the heat sink
> for more than a second?

No, not that hot. The thermal problem idea is more grasping at straws
than anything. Judging from the overall impression of the machine, it is
very well engineered - _very_ different (in a good way) from current
Intel PC trash. So I guess heat problems will only crop up if one or
more of the fans fail or the airflow gets blocked by stupid messing
around with the machine.

> service. I assume you have the case on normally, and all the fans are
> spinning.

Correct. And the air coming out of the machine is actually not even warm.

> Ultrix has a function to detect CPU overheats, but it hasn't been
> implemented in NetBSD AFAIK.

I noticed the word ''overtemperature'' flashing by during the boot-up
selftest, so I suppose there is something like an overheat guard in the
machine.

> > Oh - and I found out that, opposed to the NetBSD documentation about this,
> > you _can_ mix 2 MB and 8 MB memory modules. I put in 8 2 MB modules first,
> 
>     AFAIK, It does not support mixed memory modules. The hardware in the
> machine itself does, but NetBSD does not (yet) have this implemented, even
> though the code has been written more than once. I would try removing the
> mixed modules and see if it still occurs. ATM, the NetBSD kernel can't check
> for what size modules are in what slots, so it probably ends up assuming
> that all the chips are the same size. (which they are not), so when it gets
> to a different sized chip, the addressing goes screwy.

Hmm. The PROM as well as NetBSD found 20 MB of RAM with the mixed setup.
Currently, the machine is running with a single type setup (only 8x 2 MB
in the system), but it is still crashing randomly. I'm pretty sure it will
have crashed again by the time I get up today, so after _that_ crash I will
swap memory modules and put in the original 2x 8 MB instead of the current
8x 2 MB to find out whether this crashes too.
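
To illustrate the point about mixed modules, here is a minimal sketch of how
a correct total (20 MB) could still go along with a wrong memory map. It is
not the NetBSD code and not taken from DEC documentation: the slot stride,
the 8+8+2+2 layout and all function names are assumptions made up for the
example.

```python
MB = 1024 * 1024

def real_ranges(modules_mb, stride_mb):
    """Ranges actually backed by RAM, assuming (hypothetically) that
    slot i starts at i * stride_mb regardless of the module's size."""
    return [(i * stride_mb * MB, (i * stride_mb + size) * MB)
            for i, size in enumerate(modules_mb)]

def assumed_ranges(modules_mb):
    """What a kernel would use if it knew only the total size and
    treated memory as one contiguous block starting at address 0."""
    return [(0, sum(modules_mb) * MB)]

def unbacked(assumed, real):
    """Parts of the assumed ranges that no module actually backs."""
    holes = []
    for a_start, a_end in assumed:
        cursor = a_start
        for r_start, r_end in sorted(real):
            if r_end <= cursor or r_start >= a_end:
                continue  # this module lies outside what is left to check
            if r_start > cursor:
                holes.append((cursor, r_start))
            cursor = max(cursor, r_end)
        if cursor < a_end:
            holes.append((cursor, a_end))
    return holes

# Hypothetical mixed setup: two 8 MB and two 2 MB modules, 8 MB slot stride.
mixed = [8, 8, 2, 2]
holes_mixed = unbacked(assumed_ranges(mixed), real_ranges(mixed, 8))

# Uniform setup: eight 2 MB modules, 2 MB slot stride.
uniform = [2] * 8
holes_uniform = unbacked(assumed_ranges(uniform), real_ranges(uniform, 2))
```

Under these assumptions the mixed setup leaves the range from 18 MB to 20 MB
unbacked although the total is right, while the uniform setup has no holes.
So a matching RAM size in the PROM and in NetBSD would not by itself rule out
an addressing problem.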

>     Did you run "test" (at least once) at the >> to verify that the chips
> and hardware are passing their self tests consistently? That *should*
> eliminate bad RAM as a culprit since it has a fairly good RAM test.

I remember running it as the first thing I did after I started playing with
the machine. Thanks for the suggestion, I will run it again with the
current setup and report back the results.

Regards,
      Alex.

-- 
------------------------------------------------------------------------------ 
 EMail : als@thangorodrim.de              | WWW : http://www.thangorodrim.de/
 "I think there's a world market for about five computers."
         -- attr. Thomas J. Watson (Chairman of the Board, IBM), 1943