Subject: Re: MSI 6501 Dual AMD Athlon MP & 1.6 i386 MP kernel
To: None <tech-smp@NetBSD.ORG>
From: MLH <MLH@goathill.org>
List: tech-smp
Date: 09/10/2002 16:52:33
On 10 Sep 2002 10:25:01 -0500, Frank van der Linden wrote:
> On Mon, Sep 09, 2002 at 08:57:18PM +0000, MLH wrote:
>> The only problems I have had with these are that the LM driver
>> incorrectly calculates the cpu temperatures (port-i386/18205) and
>> XF86 appears to be somewhat unstable. It appears to simply lock up
>> the cpu it is running on - no core dump or anything. If XF86 locks
>> up cpu1, cpu0 can still conveniently restart the machine, but if
>> it is running on cpu0, the whole box appears to lock up. Is this
>> consistent with known status?
> 
> It's actually the first time I've heard about such an XF86 problem
> with the MP code. I'm running the MP code myself on my desktop
> system (a dual Athlon, Tyan board), and am not seeing X problems.

For a while, I thought it was a heat-related failure, because the
lm driver was reporting CPU temperatures of ~68C and ~72C (the BIOS
shows more like ~45C) and the machine was behaving almost exactly
like other Athlons I have seen that were running too hot under
stress. Running xscreensaver under X, or just resizing windows under
some conditions, seems to trigger it.

> Have you verified that this problem does not occur with normal
> kernels?

I haven't seen it happen with a single-processor (SP) kernel, but I
likely haven't run one long enough to find out. I'll try. I have two
boxes to experiment with now.

> If not, can you maybe collect some details, using
> DDB or otherwise?

I haven't figured out how. The CPU apparently simply stops running,
and I have never found any evidence that the OS even noticed a
problem, or any idea of how to begin tracking it down. No core dump
or anything. On reboot it simply fscks, as if the power had gone out.
The only thing I can associate the failure with is X and
graphics-intensive operations.

Thanks