Subject: memory or CPU bad on SS20?
To: None <email@example.com>
From: D G Teed <firstname.lastname@example.org>
Date: 09/09/2007 20:15:11
As a previous email mentioned, I get core dumps from apcupsd
occasionally. I now suspect there is a hardware or possibly
a kernel issue.
I've seen this error a couple of times in dmesg in the last week:
mxcc error 0x0
mxcc status 0xff1410002
mxcc reset 0x0
mxcc error 0xb30010014fc8080
mxcc status 0xff1402000
mxcc reset 0x0
Once the above happened during a tar/gunzip of a package make.
Today it happened during a run of memtester
with an argument of 16.
memtester identified some errors minutes after that:
FAILURE: 0xa47b0704 != 0xa47b0504 at offset 0x00021c23.
FAILURE: 0x03ea6504 != 0x03ea6704 at offset 0x00031423.
FAILURE: 0x7d866504 != 0x7d866704 at offset 0x00039423.
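One quick way to read these FAILURE lines is to XOR the two values in each pair and see which bits differ. The following is only a rough sketch in Python: the three tuples are copied from the memtester output above, and the helper name differing_bits is made up for illustration, not part of memtester.

```python
# Value pairs and offsets copied from the memtester FAILURE lines above.
failures = [
    (0xa47b0704, 0xa47b0504, 0x00021c23),
    (0x03ea6504, 0x03ea6704, 0x00031423),
    (0x7d866504, 0x7d866704, 0x00039423),
]


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ.

    Hypothetical helper for illustration only.
    """
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


for a, b, offset in failures:
    print(f"offset {offset:#010x}: xor={a ^ b:#010x} bits={differing_bits(a, b)}")
```

For the three pairs above, the XOR is 0x00000200 each time, so every failure is a single flipped bit in the same position (bit 9). That pattern suggests one flaky data bit rather than random corruption, though on its own it does not say whether the fault is in a memory module, the cache, or the CPU module.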
A little later in the console, after further error-free progress
from memtester, there was a kernel panic:
store buffer copy-back failure at 0x65028. Retrying...
data fault: pc=0xf02b66d0 addr=0xf0d7308c sfsr=5336<PERR=2,UC,LVL=3,AT=1,FT=5,F>
panic: kernel fault
If I do test /memory from the OBP, there are no errors.
If I setenv "diag-switch? true" in boot prom
I can't see any errors from memory or other devices
while on the serial console. It always seems to boot up fine too.
Is it likely I have a faulty CPU module? My CPUs are identified
as follows on boot up:
cpu0 at mainbus0: mid 8: TMS390Z50 v0 or TMS390Z55 @ 75 MHz, on-chip FPU
cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 d
cpu1 at mainbus0: mid 10: TMS390Z50 v0 or TMS390Z55 @ 75 MHz, on-chip FPU
cpu1: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 d
obio0 at mainbus0
I have a couple of others from a spare box, but they are likely
Are there suggestions on what the next step is to determine where the