Subject: Re: SparcStation 20 SMP trouble
To: None <port-sparc@netbsd.org>
From: Malte Dehling <mdehling@math.ruhr-uni-bochum.de>
List: port-sparc
Date: 05/09/2005 16:29:10
On Mon, May 09, 2005 at 03:35:02PM +0200, Bernd Sieker wrote:
> On 09.05.05, 13:40:33, Malte Dehling wrote:
> >
> > I don't have another SS10/20 right now to test if it's really the modules, but
> > it looks indeed as if something were broken. I will do some stress-testing with
> > the `slightly broken' module tomorrow, just to see what happens.
> > I'm still wondering why I get memory errors? Are they caused by a bad CPU?
>
> Unless they're explicitly from the SS20's ECC memory controller
> ("eccmemctl0 at mainbus0 ioaddr 0x0: version 0x0/0x1") they're
> likely to be caused by a defective Cache or Cache controller.
Last time I had a broken memory module, the errors looked like this (from the
logs):
Jan 29 11:27:42 wks01 /netbsd: cpu0: NMI: system interrupts: 10000000
Jan 29 11:27:42 wks01 /netbsd: memory error:
Jan 29 11:27:42 wks01 /netbsd: EFSR: e621<CE,DW=2,SYNDROME=e6>
Jan 29 11:27:42 wks01 /netbsd: MBus transaction: 8fffdd50<VAH=0,TYPE=5,SIZE=5,C,LOCK,VA=ff,S,MID=8>
Jan 29 11:27:42 wks01 /netbsd: address: 0x0af0a80
Jan 29 11:27:42 wks01 /netbsd: module location: J0201
(Correctable Error). I removed the module and I never got the error again.
This time I have:
memory error:
EFSR: 10002<DW=0,SYNDROME=0,ME>
MBus transaction: fc10d30<VAH=0,TYPE=3,SIZE=5,C,VA=4,S,MID=0>
address: 0x0f028f000
module location: ?
(unknown module location!) but before that I get this:
cpu0: NMI: system interrupts: 40080000<VME=0,SBUS=0,T,ME>
module0:
mxcc error 0x0
mxcc status 0xff1410002
mxcc reset 0x0
module1:
mxcc error 0xb304000001d6900
mxcc status 0xff1402000
mxcc reset 0x0
dump cpu0: NMI: system interrupts: 50080000<VME=0,SBUS=0,T,M,ME>
So I think it's indeed something with the cache (of module1)...
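
To make the two EFSR values above easier to compare, here is a minimal sketch that splits them into the fields the kernel printed. The bit positions are an assumption inferred only from the two log excerpts in this mail (e621 and 10002), not taken from the hardware documentation, and decode_efsr is a hypothetical helper name. Bit 1 is set in 10002 but is not named in the log output, so it is left undecoded.

```python
# Field layout inferred only from the two log excerpts in this mail;
# it is an assumption, not taken from the hardware documentation.
def decode_efsr(value):
    """Split an EFSR value into the fields the kernel printed in the logs."""
    return {
        "CE": bool(value & 0x1),          # bit 0: correctable error
        "DW": (value >> 4) & 0xF,         # bits 4-7
        "SYNDROME": (value >> 8) & 0xFF,  # bits 8-15
        "ME": bool(value & 0x10000),      # bit 16
    }

# The two values quoted above: the old bad-DIMM case and the current one.
for raw in (0xE621, 0x10002):
    print(hex(raw), decode_efsr(raw))
```

Under that assumed layout, the old error has CE set with a non-zero syndrome, while the current one has neither, only ME.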
> (iirc the MXCC also handles MBus communications).
What about modules without Cache? They don't have Cache Controllers. That is
where my next question comes from: Is it possible to disable the cache in
NetBSD? It would still be a lot better than my previous configuration (I
had a single 50MHz module without cache.)
>
> If you're lucky the extended POST (diag-switch?=true) catches them,
> but it might not.
Extended POST runs just fine, except for the `Data Access Error' after it
has finished. See:
http://dnsspam.student.utwente.nl/~mdehling/files/sys/ss20-boot1.log
http://dnsspam.student.utwente.nl/~mdehling/files/sys/ss20-boot2.log
>
> I've also just come across a paragraph in "The Rough Guide to MBus Modules", on
> http://mbus.sunhelp.org/modules/#super
>
> --- quote ---
> WARNING: it has recently come to light that some SM41, SM51 and
> maybe SM71 modules appear to be specific to the Fujitsu Teamserver,
> and do not work in other systems. Unfortunately there is no known
> way to distinguish these Fujitsu-custom modules from "regular"
> ones, as both have the same "501-" part-number stickers. If you
> have any information on how to distinguish these modules, please
> email spooferman@excite.com .
> --- end quote ---
>
> So if it turns out your modules are from a "Fujitsu Teamserver"
> you might be out of luck.
>
I will try to find out.
>
> > ---
> > According to http://mbus.sunhelp.org/modules/index.htm, this module has
> > MXCC 3.3.
>
> Ah yes, module-info. I somehow thought the PROM would print MXCC
> revision on normal bootup ...
>
> >
> > --
> > Malte Dehling
> >=20
> > Mail: mdehling [at] math.ruhr-uni-bochum.de
> > Website: http://mdehling.ath.cx/
> > PGP: 2586 A3BF B438 E68E 2B85 C4EA C5A7 AD96 C865 03D2
>
>
>
> --
> Bernd Sieker
>=20
> My other computer runs NetBSD
> -- Allen Briggs
--
Malte Dehling
Mail: mdehling [at] math.ruhr-uni-bochum.de
Website: http://mdehling.ath.cx/
PGP: 2586 A3BF B438 E68E 2B85 C4EA C5A7 AD96 C865 03D2