Subject: Dual SM50 troubles
To: NetBSD-sparc mailing list <port-sparc@netbsd.org>
From: Michael Lorenz <macallan@netbsd.org>
List: port-sparc
Date: 08/01/2007 14:31:45
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Now that I've got a second SM50, I did some debugging here. The results
are mixed though. Before, an SMP kernel wouldn't even go multiuser: too
many programs involved in the startup process would crash all over the
place.
With my fixes in place my SS20 goes multiuser most of the time (roughly
8 out of 10 attempts). Building a kernel with -j2 can go on for quite
some time with compilers running on both CPUs, but eventually something
will crash and at some point the machine will deadlock. The crashes
generally point to caching trouble: segfaults in weird places, bus
errors that don't happen when running on only one CPU, or illegal
instruction traps. It's all pretty much random, although some programs
(especially sh) seem to be more prone to crashing than others.
The problem - as far as I know - is that most of the SuperSPARC's SMP
coherency is handled by the external cache controller, and the SM50
doesn't have one. Just google around a bit and you'll find reports of
similar problems on Linux, and Sun claims that the SM50 isn't SMP
capable at all.
So what we'd have to do is use xcalls to keep the CPU caches in sync
(or at least keep them from being out of sync in the places we care
about). We already do that for HyperSPARC modules; unfortunately, just
recycling those cache ops results in a deadlock.
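
To illustrate the xcall idea, here is a rough sketch: a toy user-space
model in C, not NetBSD kernel code. All names in it (toy_cache,
cpu_read, cpu_write, xcall_invalidate) and the cache geometry are made
up for the example. It assumes write-through, direct-mapped private
caches with no hardware coherency, and shows a second CPU reading stale
data unless the writer cross-calls it to drop its copy of the line.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU   2    /* two CPU modules, as in the dual-SM50 SS20 */
#define NLINES 8    /* toy direct-mapped cache size (hypothetical) */
#define MEMSZ  64   /* toy memory size in words (hypothetical) */

/* One private cache per CPU; nothing keeps them coherent in hardware. */
struct toy_cache {
    bool valid[NLINES];
    int  tag[NLINES];
    int  data[NLINES];
};

static int memory[MEMSZ];
static struct toy_cache caches[NCPU];

/* Read through the local cache, filling the line on a miss. */
static int cpu_read(int cpu, int addr)
{
    assert(cpu >= 0 && cpu < NCPU && addr >= 0 && addr < MEMSZ);
    struct toy_cache *c = &caches[cpu];
    int line = addr % NLINES;

    if (!c->valid[line] || c->tag[line] != addr) {
        c->valid[line] = true;
        c->tag[line] = addr;
        c->data[line] = memory[addr];
    }
    return c->data[line];
}

/* Stand-in for a cross-call: every other CPU drops its copy of addr. */
static void xcall_invalidate(int from_cpu, int addr)
{
    int line = addr % NLINES;

    for (int cpu = 0; cpu < NCPU; cpu++) {
        if (cpu == from_cpu)
            continue;
        struct toy_cache *c = &caches[cpu];
        if (c->valid[line] && c->tag[line] == addr)
            c->valid[line] = false;
    }
}

/*
 * Write-through store. With use_xcall false the other CPU keeps
 * whatever stale value it has cached for this address.
 */
static void cpu_write(int cpu, int addr, int value, bool use_xcall)
{
    assert(cpu >= 0 && cpu < NCPU && addr >= 0 && addr < MEMSZ);
    struct toy_cache *c = &caches[cpu];
    int line = addr % NLINES;

    memory[addr] = value;
    c->valid[line] = true;
    c->tag[line] = addr;
    c->data[line] = value;
    if (use_xcall)
        xcall_invalidate(cpu, addr);
}
```

In this model, if CPU 1 has already read an address and CPU 0 then
stores to it without the cross-call, CPU 1 keeps seeing the old value.
With the cross-call, its next read misses and refetches from memory.
The real thing is harder, since the remote CPU has to take the xcall
and acknowledge it, which is where a deadlock can creep in.
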
I have no idea how this could have worked on NetBSD 2.0, unless the
compiler was broken enough to generate enough cache thrashing to mask
the problem.

Any additional ideas & insights?

have fun
Michael
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iQEVAwUBRrDRkcpnzkX8Yg2nAQJPRgf8DSmL/PWFtQS8TrLCUr5BaFCg8tiKiZIP
VWa7O9T89Rglm5g/pQk1m5V8OLUs6T8/7d8iVzp8Ezop3MeFiJ5GCjGsXzFornkq
KGT5wjDF6dKb3EQ/paXSJvTD0Dv4cbJsEobpoBhO8vDll03CNsXYuL/iXTNmxUEl
d0BLxySaX/G40O3xAjyKAzIJ+27TQXc/5DkppV2vjZUh5TwO9UTcULZMonZRiLwB
ARbGyiEpKDQ9xGbmubCtRjaNbppVITPppx9mfJfrcyOlk+gTBgW/1zY8C+KbSZ9n
KZNinvMn/e2N96+a8rgGQovL3wjkBjFFgna5KpRtH2IviuSa9tK2CQ==
=w/+w
-----END PGP SIGNATURE-----