Subject: SMP problems on macppc
To: NetBSD tech-kern mailing list <tech-kern@netbsd.org>
From: Michael Lorenz <macallan@netbsd.org>
List: tech-kern
Date: 10/05/2007 23:30:48
Hello,
now that SMP on macppc works again I ran into another problem -
occasionally programs would die with SIGILL, apparently at random.
Examining the resulting core files, I found that they all died in
PLTs. Disassembling at the fault addresses showed something like
this:
li r11,something
b somewhere
li r11,somethingelse
b somewhereelse
etc.
Apparently the caches weren't always synced after writing PLT
entries, so the other CPU didn't see them -> boom.
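To illustrate the failure described above, here is a minimal sketch of
writing a two-instruction stub and syncing exactly the range that was
written. It is not the ld.elf_so code: write_plt_stub() and
sync_icache() are hypothetical names, and the GCC/Clang builtin
__builtin___clear_cache is assumed as a portable stand-in for the
PowerPC-specific __syncicache(). Only the branch encoding is taken from
the patch.

```c
/*
 * Sketch only: write_plt_stub() and sync_icache() are hypothetical
 * names, not part of ld.elf_so. __builtin___clear_cache (GCC/Clang)
 * is assumed here as a stand-in for __syncicache().
 */
#include <assert.h>
#include <stdint.h>

/* Make freshly written instructions visible to instruction fetch. */
static void
sync_icache(void *start, void *end)
{
	__builtin___clear_cache((char *)start, (char *)end);
}

/*
 * Write "li r11,index; b target" at 'where' and sync the range that
 * was written. Returns a pointer just past the last word written.
 */
static uint32_t *
write_plt_stub(uint32_t *where, uint16_t index, uintptr_t target)
{
	uint32_t *start = where;	/* remember where the stub begins */
	intptr_t distance;

	/* li r11,index (encoded as addi r11,0,index); keep it positive */
	assert(index < 0x8000);
	*where++ = 0x39600000u | index;

	/* b target: displacement is relative to the branch itself */
	distance = (intptr_t)target - (intptr_t)where;
	assert(distance >= -0x2000000 && distance < 0x2000000);
	*where++ = 0x48000000u | ((uint32_t)distance & 0x03fffffcu);

	/* Sync from the saved start, before anything can execute the stub. */
	sync_icache(start, where);
	return where;
}
```

Saving the start pointer before the writes means the synced range does
not depend on how many words were emitted in between.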
I came up with the attached patch; basically it adds more paranoid
cache syncing to the powerpc-specific part of ld.elf_so. It seems to
work fine: with a patched ld.elf_so my dual G4 finished building
both KDE from pkgsrc and a complete userland (with -j2). Without the
patch either would have triggered a SIGILL long before finishing.
Unfortunately this seems to be only one class of SIGILL problems -
when building tools I got a SIGILL in gcc's genattrtab which did not
come from a PLT entry. As with the other case, neither the binary nor
the core file showed an illegal instruction at the fault address, so I
guess there are more wrong or missing __syncicache() calls and the
case above is just the most common one. I've been unable to reproduce
the 2nd case so far.
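The patch changes two things: it re-enables a commented-out
__syncicache() call, and it makes a second call sync from a pointer
saved before the writes (start) instead of where - 3. I am not sure
both are necessary or even useful.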
If this indeed helps then it should go into 4.0 as well in one form
or another.
have fun
Michael
Attachment: ldso.patch
Index: ppc_reloc.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/arch/powerpc/ppc_reloc.c,v
retrieving revision 1.40
diff -u -w -r1.40 ppc_reloc.c
--- ppc_reloc.c 23 May 2006 16:27:41 -0000 1.40
+++ ppc_reloc.c 5 Oct 2007 23:12:00 -0000
@@ -239,7 +239,7 @@
* in _rtld_setup_pltgot() after all the entries have been
* initialized.
*/
- /* __syncicache(where - 3, 12); */
+ __syncicache(where - 3, 12);
}
return 0;
@@ -271,6 +271,7 @@
__syncicache(where, 4);
} else {
Elf_Addr *pltcall, *jmptab;
+ Elf_Word *start = where;
int N = obj->pltrelalim - obj->pltrela;
/* Entries beyond 8192 take twice as much space. */
@@ -294,7 +295,7 @@
/* b pltcall */
distance = (Elf_Addr)pltcall - (Elf_Addr)where;
*where++ = 0x48000000 | (distance & 0x03fffffc);
- __syncicache(where - 3, 12);
+ __syncicache(start, 12);
}
if (tp)