Subject: SMP problems on macppc
To: NetBSD tech-kern mailing list <tech-kern@netbsd.org>
From: Michael Lorenz <macallan@netbsd.org>
List: tech-kern
Date: 10/05/2007 23:30:48
This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--Apple-Mail-6-733368081
Content-Type: multipart/mixed; boundary=Apple-Mail-5-733367753


--Apple-Mail-5-733367753
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
	charset=US-ASCII;
	delsp=yes;
	format=flowed

Hello,

now that SMP on macppc works again I ran into another problem:
occasionally programs would die with SIGILL, apparently at random.
Examining the resulting core files I found that they all died inside
PLTs. Disassembling around the fault addresses showed something like this:

li r11,something
b somewhere
li r11,somethingelse
b somewhereelse
etc.

Apparently the caches weren't always synced after writing PLT
entries, so the other CPU didn't see them -> boom.
I came up with the attached patch; basically it adds more paranoid
cache syncing to the powerpc-specific part of ld.elf_so. It seems to
work fine: with a patched ld.elf_so my dual G4 finished building
both KDE from pkgsrc and a complete userland (with -j2). Without the
patch, either build would have triggered a SIGILL long before finishing.

Unfortunately this seems to be only one class of SIGILL problems.
When building tools I got a SIGILL in gcc's genattrtab which did not
come from a PLT entry. As in the other case, neither the binary nor
the core file showed an illegal instruction at the fault address, so I
guess there are more wrong or missing __syncicache() calls and the case
above is just the most common one. I've been unable to reproduce the
second case so far.

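The patch changes two things: the first hunk re-enables the
commented-out __syncicache(where - 3, 12) call, and the other hunks
sync from the saved start of the entry instead of from where - 3.
I am not sure both are necessary or even useful.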
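
A minimal sketch of why syncing from a saved start pointer may matter,
assuming a stub can be two or three instruction words long depending on
the relocation offset (the diff does not show that part of the code).
emit_stub() is a hypothetical helper, and the GCC/Clang builtin
__builtin___clear_cache() stands in for __syncicache() so the example
builds outside ld.elf_so:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the ld.elf_so code: write one PLT-style stub
 * at 'where' and sync exactly the words that were written.
 * 'distance' is the byte offset to the branch target (an assumption;
 * the real code computes it from the address of the branch itself).
 * Returns the number of instruction words emitted.
 */
size_t
emit_stub(uint32_t *where, uint32_t reloff, uint32_t distance)
{
	uint32_t *start = where;	/* remember where the stub begins */

	if (reloff < 32768) {
		/* li r11,reloff */
		*where++ = 0x39600000 | reloff;
	} else {
		/* lis r11,ha(reloff); addi r11,r11,lo(reloff) */
		*where++ = 0x3d600000 | (((reloff + 0x8000) >> 16) & 0xffff);
		*where++ = 0x396b0000 | (reloff & 0xffff);
	}
	/* b target */
	*where++ = 0x48000000 | (distance & 0x03fffffc);

	/*
	 * Stand-in for __syncicache(start, bytes): the range is
	 * [start, where), so it covers what was written whether the
	 * stub is two or three words long.
	 */
	__builtin___clear_cache((char *)start, (char *)where);

	return (size_t)(where - start);
}
```

With a two-word stub, a range starting at where - 3 would begin one word
before the stub, while start always points at its first instruction.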

If this indeed helps then it should go into 4.0 as well in one form  
or another.

have fun
Michael

--Apple-Mail-5-733367753
Content-Transfer-Encoding: 7bit
Content-Type: application/octet-stream;
	x-unix-mode=0644;
	name=ldso.patch
Content-Disposition: attachment;
	filename=ldso.patch

Index: ppc_reloc.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/arch/powerpc/ppc_reloc.c,v
retrieving revision 1.40
diff -u -w -r1.40 ppc_reloc.c
--- ppc_reloc.c	23 May 2006 16:27:41 -0000	1.40
+++ ppc_reloc.c	5 Oct 2007 23:12:00 -0000
@@ -239,7 +239,7 @@
 		 * in _rtld_setup_pltgot() after all the entries have been
 		 * initialized.
 		 */
-		/* __syncicache(where - 3, 12); */
+		__syncicache(where - 3, 12);
 	}
 
 	return 0;
@@ -271,6 +271,7 @@
 		__syncicache(where, 4);
 	} else {
 		Elf_Addr *pltcall, *jmptab;
+		Elf_Word *start = where;
 		int N = obj->pltrelalim - obj->pltrela;
 	
 		/* Entries beyond 8192 take twice as much space. */
@@ -294,7 +295,7 @@
 		/* b	pltcall	*/
 		distance = (Elf_Addr)pltcall - (Elf_Addr)where;
 		*where++ = 0x48000000 | (distance & 0x03fffffc);
-		__syncicache(where - 3, 12);
+		__syncicache(start, 12);
 	}
 
 	if (tp)

--Apple-Mail-5-733367753--

--Apple-Mail-6-733368081--