Subject: Re: pthreads and SIGILL on m68k
To: Aaron J. Grier <agrier@poofygoof.com>
From: Paul Ripke <stix@stix.id.au>
List: port-m68k
Date: 11/19/2006 23:16:29

On Fri, Nov 17, 2006 at 10:45:32AM -0800, Aaron J. Grier wrote:
> On Thu, Nov 16, 2006 at 07:33:29PM +1100, Paul Ripke wrote:
> > Looking now, the PCs of the failing threads are very similar. I've
> > used PTHREAD_DEBUGLOG, but can't see anything of great meaning there -
> > anyone have any ideas, before I go digging deeper?
> 
> what is the illegal instruction that's screwing up the show?
> 
> x/i 0x0414d6ec
> x/i 0x000058e4
> x/i 0x049ff808
> 
> might give some clues.

I've lost track of which core the above was from, but here's another
one:

(gdb) thr app all bt

Thread 3 (Thread 22 ()):
#0  0x06bff744 in ?? ()

Thread 2 (LWP 1):
#0  0x040584c2 in write () from /usr/lib/libc.so.12
#1  0x04022fca in write () from /usr/lib/libpthread.so.0
#2  0x000031be in main (argc=1024, argv=0x0) at fblckgen.c:179

Thread 1 (LWP 2):
#0  0x06bff744 in ?? ()
#0  0x06bff744 in ?? ()
(gdb) x/i 0x06bff744
0x6bff744:      03277
(gdb) x/i 0x040584c2
0x40584c2 <write+4>:    bcss 0x40584b8 <writev+10>

> this is all with gcc4, right?

Nope, on netbsd-4, mac68k still uses gcc 3.3.6. I'm also 99.9% sure
it's not a codegen issue. It only happens with pthreads, and only
for NLWP > 1.

While a cross-compiled named dies within seconds of startup, and
some of my home-grown pthread programs die within minutes, I've yet
to come up with a reliable testcase. The fact that some code runs
for minutes - through the same loops - suggests there's a race
somewhere in the pthread code.
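
For anyone who wants to try reproducing this, below is a minimal sketch
of the kind of stress test that might help narrow it down. It is not
taken from fblckgen.c, and it is not known to trigger the SIGILL. The
names run_stress, worker, struct worker_arg and MAX_THREADS are all
hypothetical, and writing to /dev/null is an assumption, chosen only
because the surviving backtrace above sits in write(). Each thread
loops over write() on a shared descriptor, so several threads enter and
leave the libpthread write() wrapper at the same time.

```c
/* Hypothetical stress test: several threads call write(2) concurrently. */
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_THREADS 16          /* arbitrary upper bound for this sketch */

struct worker_arg {
    int  fd;                    /* descriptor shared by all threads */
    long iterations;            /* number of writes to attempt */
    long completed;             /* number of writes that fully succeeded */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    char buf[512] = {0};
    long i;

    assert(arg != NULL);
    for (i = 0; i < arg->iterations; i++) {
        /* Count only complete writes; short or failed writes are skipped. */
        if (write(arg->fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf))
            arg->completed++;
    }
    return NULL;
}

/* Returns the total number of successful writes, or -1 on setup failure. */
long run_stress(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg args[MAX_THREADS];
    long total = 0;
    int fd, i, started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    /* Assumption: /dev/null is a cheap target that exercises write(). */
    fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;

    for (i = 0; i < nthreads; i++) {
        args[i].fd = fd;
        args[i].iterations = iterations;
        args[i].completed = 0;
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually created. */
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += args[i].completed;
    }

    close(fd);
    return (started == nthreads) ? total : -1;
}
```

Calling run_stress(4, 1000000) from a small main() and linking with
-lpthread would be one way to use it. If there is a race, raising the
thread count or the iteration count should shorten the time to failure;
a single-threaded run, run_stress(1, n), gives a baseline to compare
against.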

-- 
Paul