Subject: Re: Recent macppc kernels hang under load
To: None <port-macppc@netbsd.org>
From: Chuck Silvers <chuq@chuq.com>
List: port-macppc
Date: 09/15/2003 09:28:04
On Mon, Sep 15, 2003 at 10:22:35AM +0100, Ian Fry wrote:
> On Sun, Aug 31, 2003 at 03:06:12AM -0500, Dave Huang wrote:
> > On Sat, Aug 30, 2003 at 03:58:13PM -0700, Chuck Silvers wrote:
> > > I take it that this used to work before 3 or 4 weeks ago...
> > > if so, you could binary search for the check-in that broke it.
> > Well, I'd get random SIGILLs, but the kernel never died... 
> 
> I've seen this too, when trying to build Mozilla on my G3 iBook - the
> build runs for maybe 10 or 15 minutes, and then X restarts (it looks
> like the X server gets killed, rather than any of the compiler processes).
> I turned on the logsigexit sysctl and that reported SIGILL killed the
> process.

hmm, so the non-MP case has problems too.  I was hoping that it was
an MP-only bug.

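SIGILL seems likely to be caused by missing icache invalidation:
if stale instructions are left in the icache after a page is reused,
the CPU can end up executing garbage.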
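
for reference, here is a minimal sketch of the usual PowerPC sequence for
making freshly written instructions safe to execute (dcbst, sync, icbi,
sync, isync). this is not the kernel's actual code. the names sync_icache
and lines_covered are made up for illustration, and the 32-byte cache line
is an assumption that holds for G3/G4-class CPUs but not for everything.
on non-PowerPC hosts the sketch falls back to the compiler builtin.

```c
#include <stddef.h>
#include <stdint.h>

/* assumed cache line size; 32 bytes on G3/G4-class CPUs */
#define CACHE_LINE 32u

/* number of cache lines overlapped by [addr, addr + len) */
static size_t
lines_covered(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;
	first = addr & ~(uintptr_t)(CACHE_LINE - 1);
	last = (addr + len - 1) & ~(uintptr_t)(CACHE_LINE - 1);
	return (size_t)((last - first) / CACHE_LINE) + 1;
}

/* make instructions just written to [start, start + len) safe to execute */
static void
sync_icache(void *start, size_t len)
{
#if defined(__GNUC__) && (defined(__powerpc__) || defined(__ppc__))
	uintptr_t base = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
	size_t i, n = lines_covered((uintptr_t)start, len);

	/* push the newly written data out of the dcache to memory */
	for (i = 0; i < n; i++)
		__asm__ volatile ("dcbst 0,%0"
		    : : "r" (base + i * CACHE_LINE) : "memory");
	__asm__ volatile ("sync" : : : "memory");

	/* throw away any stale copies of those lines in the icache */
	for (i = 0; i < n; i++)
		__asm__ volatile ("icbi 0,%0"
		    : : "r" (base + i * CACHE_LINE) : "memory");
	__asm__ volatile ("sync; isync" : : : "memory");
#elif defined(__GNUC__)
	/* other CPUs: let the compiler emit whatever is needed (maybe nothing) */
	__builtin___clear_cache((char *)start, (char *)start + len);
#else
	(void)start;
	(void)len;
#endif
}
```

if the icbi half of this is skipped somewhere it is needed, the CPU can
keep fetching whatever the icache held before, which would fit the
random SIGILLs.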


> > I did the binary search, and I guess I was misremembering when I said I
> > first noticed the problem 3 or 4 weeks ago. Looks like it started around
> > Aug 12... perhaps it was Matt Thomas's "cleanup/rework cpu_switch*,
> > switch_exit, Idle routine" in sys/arch/powerpc/powerpc?
> 
> I can't remember when this started happening, but the only time I see this
> is trying to build Mozilla - just a plain 'make' is enough to trigger the
> problem for me.

ok, I'll try that as well when I get back to this.
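
for anyone else who wants to repeat the binary search over check-ins
mentioned above, the idea is roughly the sketch below. it assumes the
kernels to test can be numbered in date order and that the breakage is
monotonic (once a kernel is bad, every later one is bad too). first_bad,
is_bad_fn and bad_from are hypothetical names; in practice the "test" step
is building and booting that kernel and running the load that hangs.

```c
#include <stddef.h>

/* hypothetical callback: returns nonzero if kernel number idx misbehaves */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * given a known-good index and a known-bad index (good < bad),
 * return the index of the first bad kernel.
 */
static size_t
first_bad(size_t good, size_t bad, is_bad_fn test, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (test(mid, ctx))
			bad = mid;	/* breakage is at or before mid */
		else
			good = mid;	/* breakage is after mid */
	}
	return bad;
}

/* stand-in predicate for trying the sketch: bad from *ctx onwards */
static int
bad_from(size_t idx, void *ctx)
{
	return idx >= *(const size_t *)ctx;
}
```

each round halves the range, so a month of daily kernels needs only
about five builds.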


> Is there anything I can do to help track this down? I'll try adding DEBUG
> and DIAGNOSTIC to my kernel tonight and give it another go.

this kind of low-level problem isn't covered very well by the debug code, alas.
but maybe you'll see some other symptoms that are more enlightening.

-Chuck