Subject: Re: Recent macppc kernels hang under load
To: Chuck Silvers <chuq@chuq.com>
From: Dave Huang <khym@azeotrope.org>
List: port-macppc
Date: 08/31/2003 03:06:12

On Sat, Aug 30, 2003 at 03:58:13PM -0700, Chuck Silvers wrote:
> hi,
> 
> I take it that this used to work before 3 or 4 weeks ago...
> if so, you could binary search for the check-in that broke it.

Well, I'd get random SIGILLs, but the kernel never died... I did the
binary search, and I guess I was misremembering when I said I first
noticed the problem 3 or 4 weeks ago. Looks like it started around Aug
12... perhaps it was Matt Thomas's "cleanup/rework cpu_switch*,
switch_exit, Idle routine" in sys/arch/powerpc/powerpc?
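
For anyone repeating the search, the date bisection described above can be sketched roughly as follows. This is only an illustration, not the procedure actually used here: the function name first_bad_date and the is_bad callback are hypothetical, and in practice is_bad would be a manual step (check out sources as of that date, e.g. with "cvs update -D", build a kernel, and run the build.sh -j2 load test). It also assumes the breakage is monotonic, i.e. every kernel after the bad check-in fails.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_bad):
    """Return the earliest date in (good, bad] for which is_bad(d) is True.

    Assumptions (hypothetical helper, not part of any NetBSD tool):
    `good` is a date known to work, `bad` is a date known to fail, and
    once the tree is broken it stays broken for all later dates.
    """
    while (bad - good).days > 1:
        # Probe the midpoint of the remaining date range.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid    # breakage is at or before mid
        else:
            good = mid   # breakage is after mid
    return bad


if __name__ == "__main__":
    # Stand-in predicate for demonstration only; a real test means
    # building and load-testing a kernel from that date's sources.
    culprit = first_bad_date(date(2003, 8, 1), date(2003, 8, 30),
                             lambda d: d >= date(2003, 8, 12))
    print(culprit)
```

With a 29-day window this needs about five kernel builds rather than one per day. Because the hang is intermittent, a date should only be marked good after the load test has run long enough to be convincing.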

Also, an August 13 kernel panicked with this during a build.sh -j2
distribution:

mp_save_fpu_proc{1} pid = 13887.1, fpcpu->ci_cpuid = 0
panic: mp_save_fpu_proc
Stopped in pid 13887.1 (nbmake) at netbsd:cpu_Debugger+0x10: lwz r0, r1, 0x14
db{1}> t
0xd5920ce0: at panic+18c
0xd5920da0: at mp_save_fpu_lwp+80
0xd5920dc0: at save_fpu_lwp+34
0xd5920dd0: at cpu_lwp_fork+84
0xd5920e00: at uvm_lwp_fork+9c
0xd5920e20: at newlwp+124
0xd5920e60: at fork1+51c
0xd5920ec0: at sys___vfork14+30
0xd5920ed0: at syscall_plain+13c
0xd5920f40: user SC trap #282 by 0x418ade10: srr1=0xd032
            r1=0xffffdce0 cr=0x40000042 xer=0 ctr=0x418ade08

I rebooted and ran build.sh again, and it hung the machine.

> also, could you send me the output of "ofdump -p" on this box?
> get ofdump from ftp://ftp.netbsd.org/pub/incoming/matt/ofdump.c

Sent in private email...