Subject: Re: -current panic with MP
To: Chuck Silvers <chuq@chuq.com>
From: Joseph Sarkes <jsarkes@tiac.net>
List: port-macppc
Date: 10/16/2002 20:33:01
On Wednesday, October 16, 2002, at 01:11 PM, Chuck Silvers wrote:

> hi,
>
> this is most likely due to this change:
>
> date: 2002/10/10 22:37:51;  author: matt;  state: Exp;  lines: +87 -24
> Move pte_spill calls from trap_subr to trap().  Count the number of
> "evictions" and avoid calling pmap_pte_spill if there are no evictions
> for the current pmap.  Make the ISI exception use the default exception
> code.  Remove lots of dead stuff from trap_subr.
>
> Make olink use TAILQ instead of LIST and be sorted with evicted entries
> first and resident entries last.  Make use of this knowledge to make
> pmap_pte_spill do a fast exit.
>
>
>
> but it might just be the MP version of it.
> could you try a non-MP kernel on your box?

I built a kernel with the MP line commented out, and it crashes in the
same way as I reported before.

> also, could you try a kernel from just before the above checkin?

The working MP kernel is 1.6I, built Oct 10 00:10:03 EST. As I said, I can
still do some weird stuff to this kernel with a "1 goto 1" style program
loop: the console locks up, but it appears that the other processor is
still doing a compile (audible disk accesses continue with the preexisting
pattern). One thing that does seem to stop this noise is when the OFW
console is scrolling.

Is there any better way to use this machine with a local terminal? I
haven't managed to get X working yet: Xmacppc tears and isn't usable, and
I haven't managed to get a config file for XFree86 yet.

> you can use (e.g.) cvs up -D "2002/10/10 22:37:00 UTC" with anoncvs to
> get a specific date.
>
> -Chuck
>
>
> On Tue, Oct 15, 2002 at 10:01:34AM -0400, Joseph Sarkes wrote:
>> I don't know if this applies to single processor, but this kernel
>> was built using up-to-date anoncvs, and it panicked. A day ago
>> the same kernel just hung with no info as it was trying to initially
>> mount filesystems, so there is some type of change or instability.
>>
>> anyways, here is what I managed to get from a trace
>>
>> panic+174
>> trap+8ec
>> kernel MCHK trap by route_output+c
>> raw_usrreq+1e8
>> pmap_pvo_enter+244
>> interhand+7c
>> route_usrreq+194
>> sosend+728
>> soo_write+2c
>> dofilewrite+d0
>> sys_write+94
>> syscall_plain+138
>> s_sctrap+138
>>
>> I couldn't get sync to flush the disks
>>
>> I have also been able to hang my sort of stable MP kernel from about a
>> week ago with a looping program while doing other things. The console
>> hangs and I can't do anything but reset the machine to get the console
>> back. I think that one processor was still working away at a compile,
>> but the keyboard hangs, and the debugger is not responding. Perhaps
>> somewhere there is still a non-smp-aware routine in either the debugger
>> or console stuff which can't tell who to talk to?
>>
>> The system is a dual 1GHz Power Mac G4 (older style, whatever its name is).
>>
>> Hope this helps debugging. Let me know if other tests are in order.
>