Port-sparc64 archive


Re: Ultrasparc III+ kernel panic



On Tue, 24 Feb 2015, BERTRAND Joël wrote:

> matthew green wrote:
> > > Hm.  From what I remember, f000xxxx is inside OBP.
> > 
> > that's correct :-)
> > 
> > > Instead of randomly swapping out hardware you really should try to
> > > diagnose the problem.  I'd turn on ddb and traptrace in the kernel and
> > > examine the contents of the traptrace buffer after the panic.  That should
> > > tell us the sequence of traps that caused the panic.
> > 
> > FWIW, traptrace never was updated for SMP.
> > 
> 
> 	Is there any hope of quickly getting a fix so that traptrace output
> can be obtained in syslog? I'm trying to reproduce this bug on a Blade 2000
> I have at home, without any success.

Putting traptrace back in is not trivial.  It basically involves taking 
all of the traptrace code that was removed in locore.s version 1.214, 
enhancing it for SMP, and reinserting it into locore.s.  How good are your 
SPARC assembly language skills?

There are two ways to implement this.  We could use the existing traptrace 
buffer in locore.s.  The traptrace code currently loads the trap trace 
pointer, writes an entry one word at a time, and then stores the new 
pointer position.  To make this SMP friendly, it would need to calculate 
the size of the entry up front, use a CAS loop to update the pointer in 
place, and then write the contents in one go.  Doable, but the code would 
need restructuring.
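
As a rough illustration of the reserve-then-write idea, here is a minimal 
sketch in C11 rather than SPARC assembly.  The names trace_buf, trace_off, 
trace_reserve, trace_record and the struct trace_entry layout are all made 
up for the example; none of them come from locore.s.  In the real trap 
handlers the reservation would presumably be a casx loop on the trace 
pointer.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_BUF_SIZE 4096          /* hypothetical buffer size in bytes */

/* Hypothetical entry layout; the real traptrace record differs. */
struct trace_entry {
    uint64_t tstate;
    uint64_t tpc;
    uint64_t tt;
    uint64_t sp;
};

static unsigned char trace_buf[TRACE_BUF_SIZE];
static _Atomic size_t trace_off;     /* shared offset of the next free byte */

/* Reserve len bytes with a CAS loop and return the start of the slot. */
static void *trace_reserve(size_t len)
{
    size_t old = atomic_load(&trace_off);
    size_t start, next;

    do {
        /* Wrap to the start of the buffer if the entry would not fit. */
        start = (old + len > TRACE_BUF_SIZE) ? 0 : old;
        next = start + len;
        /* On failure, old is reloaded with the current offset. */
    } while (!atomic_compare_exchange_weak(&trace_off, &old, next));

    return trace_buf + start;
}

/* The size is known up front, so the entry is written in one go. */
static void trace_record(const struct trace_entry *e)
{
    memcpy(trace_reserve(sizeof *e), e, sizeof *e);
}
```

Note that this only makes the pointer update safe.  If the buffer wraps 
while a slow CPU is still filling in its slot, that entry can still be 
overwritten, so the buffer would need to be large enough for that not to 
matter in practice.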

The other option would be to take some space from the 64KB per-CPU private 
page.  That page currently holds the interrupt stack and cpu_info.  We 
should be able to easily steal a couple of KB from there.  Then just 
change the macro definitions for trap_trace, trap_trace_ptr, 
trap_trace_end, and trap_trace_dis to point somewhere relative to 
CPUINFO_VA, and carefully reinsert the code into locore.s.  Oh, and the 
ddb routines to dump the traptrace buffer need to be changed to look in 
the new locations.
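
To show the shape of the per-CPU variant, here is a small C sketch.  It 
is only a model: cpu_private, NCPU, TRACE_ENTRIES, trace_record_cpu and 
trace_count_all are invented names, and the array stands in for the 
per-CPU page that each CPU would see at CPUINFO_VA.  Since each CPU only 
ever writes to its own buffer, no atomic operations are needed.

```c
#include <stddef.h>
#include <stdint.h>

#define NCPU          4              /* hypothetical CPU count */
#define TRACE_ENTRIES 64             /* hypothetical entries per CPU */

/* Hypothetical entry layout; the real traptrace record differs. */
struct trace_entry {
    uint64_t tstate;
    uint64_t tpc;
    uint64_t tt;
    uint64_t sp;
};

/* Stand-in for the space stolen from each CPU's private page. */
struct cpu_private {
    struct trace_entry trace[TRACE_ENTRIES];
    size_t trace_next;               /* index of the next slot to write */
    size_t trace_total;              /* entries recorded since boot */
    int trace_dis;                   /* nonzero disables tracing */
};

static struct cpu_private cpu_private[NCPU];

/* Record an entry in the calling CPU's own ring buffer; no locking. */
static void trace_record_cpu(unsigned cpu, struct trace_entry e)
{
    struct cpu_private *cp = &cpu_private[cpu];

    if (cp->trace_dis)
        return;
    cp->trace[cp->trace_next] = e;
    cp->trace_next = (cp->trace_next + 1) % TRACE_ENTRIES;
    cp->trace_total++;
}

/* A dump routine now has to visit every CPU's buffer, not just one. */
static size_t trace_count_all(void)
{
    size_t total = 0;

    for (unsigned cpu = 0; cpu < NCPU; cpu++)
        total += cpu_private[cpu].trace_total;
    return total;
}
```

The last function is the counterpart of the ddb change: anything that 
reads the trace has to loop over all CPUs instead of looking at a single 
global buffer.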

I think the first option would be preferable, but it potentially has more 
impact on the trap handling, and any changes to the trap code could have 
unexpected side effects.

Eduardo



