Subject: Re: SIR Reset with todays sources
To: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
From: Eduardo Horvath <eeh@NetBSD.org>
List: port-sparc64
Date: 03/23/2007 20:35:38
On Fri, 23 Mar 2007, Juergen Hannken-Illjes wrote:

> - In trap.c dopanic() I put the `DEBUGGER(type, tf);' before the first printf.
> 
>   Now got this on the console:
> 
>   kernel trap 30: data access exception
>   kernel trap 34: mem address not aligned
>   ...
>   SIR reset
> 
> - Then reimplemented some kind of trap_trace and got the appended trace.
> 
>   At least entry 122 (data fault on address 0x0b7e6000) looks suspect.
>   Corresponding source is:
> 
> 	0000000001009a70 <copyinstr>:
> 	...
> 	1009a90:       da 73 20 10     stx  %o5, [ %o4 + 0x10 ]
> 	1009a94:       9a 10 00 09     mov  %o1, %o5
> 	1009a98:       c2 ca 02 20     ldsba  [ %o0 ] #ASI_AIUS, %g1
>     ->	1009a9c:       c2 2a 40 00     stb  %g1, [ %o1 ]
> 	1009aa0:       92 02 60 01     inc  %o1

Maybe.  It depends on what it's being copied to.  It could be pageable
kernel memory, it could be taking a protection/refcount fault, or it
could be the buffer cache faulting to map in a buffer cache page.

> 
> Any ideas anyone?
> 

I'm not seeing anything obviously wrong in the trace.  I notice
that you only have the trap entries instrumented.  You really should 
instrument the trap returns as well.  It's more than likely the problem is 
there.  Often what happens is the trap return code takes an MMU fault in 
an inconvenient location which causes it to lose part of the state it's 
trying to restore.  Also, if you can, add a trace point just before
executing the SIR instruction and dump the register state someplace safe.
That will give some insight as to what part of the machine state is stuffed up.
Oh, and do you know which specific SIR is being hit?
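
For illustration, here is a rough user-space sketch in C of a trace buffer
that logs trap returns and a pre-SIR point as well as trap entries.  It is
only a model of the idea, not the kernel's code: the real hooks would have to
live in the assembly trap paths, and every name here (tt_record,
tt_open_traps, TT_SIZE, and so on) is made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TT_SIZE 256u            /* hypothetical size; must be a power of two */
#define TT_MASK (TT_SIZE - 1u)

/* What kind of event a record describes. */
enum tt_kind { TT_ENTRY, TT_RETURN, TT_PRE_SIR };

struct tt_rec {
	enum tt_kind kind;
	uint32_t type;          /* trap type, e.g. 0x30 */
	uint64_t pc;            /* PC at the time of the event */
	uint64_t addr;          /* fault address, 0 if not applicable */
};

static struct tt_rec tt_buf[TT_SIZE];
static unsigned tt_next;        /* total number of records written so far */

void tt_reset(void)
{
	tt_next = 0;
}

/* Append one record, overwriting the oldest once the ring is full. */
void tt_record(enum tt_kind kind, uint32_t type, uint64_t pc, uint64_t addr)
{
	struct tt_rec *r;

	assert(kind == TT_ENTRY || kind == TT_RETURN || kind == TT_PRE_SIR);
	r = &tt_buf[tt_next++ & TT_MASK];
	r->kind = kind;
	r->type = type;
	r->pc = pc;
	r->addr = addr;
}

/* Number of records currently retained in the ring. */
unsigned tt_count(void)
{
	return tt_next < TT_SIZE ? tt_next : TT_SIZE;
}

/*
 * Walk the retained records oldest to newest and count trap entries
 * that have no later return recorded, i.e. traps still "open".
 */
unsigned tt_open_traps(void)
{
	unsigned start = tt_next > TT_SIZE ? tt_next - TT_SIZE : 0;
	unsigned i, depth = 0;

	for (i = start; i < tt_next; i++) {
		const struct tt_rec *r = &tt_buf[i & TT_MASK];

		if (r->kind == TT_ENTRY)
			depth++;
		else if (r->kind == TT_RETURN && depth > 0)
			depth--;
	}
	return depth;
}

/* Print the retained records, oldest first. */
void tt_dump(void)
{
	static const char *const names[] = { "entry", "return", "pre-SIR" };
	unsigned start = tt_next > TT_SIZE ? tt_next - TT_SIZE : 0;
	unsigned i;

	for (i = start; i < tt_next; i++) {
		const struct tt_rec *r = &tt_buf[i & TT_MASK];

		printf("%3u %-7s type %#x pc %#llx addr %#llx\n",
		    i - start, names[r->kind], (unsigned)r->type,
		    (unsigned long long)r->pc, (unsigned long long)r->addr);
	}
}
```

With both entries and returns logged, a trap that was entered but never
returned from shows up as a nonzero tt_open_traps() count, and the last
records before a TT_PRE_SIR record show what was in flight at that point.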

Eduardo