Subject: Re: SIR Reset with todays sources
To: Martin Husemann <martin@duskware.de>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: port-sparc64
Date: 03/24/2007 11:28:19
On Fri, Mar 23, 2007 at 11:30:09PM +0100, Juergen Hannken-Illjes wrote:
> On Fri, Mar 23, 2007 at 11:21:59PM +0100, Martin Husemann wrote:
> > On Fri, Mar 23, 2007 at 10:03:29AM +0100, Juergen Hannken-Illjes wrote:
> > > Any ideas anyone?
> > 
> > Could you try to find out what %g4 contains when we hit the SIR?
> > It should have PSTATE and ASI from the previous trap. Not sure if you already
> > documented the "current" %tstate value, if not, that would be interesting
> > too.
> > 
> > Martin
> 
> Looks like this SIR is the result of traps during traps.  After putting the
> call to `DEBUGGER(type, tf);' before the `printf("trap type 0x%x: cpu %d...'
> at line 575 of trap.c I get:
> 
> 	Starting file system checks:
> 	/dev/rsd0a: file system is clean; not checking
> 	/dev/rsd1e: file system is clean; not checking
> 	/dev/rsd1f: file system is clean; not checking
> 	/dev/rsd1g: file sykernel trap 30: data access exception
> 	stem is cleankernel trap 34: mem address not aligned
> 	kernel trap 34: mem address not aligned
> 	kernel trap 34: mem address not aligned
> 	kernel trap 34: mem address not aligned
> 	kernel trap 34: mem address not aligned
> 	kernel trap 34: mem address not aligned
> 	kernel trap 34: mem address not aligned
> 	
> 	SIR Reset

So I tried to dump more state from `trap()' at the first kernel trap
0x30 (data access exception).  With the appended diff I get:

    Starting file system checks:
    /dev/rsd0a: file system is clean; not checking
    /dev/rsd1e: file system is clean; not checking
    /dev/rsd1f: file system is clean; no
    Trapframe 0xe0016da0:
    tstate: 58000603        pc: 13f2d20     npc: 13f2d24    fault: 0
    kstack: 0       y: 0    pil: 10 oldpil: 10      fault: 0        tt: 30
    Globals:
    0000000000000000 0000000000000001 00000000e0018000 00000000e00d25f0
    0000000000000000 000000000000000c 0000000000000000 000000004074fce0
    outs:
    0000000000000001 0000000000000000 000000000c9a7e00 0000000000000001
    000000004090bf80 000000000000001d 00000000e00166d1 00000000013c6390
    CPUINFO_VA: 00000000e0018000
    self: 0x1       curlwp: 0xf     cpcb: 0xc9a7261 next: 0x12648e4

    Trapframe 0xe0016a30:
    tstate: 441d000601      pc: 13f7e04     npc: 13f7e08    fault: 0
    kstack: 0       y: 0    pil: 15 oldpil: 15      fault: 0        tt: 34
    Globals:
    0000000000000000 0000000000000001 0000000000000000 0000000000000000
    0000000000000001 000000000189d400 0000000000000000 000000004074fce0
    outs:
    0000000001615c00 0000000000000001 000000000000000f 00000000013f2d20
    00000000012648e4 000000000000000f 00000000e0016361 00000000013f7df0
    CPUINFO_VA: 00000000e0018000
    self: 0x1       curlwp: 0xf     cpcb: 0xc9a7261 next: 0x12648e4

and so on until

    t checking
    /dev/rsd1g: file system is clean
    SIR Reset
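
Regarding the question about PSTATE and ASI: the two tstate values above can
be split into fields by hand.  The following is only a minimal sketch, assuming
the SPARC V9 TSTATE layout (CCR in bits 39:32, ASI in bits 31:24, PSTATE in
bits 19:8, CWP in bits 4:0).  The names `tstate_fields', `decode_tstate' and
`print_tstate' are made up for illustration and are not taken from the kernel
sources.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Fields packed into TSTATE.  The bit positions below are an assumption
 * based on the SPARC V9 layout, not copied from the NetBSD headers.
 */
struct tstate_fields {
	unsigned ccr;		/* condition codes, bits 39:32 */
	unsigned asi;		/* ASI register,    bits 31:24 */
	unsigned pstate;	/* PSTATE,          bits 19:8  */
	unsigned cwp;		/* window pointer,  bits 4:0   */
};

/* Split a raw 64-bit tstate value into its fields. */
static struct tstate_fields
decode_tstate(uint64_t tstate)
{
	struct tstate_fields f;

	f.ccr = (unsigned)((tstate >> 32) & 0xff);
	f.asi = (unsigned)((tstate >> 24) & 0xff);
	f.pstate = (unsigned)((tstate >> 8) & 0xfff);
	f.cwp = (unsigned)(tstate & 0x1f);
	return f;
}

/* Print one decoded value in the style of the trapframe dump. */
static void
print_tstate(uint64_t tstate)
{
	struct tstate_fields f = decode_tstate(tstate);

	printf("tstate: %llx\tccr: %x\tasi: %x\tpstate: %x\tcwp: %u\n",
	    (unsigned long long)tstate, f.ccr, f.asi, f.pstate, f.cwp);
}
```

Calling `print_tstate(0x58000603ULL)' and `print_tstate(0x441d000601ULL)' from
a small `main()' decodes the two frames shown above.  Under this layout both
have PSTATE 0x6, which would be PRIV|IE if the V9 bit assignments apply, and
the ASI fields come out as 0x58 and 0x1d.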

At least `curcpu() == CPUINFO_VA->ci_self' is bogus: it reads 0x1.
The pc of the first trap, 0x13f2d20, is the first instruction of pmap::ctx_free:

    00000000013f2d20 <ctx_free>:
    13f2d20:       9d e3 bf 40     save  %sp, -192, %sp
    13f2d24:       d2 06 20 48     ld  [ %i0 + 0x48 ], %o1
    13f2d28:       80 a2 60 00     cmp  %o1, 0

What other information could be useful?

Index: trap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/sparc64/sparc64/trap.c,v
retrieving revision 1.142
diff -p -u -2 -r1.142 trap.c
--- trap.c	4 Mar 2007 06:00:51 -0000	1.142
+++ trap.c	24 Mar 2007 10:19:34 -0000
@@ -569,5 +569,41 @@ extern void db_printf(const char * , ...
 dopanic:
 			trap_trace_dis = 1;
-
+(void)splhigh();
+printf("\n");
+printf("Trapframe %p:\ntstate: %llx\tpc: %llx\tnpc: %llx\tfault: %llx\n",
+    tf, (unsigned long long)tf->tf_tstate,
+    (unsigned long long)tf->tf_pc,
+    (unsigned long long)tf->tf_npc,
+    (unsigned long long)tf->tf_fault);
+printf("kstack: %llx\ty: %x\tpil: %d\toldpil: %d\tfault: %llx\ttt: %x\t\nGlobals:\n", 
+    (unsigned long long)tf->tf_kstack,
+    (int)tf->tf_y, (int)tf->tf_pil, (int)tf->tf_oldpil,
+    (unsigned long long)tf->tf_fault, (int)tf->tf_tt);
+printf("%016llx %016llx %016llx %016llx\n",
+    (unsigned long long)tf->tf_global[0],
+    (unsigned long long)tf->tf_global[1],
+    (unsigned long long)tf->tf_global[2],
+    (unsigned long long)tf->tf_global[3]);
+printf("%016llx %016llx %016llx %016llx\nouts:\n",
+    (unsigned long long)tf->tf_global[4],
+    (unsigned long long)tf->tf_global[5],
+    (unsigned long long)tf->tf_global[6],
+    (unsigned long long)tf->tf_global[7]);
+printf("%016llx %016llx %016llx %016llx\n",
+    (unsigned long long)tf->tf_out[0],
+    (unsigned long long)tf->tf_out[1],
+    (unsigned long long)tf->tf_out[2],
+    (unsigned long long)tf->tf_out[3]);
+printf("%016llx %016llx %016llx %016llx\n",
+    (unsigned long long)tf->tf_out[4],
+    (unsigned long long)tf->tf_out[5],
+    (unsigned long long)tf->tf_out[6],
+    (unsigned long long)tf->tf_out[7]);
+printf("CPUINFO_VA: %016llx\n", (unsigned long long)CPUINFO_VA);
+printf("self: %p\tcurlwp: %p\tcpcb: %p\tnext: %p\n",
+    ((struct cpu_info *)CPUINFO_VA)->ci_self,
+    ((struct cpu_info *)CPUINFO_VA)->ci_curlwp,
+    ((struct cpu_info *)CPUINFO_VA)->ci_cpcb,
+    ((struct cpu_info *)CPUINFO_VA)->ci_next);
 			{
 				char sbuf[sizeof(PSTATE_BITS) + 64];
-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)