Subject: Deciphering a TLB error
To: None <port-sgimips@netbsd.org>
From: sgimips NetBSD list <sgimips@mrynet.com>
List: port-sgimips
Date: 09/09/2003 04:09:17
So, it seems I have a 100% reproducible TLB miss.

     Status: Running
    Command: progress -zf /mnt2//binary/sets/comp.tgz tar -xepf -

--------------------------------------------------------------------------------
 69% |*************************            | 53003 KB    1.10 MB/s    00:20 ETA
trap: TLB miss (load or instr. fetch) in kernel mode
status=0xff03, cause=0x8, epc=0x88234070, vaddr=0x0
pid=6 cmd=ioflush usp=0x0 ksp=0xce901b70
Stopped in pid 6.1 (ioflush) at 0x88234070:     lw      v1,32(s1)
db> trace
88233d80+2f0 (0,8873a8c0,1,1) ra 1 sz 256
PC 0x1: not in kernel space
0+1 (0,8873a8c0,1,1) ra 0 sz 0
User-level: pid 6.1
db> ps
 PID           PPID     PGRP        UID S   FLAGS LWPS          COMMAND    WAIT
 15               8       15          0 2  0x4002    1          sysinst
 8                1        8          0 2  0x4002    1               sh    wait
 7                0        0          0 2 0x20200    1         aiodoned
>6                0        0          0 2 0x20200    1          ioflush
 5                0        0          0 2 0x20200    1           reaper  anfree
 4                0        0          0 2 0x20200    1       pagedaemon pgdaemo
 3                0        0          0 2 0x20200    1       lfs_writer lfswrit
 2                0        0          0 2 0x20200    1         scsibus0  sccomp
 1                0        1          0 2  0x4000    1             init    wait
 0               -1        0          0 2 0x20200    1          swapper schedul
 60              15       60          0 6  0x6002    1         progress
 73               1       60          0 6  0x6002    1             gzip
 74               1       60          0 6  0x6002    0              tar       *
db> 

This is on my Challenge S R5K machine.
I've replaced the memory, all the SCSI components, and the power supply,
and have reseated everything possible.

Finally, I swapped in the CPU from an Indy R5K, and voila...
The machine works.

So, how would I go about reading the TLB miss info? Does it
tell me anything useful that would have saved me some time
diagnosing this machine's problem?
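
As a starting point, here is a minimal sketch that pulls apart the
status, cause, epc and vaddr values from the trap line above. It assumes
the standard MIPS III (R4000-family) CP0 register layout and the 32-bit
kernel address map; the function names are made up for illustration and
are not taken from the NetBSD sources.

```c
#include <stdint.h>
#include <stdio.h>

/* Cause register: ExcCode lives in bits 6..2 (assumed MIPS III layout). */
unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1f; }

/* Status register fields (assumed MIPS III layout). */
unsigned status_ie(uint32_t s)  { return s & 0x1; }          /* interrupts enabled */
unsigned status_exl(uint32_t s) { return (s >> 1) & 0x1; }   /* exception level    */
unsigned status_erl(uint32_t s) { return (s >> 2) & 0x1; }   /* error level        */
unsigned status_ksu(uint32_t s) { return (s >> 3) & 0x3; }   /* 0=kernel 1=super 2=user */
unsigned status_im(uint32_t s)  { return (s >> 8) & 0xff; }  /* interrupt mask     */

/* Names for the first few exception codes; the rest are lumped together. */
const char *exccode_name(unsigned code)
{
    switch (code) {
    case 0: return "Int";
    case 1: return "Mod";
    case 2: return "TLBL";   /* TLB miss on load or instruction fetch */
    case 3: return "TLBS";   /* TLB miss on store */
    case 4: return "AdEL";
    case 5: return "AdES";
    default: return "other";
    }
}

/* Which 32-bit segment a virtual address falls in. */
const char *segment_name(uint32_t va)
{
    if (va < 0x80000000u) return "kuseg (mapped)";
    if (va < 0xa0000000u) return "kseg0 (unmapped, cached)";
    if (va < 0xc0000000u) return "kseg1 (unmapped, uncached)";
    return "kseg2/3 (mapped)";
}

/* Print a human-readable breakdown of one trap line. */
void print_trap(uint32_t status, uint32_t cause, uint32_t epc, uint32_t vaddr)
{
    unsigned code = cause_exccode(cause);

    printf("cause:  ExcCode=%u (%s)\n", code, exccode_name(code));
    printf("status: IE=%u EXL=%u ERL=%u KSU=%u IM=0x%02x\n",
           status_ie(status), status_exl(status), status_erl(status),
           status_ksu(status), status_im(status));
    printf("epc:    0x%08x in %s\n", (unsigned)epc, segment_name(epc));
    printf("vaddr:  0x%08x in %s\n", (unsigned)vaddr, segment_name(vaddr));
}
```

Calling print_trap(0xff03, 0x8, 0x88234070, 0x0) from a small main()
gives the breakdown for the dump above: cause 0x8 is ExcCode 2 (TLBL),
which matches the "load or instr. fetch" text; status 0xff03 has IE and
EXL set with KSU=0, i.e. kernel mode; epc is in kseg0; and the faulting
address 0x0 falls in the mapped low segment.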

Thanks and stuff,
-scott