Subject: Re: suddenly my sparcs are crashing left, right, and centre!
To: NetBSD/sparc Discussion List <port-sparc@netbsd.org>
From: Eduardo Horvath <eeh@turbolinux.com>
List: port-sparc
Date: 04/25/2000 15:57:08

On Tue, 25 Apr 2000, Greg A. Woods wrote:

> [ On Tuesday, April 25, 2000 at 11:58:01 (-0700), Eduardo Horvath wrote: ]
> > Subject: Re: suddenly my sparcs are crashing left, right, and centre!
> >
> > We really need symbols.  You can generate them by running gdb on the
> > kernel image and using the `disassem <addr>' command on each of the above
> > pc values, or using objdump to generate a disassembly listing of the
> > kernel.
> 
> I figured as much....
> 
> BTW, there hasn't been a crash since I backed out my most recent version
> of smail -- it's almost certainly the cause...
> 
> Here's some more info:
> 
> trap type 0x7: pc=0xf010e450 npc=0xf010e454 psr=118000c0<S,PS>
> panic: alignment fault
> syncing disks... 6 6 3 done
> Frame pointer is at 0xf1a55bb0
> Call traceback:
>   pc = 0xf010c480 <cpu_reboot+196>:            call  0xf010c6d4 <dumpsys>
> 	  args = (0x0, 0x11000fe5, 0xf013e000, 0xf1a55cd0, 0xf1a55c60, 0x0, 0xf1a55c18) fp = 0xf1a55c18
>   pc = 0xf0034354 <panic+80>:                  call  0xf010c3bc <cpu_reboot>
> 	  args = (0x100, 0x0, 0x1, 0xf1a55d40, 0xf1a55cc8, 0x0, 0xf1a55c80) fp = 0xf1a55c80
>   pc = 0xf0112dd4 <trap+200>:                  call  0xf0034304 <panic>
> 	  args = (0xf0112b70, 0x100, 0x1, 0xf010e454, 0xf1a55d48, 0xf014a000, 0xf1a55ce8) fp = 0xf1a55ce8
>   pc = 0xf00064ec <slowtrap+292>:              call  0xf0112d0c <trap>
> 	  args = (0x7, 0x118000c0, 0xf010e450, 0xf1a55df0, 0xef, 0xf0142800, 0xf1a55d90) fp = 0xf1a55d90
>   pc = 0xf00df808 <vm_map_lookup_done+60>:     call  0xf0028500 <lockmgr>
> 	  args = (0x8, 0x900ea007, 0x0, 0x0, 0xf01c49f0, 0x80003f1b, 0xf1a55e40) fp = 0xf1a55e40
>   pc = 0xf0113778 <mem_access_fault+384>:      call  0xf010e3d8 <mmu_pagein>
> 	  args = (0xf08da180, 0x63000, 0x0, 0x62ffc, 0x0, 0xf1a55e94, 0xf1a55ea8) fp = 0xf1a55ea8
>   pc = 0xf000636c <normal_mem_fault+40>:       call  0xf01135f8 <mem_access_fault>
> 	  args = (0xf08e2900, 0x8080, 0x63000, 0x11ff8, 0x11400083, 0xf1a55fb0, 0xf1a55f50) fp = 0xf1a55f50
>   pc = 0x11fd0  args = (0x46, 0x62010, 0x11f64, 0x11f64, 0x11000084, 0xf1a55fb0, 0xeffff4d8) fp = 0xeffff4d8
> rebooting
> 
> (gdb says "No function contains specified address" for the last one....)

(The last address [0x11fd0] is a userland address.)

It looks like you had a double-fault panic here, which would also explain
the lack of a core dump.  Your first kernel trap seems to be in either
vm_map_lookup_done() or lockmgr(); I can't tell which from this trace.
To narrow it down I would need to see the trapframe (the locals in
addition to the ins).

Then, inside cpu_reboot() or dumpsys(), you got another trap.  The PC is
0xf010e450, which is probably inside dumpsys(), which starts at
0xf010c6d4.  If you disassembled all of dumpsys() we might be able to
figure out the panic in there.  That won't get you to the root cause,
but it might get you a crashdump.
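
As a sanity check on which function the trap PC falls in, here is a
minimal sketch that computes the offset of 0xf010e450 from each function
entry point that happens to appear as a call target in the traceback
above.  The names KNOWN_SYMBOLS, TRAP_PC and offsets_from are made up for
this illustration, and the table is only the handful of symbols visible
in the trace, not the kernel's real symbol table, so the closest entry
here is a hint rather than proof.

```python
# Function entry points copied from the "call 0x... <name>" targets in the
# quoted traceback. ASSUMPTION: this is a partial table, not the full
# kernel symbol table, so other functions may lie between these addresses.
KNOWN_SYMBOLS = {
    "lockmgr": 0xF0028500,
    "panic": 0xF0034304,
    "cpu_reboot": 0xF010C3BC,
    "dumpsys": 0xF010C6D4,
    "mmu_pagein": 0xF010E3D8,
    "trap": 0xF0112D0C,
    "mem_access_fault": 0xF01135F8,
}

TRAP_PC = 0xF010E450  # pc from the "trap type 0x7" line


def offsets_from(pc, symbols):
    """Return (name, offset) for every symbol starting at or below pc,
    sorted so the closest preceding symbol comes first."""
    below = [(name, pc - start) for name, start in symbols.items() if start <= pc]
    return sorted(below, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, off in offsets_from(TRAP_PC, KNOWN_SYMBOLS):
        print(f"{name}+0x{off:x}")
```

With only these entries, the closest preceding symbol comes out as
mmu_pagein+0x78, while the offset from dumpsys is 0x1d7c (roughly 7.5 KB
into it).  A sorted symbol listing of the actual kernel image would
settle which function really contains the PC.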

I don't know why you're not dropping into DDB on the panic.  You might
want to try breaking into DDB from the console to confirm it's enabled.

Eduardo Horvath