Subject: Re: Looking at kernel crash dumps with "gdb"
To: Matthias Scheler <tron@zhadum.de>
From: Eduardo Horvath <eeh@NetBSD.org>
List: port-sparc64
Date: 03/23/2005 00:51:13
On Wed, Mar 23, 2005 at 12:02:55AM +0000, Matthias Scheler wrote:
> 	Hello,
> 
> I'm trying to analyze a kernel crash dump from a NetBSD/sparc64 system
> with "gdb":
> 
> tron@sheridan:/src/sys/compile/SHERIDAN>gdb netbsd.gdb 
> GNU gdb 5.3nb1
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "sparc64--netbsdelf"...target kcore 
> (gdb) target kcore /export/scratch/tron/netbsd.0.core
> #0  0xffffffffffffae98 in ?? ()
> (gdb) where
> #0  0xffffffffffffae98 in ?? ()
> #1  0x000000000119aaa4 in bpendtsleep () at /usr/src/sys/kern/kern_synch.c:495
> #2  0x000000000121c64c in uvm_scheduler () at /usr/src/sys/uvm/uvm_glue.c:521
> #3  0x0000000001176848 in main () at /usr/src/sys/kern/init_main.c:600
> #4  0x0000000001009828 in cpu_initialize ()
> can not access 0x800e5c, invalid address (800e5c)
> can not access 0x800e5c, invalid address (800e5c)
> can not access 0x800e5c, invalid address (800e5c)
> can not access 0x800e5c, invalid address (800e5c)
> #5  0x0000000000800e5c in ?? ()
> can not access 0x801008, invalid address (801008)
> can not access 0x801008, invalid address (801008)
> can not access 0x801008, invalid address (801008)
> can not access 0x801008, invalid address (801008)
> #6  0x0000000000801008 in ?? ()
> can not access 0x802018, invalid address (802018)
> can not access 0x802018, invalid address (802018)
> can not access 0x802018, invalid address (802018)
> can not access 0x802018, invalid address (802018)
> #7  0x0000000000802018 in ?? ()
> can not access 0x800054, invalid address (800054)
> can not access 0x800054, invalid address (800054)
> can not access 0x800054, invalid address (800054)
> can not access 0x800054, invalid address (800054)
> #8  0x0000000000800054 in ?? ()
> can not access 0x8, invalid address (8)
> can not access 0x8, invalid address (8)
> can not access 0x8, invalid address (8)
> can not access 0x8, invalid address (8)
> 
> That stack trace is quite bogus because the kernel crashed here:
> 
> crdata fault: pc=1064bd8 addr=408d8000
> kernel trap 30: data access exception
> Stopped in pid 810.1 (ifconfig) at      netbsd:in6ifa_ifpforlinklocal+0xc:
> [I'm working on PR kern/21189. So I really know where it crashes.]
> 
> Is there any trick to get a decent stack trace?

You may want to verify first that kvm_sparc64 is still able to
parse the pmap page tables.  (Funny thing is I don't remember
writing that code in the first place.  But it looks old compared
to the current pmap.c.)

The gdb sparc64 stack trace code was... interesting.  I remember
having to do some nasty massaging.  Check to see if it works
on userland core dumps.  If it does, you should be able to set
the stack pointer (%o6 or %i6) to the value you get from the
panic message and try the "where" command again.

Beyond that, SPARC stack traces are simple to extract by hand.
The stack pointer points at a register save area with 16 slots:
%l0-%l7 followed by %i0-%i7.  Each slot is either 4 or 8 bytes.
The last slot (%i7, i.e. the caller's %o7) is the link register:
it holds the address of the call instruction, so the return
address is that value plus 8.  The second to last slot (%i6,
i.e. the caller's %o6) is the caller's stack pointer, which
leads you to the next frame.  If the stack pointer is even, you
look at 32-bit registers.  If the stack pointer is odd, add the
stack bias 0x7ff to it and look at 64-bit registers.
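
For what it's worth, the hand walk can be sketched in a few lines
of Python.  This is only an illustration, not anything from gdb or
libkvm: walk_frames and the read_mem callback are made-up names,
and read_mem(addr, nbytes) stands for whatever gives you bytes
from the dump at a kernel virtual address.  It assumes the usual
SPARC ABI layout, a 16-register window save area (%l0-%l7,
%i0-%i7) at the stack pointer, with the 0x7ff stack bias added to
odd (64-bit) stack pointers.

```python
import struct

STACK_BIAS = 0x7ff  # SPARC V9 stack bias (2047), assumed per the 64-bit ABI


def walk_frames(read_mem, sp, max_frames=64):
    """Follow saved frame pointers starting at stack pointer `sp`.

    read_mem(addr, nbytes) is a hypothetical callback returning raw
    big-endian bytes from the dump.  Returns a list of
    (save_area_address, saved_i7) pairs, innermost frame first.
    """
    frames = []
    for _ in range(max_frames):
        if sp == 0:
            break  # a zero frame pointer ends the chain
        if sp & 1:
            # Odd stack pointer: 64-bit frame, biased by 0x7ff.
            base, fmt, size = sp + STACK_BIAS, ">16Q", 8
        else:
            # Even stack pointer: 32-bit frame, no bias.
            base, fmt, size = sp, ">16L", 4
        regs = struct.unpack(fmt, read_mem(base, 16 * size))
        saved_fp = regs[14]  # %i6: the caller's stack pointer
        saved_pc = regs[15]  # %i7: address of the call instruction
        frames.append((base, saved_pc))
        sp = saved_fp
    return frames
```

Feed it the stack pointer from the panic message and look up each
saved %i7 in the symbol table.  The max_frames limit is only there
so a corrupted chain cannot loop forever.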

Eduardo