Subject: toolchain/24098: GDB can't backtrace through Alpha kernel traps, despite intent
To: None <>
From: None <>
List: netbsd-bugs
Date: 01/14/2004 19:47:40
>Number:         24098
>Category:       toolchain
>Synopsis:       GDB can't backtrace through Alpha kernel traps, despite intent
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    toolchain-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jan 15 00:48:00 UTC 2004
>Originator:     Nathan J. Williams
>Release:        NetBSD 1.6ZH
	Massachvsetts Institvte of Technology
System: NetBSD samsung-means-to-come 1.6ZH NetBSD 1.6ZH (SAMSUNG) #57: Tue Jan 13 17:55:47 EST 2004 alpha
Architecture: alpha
Machine: alpha

Examining a core dump from a panic (see PR 24097), I noticed that the
GDB backtrace did not go back as far as the DDB backtrace. DDB shows
the full call chain back to user mode:

cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
__assert() at netbsd:__assert+0x34
uvm_anfree() at netbsd:uvm_anfree+0xf8
amap_unref() at netbsd:amap_unref+0x318
uvmspace_free() at netbsd:uvmspace_free+0x268
uvm_proc_exit() at netbsd:uvm_proc_exit+0x1c
exit1() at netbsd:exit1+0x3b4
sys_exit() at netbsd:sys_exit+0x44
syscall_plain() at netbsd:syscall_plain+0xa8
XentSys() at netbsd:XentSys+0x5c
--- syscall (1) ---
--- user mode ---

GDB, on the core dump from the same panic, stops at panic():

#0  0xfffffc00005302d8 in dumpsys ()
    at ../../../../arch/alpha/alpha/machdep.c:1244
#1  0xfffffc0000530118 in cpu_reboot (howto=0, bootstr=0x0)
    at ../../../../arch/alpha/alpha/machdep.c:1065
#2  0xfffffc000040b1b4 in db_fncall (addr=0, have_addr=0, count=0, modif=0x0)
    at ../../../../ddb/db_command.c:666
#3  0xfffffc000040addc in db_command (last_cmdp=0xfffffc00005d6c70, 
    cmd_table=0xfffffc00005d68a0) at ../../../../ddb/db_command.c:464
#4  0xfffffc000040ac9c in db_command_loop ()
    at ../../../../ddb/db_command.c:255
#5  0xfffffc000040fb7c in db_trap (type=0, code=0)
    at ../../../../ddb/db_trap.c:101
#6  0xfffffc000053f984 in ddb_trap (a0=1, a1=0, a2=0, entry=3, regs=0x0)
    at ../../../../arch/alpha/alpha/db_interface.c:211
#7  0xfffffc0000300194 in inc6 () at ../../../../arch/alpha/alpha/debug.s:50
#8  0xfffffc0000537ab8 in trap (a0=1, a1=18446740783697495032, 
    a2=18446739675669162144, entry=3, framep=0xfffffe0009333b60)
    at ../../../../arch/alpha/alpha/trap.c:329
#9  0xfffffc00003003a4 in XentIF ()
    at ../../../../arch/alpha/alpha/locore.s:461
#10 0xfffffc0000466048 in panic (fmt=0x1 <Address 0x1 out of bounds>)
    at ../../../../kern/subr_prf.c:226
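
To make the gap between the two traces concrete, here is a rough Python
sketch that pulls the function names out of both and lists the DDB
frames GDB never reaches. The regular expressions are assumptions that
fit only the two trace formats quoted above, and the GDB trace is
trimmed to the first line of each frame.

```python
import re

DDB_TRACE = """\
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
__assert() at netbsd:__assert+0x34
uvm_anfree() at netbsd:uvm_anfree+0xf8
amap_unref() at netbsd:amap_unref+0x318
uvmspace_free() at netbsd:uvmspace_free+0x268
uvm_proc_exit() at netbsd:uvm_proc_exit+0x1c
exit1() at netbsd:exit1+0x3b4
sys_exit() at netbsd:sys_exit+0x44
syscall_plain() at netbsd:syscall_plain+0xa8
XentSys() at netbsd:XentSys+0x5c
--- syscall (1) ---
--- user mode ---
"""

# First line of each GDB frame only; the "at file:line" continuations are dropped.
GDB_TRACE = """\
#0  0xfffffc00005302d8 in dumpsys ()
#1  0xfffffc0000530118 in cpu_reboot (howto=0, bootstr=0x0)
#2  0xfffffc000040b1b4 in db_fncall (addr=0, have_addr=0, count=0, modif=0x0)
#3  0xfffffc000040addc in db_command (last_cmdp=0xfffffc00005d6c70,
#4  0xfffffc000040ac9c in db_command_loop ()
#5  0xfffffc000040fb7c in db_trap (type=0, code=0)
#6  0xfffffc000053f984 in ddb_trap (a0=1, a1=0, a2=0, entry=3, regs=0x0)
#7  0xfffffc0000300194 in inc6 () at ../../../../arch/alpha/alpha/debug.s:50
#8  0xfffffc0000537ab8 in trap (a0=1, a1=18446740783697495032,
#9  0xfffffc00003003a4 in XentIF ()
#10 0xfffffc0000466048 in panic (fmt=0x1 <Address 0x1 out of bounds>)
"""

def ddb_functions(text):
    """Function names from DDB lines of the form 'name() at netbsd:name+0x...'."""
    return [m.group(1) for m in re.finditer(r"^(\w+)\(\) at ", text, re.M)]

def gdb_functions(text):
    """Function names from GDB lines of the form '#N  0x... in name (...'."""
    return [m.group(1)
            for m in re.finditer(r"^#\d+\s+0x[0-9a-f]+ in (\w+) \(", text, re.M)]

ddb_funcs = ddb_functions(DDB_TRACE)
gdb_funcs = gdb_functions(GDB_TRACE)

# The outermost frame GDB reached, and everything DDB shows beyond it.
outermost = gdb_funcs[-1]
missing = ddb_funcs[ddb_funcs.index(outermost) + 1:]

print(outermost)      # panic
print(len(missing))   # 9
print(missing)
```

The nine missing frames run from __assert() down to XentSys(), which is
everything on the far side of the XentIF trap frame.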

XentIF is a PAL entry point, and alpha/include/asm.h goes out of its
way to try to make backtraces through such entry points work:

 *	Declare a palcode transfer point, and carefully construct
 *	gdb symbols with an unusual _negative_ register-save offset
 *	so that gdb can find the otherwise lost PC and then
 *	invert the vector for traceback. Also, fix up framesize,
 *	allowing for the palframe for the same reason.

... but it doesn't seem to work.

Further investigation shows that GDB thinks the size of the XentIF
frame (frame 9) is 216 bytes, the second difference computed below:

(gdb) info frame
Stack level 9, frame at 0xfffffe0009333c38:
 pc = 0xfffffc00003003a4 in XentIF
    (../../../../arch/alpha/alpha/locore.s:461); saved pc 0xfffffc0000466048
 called by frame at 0xfffffe0009333cc8, caller of frame at 0xfffffe0009333b60
 source language asm.
 Arglist at 0xfffffe0009333c08, args: 
 Locals at 0xfffffe0009333c38, Previous frame's sp in sp
 Saved registers:
  ra at 0xfffffe0009333c18, at at 0xfffffe0009333c28, pc at 0xfffffe0009333c18

(gdb) p 0xfffffe0009333cc8 - 0xfffffe0009333c38
$7 = 144
(gdb) p 0xfffffe0009333c38 - 0xfffffe0009333b60
$8 = 216

even though asm.h sets up a directive that should tell the debugger
the frame is larger (to account for the six quadwords pushed on the
stack by the PALcode entry sequence):

	.frame	$30,(FRAME_SW_SIZE+6)*8,$26,0;   /* give gdb the real size */\
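
The arithmetic can be spelled out in a few lines of Python. This is
only a sketch: the addresses are copied from the "info frame" output
above, but the value of FRAME_SW_SIZE is inferred here from the
216-byte figure rather than read from the header, so treat it as an
assumption.

```python
# Addresses copied from the "info frame" output for frame 9 (XentIF).
TRAP_FRAMEP = 0xfffffe0009333b60    # "caller of frame at", also trap()'s framep
XENTIF_FRAME = 0xfffffe0009333c38   # "frame at", as GDB computed it
SAVED_RA = 0xfffffe0009333c18       # "ra at"
SAVED_AT = 0xfffffe0009333c28       # "at at"

QUAD = 8        # bytes per Alpha quadword
PAL_QUADS = 6   # quadwords pushed by the PALcode entry sequence

# Frame size GDB is actually using.
gdb_size = XENTIF_FRAME - TRAP_FRAMEP

# Assumption: GDB's size is exactly FRAME_SW_SIZE quadwords, i.e. it
# missed only the "+6" for the PAL frame.
frame_sw_size = gdb_size // QUAD

# Size the .frame directive asks for: (FRAME_SW_SIZE+6)*8.
intended_size = (frame_sw_size + PAL_QUADS) * QUAD

# Where the frame would end if the intended size had been used.
intended_frame = TRAP_FRAMEP + intended_size

# Quadword slots of the saved registers relative to framep.
ra_slot = (SAVED_RA - TRAP_FRAMEP) // QUAD
at_slot = (SAVED_AT - TRAP_FRAMEP) // QUAD

print(gdb_size)              # 216
print(frame_sw_size)         # 27
print(intended_size)         # 264
print(hex(intended_frame))   # 0xfffffe0009333c68
print(ra_slot, at_slot)      # 23 25
```

Under that assumption GDB's frame comes up 48 bytes short of the
intended 264, which is exactly the six PAL quadwords.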

A little more digging turns up the fact that gcc is generating
DWARF-2 debug info for most of the kernel, but not for locore.o; the
gas info page confirms that .frame directives are not translated into
DWARF-2 info at present. I hypothesize that GDB latched on to the
DWARF-2 debug info and didn't or couldn't use the .mdebug info from
locore.o at the same time.
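
One way to check which debug format each object carries is to look at
its section names (for example as listed by "objdump -h"). The helper
below is a minimal sketch of that classification. The two section
lists are made up for illustration and are not taken from the actual
kernel build; only the conventional section names (.debug_*, .mdebug,
.stab, .stabstr) are relied on.

```python
def debug_formats(section_names):
    """Return the set of debug formats suggested by a list of section names."""
    formats = set()
    for name in section_names:
        if name.startswith(".debug_"):
            formats.add("dwarf2")      # .debug_info, .debug_line, ...
        elif name == ".mdebug":
            formats.add("mdebug")      # ECOFF-style info, where .frame ends up
        elif name in (".stab", ".stabstr"):
            formats.add("stabs")
    return formats

# Hypothetical section lists, for illustration only.
c_object_sections = [".text", ".data", ".debug_abbrev", ".debug_info", ".debug_line"]
locore_sections = [".text", ".data", ".mdebug"]

print(debug_formats(c_object_sections))   # {'dwarf2'}
print(debug_formats(locore_sections))     # {'mdebug'}
```

A kernel linked from both kinds of object would then carry both
formats, which is the mix the hypothesis above is about.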

(However, a kernel compiled with -gstabs+ was even worse, and could
only show frame #0.)


Write (or pull in?) DWARF-2 support into the alpha bits of gas.

-gstabs+ kernel: no idea.