Subject: port-i386/4026: kernel post-mortem on i386 can't find curproc's stack
To: None <gnats-bugs@gnats.netbsd.org>
From: John Kohl <jtk@kolvir.arlington-heights.ma.us>
List: netbsd-bugs
Date: 08/22/1997 22:27:53
>Number:         4026
>Category:       port-i386
>Synopsis:       kernel post-mortem on i386 can't find curproc's stack
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 22 19:35:02 1997
>Last-Modified:
>Originator:     John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release:        NetBSD-current, 1997/08/22
>Environment:
	
System: NetBSD pattern.arlington-heights.ma.us 1.2G NetBSD 1.2G (PATTERN) #34: Fri Jul 25 07:28:09 EDT 1997 jtk@pattern.arlington-heights.ma.us:/u4/sandbox/src/sys/arch/i386/compile/PATTERN i386


>Description:

On i386, gdb currently cannot produce a stack trace for curproc from a
kernel crash dump, although stack traces of other processes do work.
(See also PR #723.)

I dug into this, and here's what I found.

In /usr/src/gnu/usr.bin/gdb/gdb/arch/i386/i386b-nat.c:fetch_kcore_registers(),
the code assumes that the process for which it is fetching kernel
registers is not active, and thus its %eip,%ebx,%esi,%edi can be found
at the bottom of the stack where cpu_switch() would have left them.
It also assumes that pcb->pcb_tss.tss_esp and pcb->pcb_tss.tss_ebp are
the current stack and base pointers (which is true only if the process
is not currently running), so it uses them to find the saved registers.

However, if the process for which it's fetching kernel registers is
curproc, then the bottom of the stack does _not_ contain
%eip,%ebx,%esi,%edi, and pcb->pcb_tss.tss_esp is stale (it is the last
stack pointer restored to the process, which certainly isn't correct
anymore).
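
To illustrate the failure mode, here is a minimal, portable sketch in C.
It is not the gdb or kernel code: struct switch_frame, struct mock_pcb,
mock_stack and both functions are hypothetical stand-ins.  The only
things taken from the real code are the read at tss_esp+4 and the
edi/esi/ebx/eip order implied by how regs[0..3] are used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: the four words the gdb code reads at tss_esp+4,
 * in the order implied by its regs[0..3] usage. */
struct switch_frame {
	uint32_t edi, esi, ebx, eip;
};

/* Hypothetical, much-reduced pcb: only the saved stack pointer. */
struct mock_pcb {
	uint32_t saved_esp;	/* stands in for pcb_tss.tss_esp */
};

/* Simulated kernel stack; "addresses" are byte offsets into it. */
static uint32_t mock_stack[64];

/* Model a switch-out: leave the frame at esp+4 and record esp. */
static void
simulate_switch_out(struct mock_pcb *pcb, uint32_t esp,
    const struct switch_frame *f)
{
	assert(esp % 4 == 0 && esp + 4 <= sizeof mock_stack - sizeof *f);
	memcpy((char *)mock_stack + esp + 4, f, sizeof *f);
	pcb->saved_esp = esp;
}

/* Read the frame the way the existing gdb code does.  Returns 0 on
 * success, -1 if the saved stack pointer is out of range. */
static int
read_switch_frame(const struct mock_pcb *pcb, struct switch_frame *out)
{
	uint32_t addr = pcb->saved_esp + 4;

	if (addr % 4 != 0 || addr > sizeof mock_stack - sizeof *out)
		return -1;
	memcpy(out, (const char *)mock_stack + addr, sizeof *out);
	return 0;
}
```

For a switched-out process the read returns what was saved.  Once the
process runs again and reuses that part of its stack, the same read
still succeeds but returns whatever happens to be there, which is the
situation gdb is in for curproc.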

So, to fix this, two things need to happen:

(a) when dropping a crash dump, the i386 kernel needs to save the
current registers somewhere for gdb to find them.  Putting them into the
current TSS seems safe, as they'd get overwritten with valid current
copies in case of a task switch.

(Note: I spoke briefly with Charles Hannum about this problem, and he
opined (roughly--this is my memory speaking) that we really need to save
essentially all the current registers to provide a complete debugging
environment.  The patches below are sufficient for a stack trace but not
really quite right.  However, they are an immense step up from no stack
trace at all on panics!)

(b) gdb needs to be taught where to find these registers if it's
fetching for curproc.

That means that fetch_kcore_registers() needs an additional argument
telling it whether the registers to fetch are from curproc, and
kcorelow.c:get_kcore_registers() needs to pass that argument.
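
The two halves of the fix can be sketched together in portable C.  This
is only an illustration under stated assumptions: struct mock_tss,
struct mock_regs, save_for_dump() and pick_registers() are hypothetical
names, and the kernel side takes the live register values as an
argument instead of reading them with inline asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced TSS: only the slots the fix touches. */
struct mock_tss {
	uint32_t esp, ebp, ebx, esi, edi, eip;
};

/* Registers the debugger needs for a stack trace. */
struct mock_regs {
	uint32_t ebx, esp, ebp, esi, edi, eip;
};

/* Kernel side: at dump time, record the live registers in the TSS.
 * The caller passes the values in so the sketch stays portable. */
static void
save_for_dump(struct mock_tss *tss, const struct mock_regs *live)
{
	assert(tss != NULL && live != NULL);
	tss->ebx = live->ebx;
	tss->esp = live->esp;
	tss->ebp = live->ebp;
	tss->esi = live->esi;
	tss->edi = live->edi;
	tss->eip = live->eip;
}

/* Debugger side: frame[] holds the words found at tss->esp+4
 * (edi, esi, ebx, eip).  It is only meaningful for a process that
 * was switched out, so it is ignored when iscurproc is set. */
static void
pick_registers(const struct mock_tss *tss, const uint32_t frame[4],
    int iscurproc, struct mock_regs *out)
{
	out->esp = tss->esp;
	out->ebp = tss->ebp;
	if (iscurproc) {
		out->edi = tss->edi;
		out->esi = tss->esi;
		out->ebx = tss->ebx;
		out->eip = tss->eip;
	} else {
		out->edi = frame[0];
		out->esi = frame[1];
		out->ebx = frame[2];
		out->eip = frame[3];
	}
}
```

The flag only selects where ebx, esi, edi and eip come from; esp and
ebp are taken from the TSS in both cases, which is why the kernel side
must refresh them before dumping.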

[Other ports' fetch_kcore_registers will need updating to accept the new
argument and (possibly) ignore it.  I'm not happy with the diffs below,
but perhaps they'll inspire someone else to come up with a more sensible
interface?]

>How-To-Repeat:
	Take a kernel crash dump (shutdown -d now).
	Try to get a stack trace for curproc from the crash dump with
	gdb.  It fails.

>Fix:

Here's a very kludgy set of diffs (existence proof that this can work):

Index: machdep.c
===================================================================
RCS file: /u3/cvsroot/src/sys/arch/i386/i386/machdep.c,v
retrieving revision 1.126
diff -c -r1.126 machdep.c
*** machdep.c	1997/07/09 22:15:35	1.126
--- machdep.c	1997/07/10 00:59:45
***************
*** 1030,1037 ****
  	splhigh();
  
  	/* Do a dump if requested. */
! 	if ((howto & (RB_DUMP | RB_HALT)) == RB_DUMP)
  		dumpsys();
  
  haltsys:
  	doshutdownhooks();
--- 1030,1048 ----
  	splhigh();
  
  	/* Do a dump if requested. */
! 	if ((howto & (RB_DUMP | RB_HALT)) == RB_DUMP) {
! 	    /* store ebx, esp, ebp, esi, edi, eip in TSS for gdb */
! 	    if (curpcb) {
! 		__asm("movl %%ebx,%0" : "=m" (curpcb->pcb_tss.__tss_ebx));
! 		__asm("movl %%esp,%0" : "=m" (curpcb->pcb_tss.tss_esp));
! 		__asm("movl %%ebp,%0" : "=m" (curpcb->pcb_tss.tss_ebp));
! 		__asm("movl %%esi,%0" : "=m" (curpcb->pcb_tss.__tss_esi));
! 		__asm("movl %%edi,%0" : "=m" (curpcb->pcb_tss.__tss_edi));
! /*		__asm("movl %%eip,%0" : "=m" (curpcb->pcb_tss.__tss_eip));*/
! 		curpcb->pcb_tss.__tss_eip = (int)&cpu_reboot+4;
! 	    }
  		dumpsys();
+ 	}
  
  haltsys:
  	doshutdownhooks();
*** kcorelow.c	Mon May  6 07:18:54 1996
--- obj/kcorelow.c	Wed Jul  9 20:55:06 1997
***************
*** 252,258 ****
  	 * Zero out register set then fill in the ones we know about.
  	 */
  	clear_regs();
! 	fetch_kcore_registers (&pcb);
  }
  
  /* If mourn is being called in all the right places, this could be say
--- 252,258 ----
  	 * Zero out register set then fill in the ones we know about.
  	 */
  	clear_regs();
! 	fetch_kcore_registers (&pcb, cur_proc == curProc());
  }
  
  /* If mourn is being called in all the right places, this could be say
*** arch/i386/i386b-nat.c	Thu Mar  7 07:13:22 1996
--- obj/i386b-nat.c	Wed Jul  9 21:04:17 1997
***************
*** 334,365 ****
  #endif
  
  void
! fetch_kcore_registers(pcb)
  	struct pcb *pcb;
  {
  	int i, regno, regs[4];
  
          /*
!          * get the register values out of the sys pcb and
           * store them where `read_register' will find them.
           */
  	if (target_read_memory(pcb->pcb_tss.tss_esp+4, regs, sizeof regs))
  		error("Cannot read ebx, esi, and edi.");
! 	for (i = 0, regno = 0; regno < 3; regno++)
  		supply_register(regno, (char *)&i);
! 	supply_register(3, (char *)&regs[2]);
! 	supply_register(4, (char *)&pcb->pcb_tss.tss_esp);
! 	supply_register(5, (char *)&pcb->pcb_tss.tss_ebp);
! 	supply_register(6, (char *)&regs[1]);
! 	supply_register(7, (char *)&regs[0]);
! 	supply_register(8, (char *)&regs[3]);
! 	for (i = 0, regno = 9; regno < 10; regno++)
  		supply_register(regno, (char *)&i);
  #if 0
  	i = 0x08;
! 	supply_register(10, (char *)&i);
  	i = 0x10;
! 	supply_register(11, (char *)&i);
  #endif
  	/* XXX 80387 registers? */
  }
--- 334,391 ----
  #endif
  
  void
! fetch_kcore_registers(pcb, iscurproc)
  	struct pcb *pcb;
+ 	int iscurproc;
  {
  	int i, regno, regs[4];
  
+ 	/* struct reg:
+ 	int	r_eax;	0
+ 	int	r_ecx;	1
+ 	int	r_edx;	2
+ 	int	r_ebx;	3
+ 	int	r_esp;	4
+ 	int	r_ebp;	5
+ 	int	r_esi;	6
+ 	int	r_edi;	7
+ 	int	r_eip;	8
+ 	int	r_eflags;	9
+ 	int	r_cs;	10
+ 	int	r_ss;	11
+ 	int	r_ds;	12
+ 	int	r_es;	13
+ 	int	r_fs;	14
+ 	int	r_gs;	15
+ 	*/
          /*
!          * get the register values out of the stack (via the sys pcb), and
           * store them where `read_register' will find them.
           */
+ 	if (iscurproc) {
+ 	    regs[0] = pcb->pcb_tss.__tss_edi; /* they're not on the stack; cpu_reboot saved them here... */
+ 	    regs[1] = pcb->pcb_tss.__tss_esi;
+ 	    regs[2] = pcb->pcb_tss.__tss_ebx;
+ 	    regs[3] = pcb->pcb_tss.__tss_eip;
+ 	} else
  	if (target_read_memory(pcb->pcb_tss.tss_esp+4, regs, sizeof regs))
  		error("Cannot read ebx, esi, and edi.");
! 	for (i = 0, regno = 0; regno < 3; regno++) /* eax, ecx, edx = 0 */
  		supply_register(regno, (char *)&i);
! 	/* order on stack comes from locore.s:cpu_switch() */
! 	supply_register(3, (char *)&regs[2]); /* ebx */
! 	supply_register(4, (char *)&pcb->pcb_tss.tss_esp); /* esp */
! 	supply_register(5, (char *)&pcb->pcb_tss.tss_ebp); /* ebp */
! 	supply_register(6, (char *)&regs[1]); /* esi */
! 	supply_register(7, (char *)&regs[0]); /* edi */
! 	supply_register(8, (char *)&regs[3]); /* eip */
! 	for (i = 0, regno = 9; regno < 10; regno++) /* eflags = 0*/
  		supply_register(regno, (char *)&i);
  #if 0
  	i = 0x08;
! 	supply_register(10, (char *)&i); /* cs */
  	i = 0x10;
! 	supply_register(11, (char *)&i); /* ss */
  #endif
  	/* XXX 80387 registers? */
  }
>Audit-Trail:
>Unformatted: