NetBSD-Bugs archive


kern/44260: ddb stack trace from interrupt context is broken



>Number:         44260
>Category:       kern
>Synopsis:       ddb stack trace from interrupt context is broken
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 21 17:55:00 +0000 2010
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current >= 2009.03.07.22.02.17
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:

When entering ddb via an interrupt, such as when sending a break
sequence to a serial console, the ddb "trace" command only shows
the call stack as far back as the interrupt; it does not show the
functions that were executing when the interrupt occurred.

This used to work, albeit not very reliably.  Using automated binary
search, I have tracked down the regression to a set of commits by ad
on 2009.03.07.22.02.16 - 2009.03.07.22.02.17, with the commit message:

  Make ddb compile and work in userspace. Mostly this is comprised of three
  types of changes:

  - Add a few new methods to replace stuff like p_find(), CPU_INFO_FOREACH.

  - Use db_read_bytes() instead of accessing kernel structures directly,
    and similar changes.

  - Add ifdef _KERNEL where the above hasn't been done, and an XXX comment.

This is on i386 (emulated by qemu); other architectures may or may not
be affected.

Here are a couple of stack traces illustrating the problem:

  db{0}> trace
  breakpoint(0,c4091f3c,c04a6193,c42b3e3c,71,c0ed3000,c0ed4000,800,c42e67c6,c4091f1c) at netbsd:breakpoint+0x4
  comintr(c42b3d10,c4091f30,7,c0af0010,c3e80030,c0ad0010,c04a0010,c0a4e84c,c3e8ed20,c4062d0c) at netbsd:comintr+0x5a6
  Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xb5
  --- interrupt ---
  0
  db{0}> 

  db{0}> trace
  breakpoint(c0a4e800,dce,4d10ce23,c42b3e3c,71,c0ed3004,c0ed4000,800,c0afc2c6,c3e8ed20) at netbsd:breakpoint+0x4
  comintr(c42b3d10,c4062cc8,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5a6
  DDB lost frame for netbsd:Xintr_ioapic_edge7+0xb5, trying 0xc4091f74
  Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xb5
  fatal page fault in supervisor mode
  trap type 6 code 0 eip c024b860 cs 8 eflags 246 cr2 0 ilevel 8
  kernel: supervisor trap page fault, code=0
  Faulted in DDB; continuing...
  db{0}> 

For comparison, here is a stack trace from a working version:

  db{0}> trace
  breakpoint(0,3f8,5,c0499e8d,ca18c108,ca5aaa2c,71,c0f74010,c0f75000,800) at netbsd:breakpoint+0x4
  comintr(ca5aa910,ca17ecc8,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5b5
  DDB lost frame for netbsd:Xintr_legacy4+0xbb, trying 0xca5a0f74
  Xintr_legacy4() at netbsd:Xintr_legacy4+0xbb
  --- interrupt ---
  --- switch to interrupt stack ---
  x86_stihlt(ca189c80,0,c098d240,ca189c80,c047c1e0,ca189c80,0,c01002e1,ca189c80,0) at netbsd:x86_stihlt+0x5
  idle_loop(ca189c80,0,c01002cd,0,c01002cd,0,0,0,0,0) at netbsd:idle_loop+0xe6
  db{0}> 

With builds from sources from before 2009.03.07.22.02.16, the stack
trace successfully reaches either the "main" or the "idle_loop"
function approximately half the time.  Using sources from
2009.03.07.22.02.17 or later, I have never seen the stack trace
reach either of those functions.
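
The success criterion above (the trace reaching "main" or "idle_loop")
could be checked mechanically on captured console output. The following
is only a rough sketch, not the script actually used for the binary
search: the function name trace_reaches_root and the regular expression
are hypothetical, and it assumes each frame is on a single unwrapped
line ending in "at netbsd:<symbol>+0x<offset>".

```python
import re

# Functions that mark the bottom of a complete trace, per this report.
ROOT_FUNCTIONS = ("main", "idle_loop")

# Matches the "at netbsd:<symbol>+0x<offset>" suffix of a ddb frame line.
# Assumption: the frame line has not been wrapped by the terminal or mailer.
FRAME_RE = re.compile(r"at\s+netbsd:(\w+)\+0x[0-9a-f]+")


def trace_reaches_root(trace_text):
    """Return True if any frame in the ddb output is a root function."""
    symbols = FRAME_RE.findall(trace_text)
    return any(sym in ROOT_FUNCTIONS for sym in symbols)
```

A bisection driver could boot each build, send the break, capture the
output of "trace", and treat a False result as "bad". Since the older
builds only succeed about half the time, a build should be declared bad
only after several attempts.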

>How-To-Repeat:

pkg_add py-anita
anita --sets=kern-GENERIC,modules,base,etc interact http://nyftp.netbsd.org/pub/NetBSD-daily/HEAD/201012201100Z/i386/
[wait for a login prompt]
[press "control-a b" to send a break sequence to the virtual serial console]
trace

>Fix:


