NetBSD-Bugs archive


kern/44260: ddb stack trace from interrupt context is broken

>Number:         44260
>Category:       kern
>Synopsis:       ddb stack trace from interrupt context is broken
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 21 17:55:00 +0000 2010
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current >= 2009.
System: NetBSD
Architecture: i386
Machine: i386

When entering ddb via an interrupt, such as when sending a break
sequence to a serial console, the ddb "trace" command only shows
the call stack as far back as the interrupt; it does not show the
functions that were executing when the interrupt occurred.

This used to work, albeit not very reliably.  Using automated binary
search, I have tracked down the regression to a set of commits by ad
on 2009. - 2009., with the commit message:

  Make ddb compile and work in userspace. Mostly this is comprised of three
  types of changes:

  - Add a few new methods to replace stuff like p_find(), CPU_INFO_FOREACH.

  - Use db_read_bytes() instead of accessing kernel structures directly,
    and similar changes.

  - Add ifdef _KERNEL where the above hasn't been done, and an XXX comment.

This is on i386 (emulated by qemu); other architectures may or may not
be affected.

Here are a couple of stack traces illustrating the problem:

  db{0}> trace
  1c) at netbsd:breakpoint+0x4
  0,c4062d0c) at netbsd:comintr+0x5a6
  Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xb5
  --- interrupt ---

  db{0}> trace
  ed20) at netbsd:breakpoint+0x4
  comintr(c42b3d10,c4062cc8,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5a6
  DDB lost frame for netbsd:Xintr_ioapic_edge7+0xb5, trying 0xc4091f74
  Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xb5
  fatal page fault in supervisor mode
  trap type 6 code 0 eip c024b860 cs 8 eflags 246 cr2 0 ilevel 8
  kernel: supervisor trap page fault, code=0
  Faulted in DDB; continuing...

For comparison, here is a stack trace from a working version:

  db{0}> trace
  breakpoint(0,3f8,5,c0499e8d,ca18c108,ca5aaa2c,71,c0f74010,c0f75000,800) at 
  comintr(ca5aa910,ca17ecc8,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5b5
  DDB lost frame for netbsd:Xintr_legacy4+0xbb, trying 0xca5a0f74
  Xintr_legacy4() at netbsd:Xintr_legacy4+0xbb
  --- interrupt ---
  --- switch to interrupt stack ---
   at netbsd:x86_stihlt+0x5
  idle_loop(ca189c80,0,c01002cd,0,c01002cd,0,0,0,0,0) at netbsd:idle_loop+0xe6
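
The difference between the broken and working traces can be checked
mechanically.  The following is a minimal sketch in Python, not part of
the original report: it counts a trace as complete when it contains a
frame in "main" or "idle_loop".  The function name trace_reaches_bottom
and the regular expression are assumptions made for illustration; the
pattern is based only on the frame lines shown above.

```python
import re

# Functions whose presence means the trace got past the interrupt frame.
BOTTOM_FUNCTIONS = ("main", "idle_loop")

# Frame lines end in e.g. "at netbsd:idle_loop+0xe6".
FRAME_RE = re.compile(r"at netbsd:(\w+)\+0x[0-9a-f]+")


def trace_reaches_bottom(trace_text):
    """Return True if a ddb trace contains a frame in main or idle_loop."""
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) in BOTTOM_FUNCTIONS:
            return True
    return False


# Excerpts from the traces quoted in this report.
BROKEN = """\
0,c4062d0c) at netbsd:comintr+0x5a6
Xintr_ioapic_edge7() at netbsd:Xintr_ioapic_edge7+0xb5
--- interrupt ---
"""

WORKING = """\
Xintr_legacy4() at netbsd:Xintr_legacy4+0xbb
--- interrupt ---
--- switch to interrupt stack ---
 at netbsd:x86_stihlt+0x5
idle_loop(ca189c80,0,c01002cd,0,c01002cd,0,0,0,0,0) at netbsd:idle_loop+0xe6
"""
```

Matching on the "at netbsd:" suffix rather than on the start of the line
means that frames whose function name ddb failed to print, or printed
only partially, are still recognized.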

With builds from sources from before 2009., the stack
trace successfully reaches either the "main" or the "idle_loop"
function approximately half the time.  Using sources from
2009. or later, I have never seen the stack trace
reach either of those functions.
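
Because even a good build only produces a complete trace about half the
time, a binary search over builds has to repeat the test before calling
a build bad.  The sketch below illustrates that idea in Python; it is
not the tool actually used for this report.  The names is_good,
first_bad, run_trial and builds are hypothetical, and the code assumes
that every build before the regression is good and every later build is
bad.

```python
def is_good(build, run_trial, trials=8):
    """Treat a build as good if any one of `trials` attempts succeeds.

    If a good build succeeds about half the time and attempts are
    independent, the chance of wrongly calling it bad is about
    0.5 ** trials (roughly 0.4% for 8 trials).
    """
    return any(run_trial(build) for _ in range(trials))


def first_bad(builds, run_trial, trials=8):
    """Binary search for the first bad build.

    `builds` must be in chronological order, with builds[0] known good
    and builds[-1] known bad.  `run_trial(build)` is a caller-supplied
    function returning True when one trace from that build is complete.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(builds[mid], run_trial, trials):
            lo = mid
        else:
            hi = mid
    return builds[hi]
```

A bad build always costs the full number of trials, while a good build
usually stops after one or two, so raising `trials` mainly slows down
the probes that land after the regression.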


>How-To-Repeat:
pkg_add py-anita
anita --sets=kern-GENERIC,modules,base,etc interact
[wait for a login prompt]
[press "control-a b" to send a break sequence to the virtual serial console]

