Subject: Re: kern/10228: ktrace -c panics and freezes the system
To: Andreas Wrede <andreas@planix.com>
From: John Hawkinson <jhawk@MIT.EDU>
List: netbsd-bugs
Date: 05/29/2000 18:02:29
>Sorry, they are typos. The same couple of lines repeat over and over
>again.

"oh, ok" ;-)

>> There appear to be three independent problems:
>> 
>> a)    your original page fault failure; no clue on this. New ktrace code
>>       from Bill Sommerfeld?
>
>I don't know what the original trap is - it scrolled off the
>screen. If need be, I can set up with serial console and repeat.

"Fixed in the next release":
sys/kern/kern_ktrace.c:
----------------------------
revision 1.43
date: 2000/05/28 15:27:51;  author: sommerfeld;  state: Exp;  lines: +5 -4
Deal with NULL file pointer for KTROP_CLEAR
----------------------------

>> b)    Recursive traceback printing. This is my fault, and I'll scratch
>>       my head and decide how best to deal with it. Perhaps we should
>>       not try to print tracebacks recursively.
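
One way to avoid the recursion would be a simple reentrancy guard
around the traceback printer. The following is only a minimal
userland sketch of that idea, not the NetBSD code: every name in it
(in_traceback, walk_stack, print_traceback_once, and so on) is
hypothetical, and the "nested fault" is simulated by having the stack
walker call back into the printer.

```c
#include <stdio.h>

/* All names here are hypothetical; this is not the kernel's code. */
static int in_traceback = 0;          /* nonzero while a traceback is printing */
static int tracebacks_printed = 0;    /* completed tracebacks, for illustration */
static int simulate_nested_fault = 0; /* make the stack walker "fault" */
static int nested_result = -1;        /* what the nested attempt returned */

static int print_traceback_once(void);

/* Stand-in for the stack walker. A real one could itself trap, which
 * would re-enter the panic path; that re-entry is simulated here. */
static void walk_stack(void)
{
    printf("_trap() at _trap+0x1e5\n");
    if (simulate_nested_fault)
        nested_result = print_traceback_once();
}

/* Print a traceback unless one is already in progress.
 * Returns 1 if a traceback was printed, 0 if it was skipped. */
static int print_traceback_once(void)
{
    if (in_traceback) {
        printf("traceback already in progress, skipping\n");
        return 0;
    }
    in_traceback = 1;
    printf("Begin traceback...\n");
    walk_stack();
    printf("End traceback...\n");
    tracebacks_printed++;
    in_traceback = 0;
    return 1;
}
```

With simulate_nested_fault set, the outer call prints one traceback
and the nested call returns 0 instead of starting another one. A real
kernel version would also have to decide what to do after skipping
(halt, reboot, or drop into the debugger), which this sketch ignores.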

>> c)    System freezes after 5 traps. No clue here.
>
>Could be more - that's how many fit on the 80x24 screen.

Ah, got it. I take it your system hangs hard and requires human
intervention after this event? Oddly enough, mine reboots
afterwards, preserving the message buffer. I start with:

  uvm_fault(0xcd62c170, 0x0, 0, 3) -> 1
  fatal page fault in supervisor mode
  trap type 6 code 2 eip c01abfdb cs 8 eflags 10297 cr2 8 cpl 0
  panic: trap
  Begin traceback...
  _trap() at _trap+0x1e5
  --- trap (number 0) ---
  uvm_fault(0xcd62c170, 0x0, 0, 3) -> 1
  fatal page fault in supervisor mode
  trap type 6 code 2 eip c01be1e1 cs 8 eflags 10282 cr2 0 cpl e000ffef
  panic: trap
  Begin traceback...
  _trap() at _trap+0x1e5
  --- trap (number 0) ---
...
  fatal bounds check fault in supervisor mode
  trap type 11 code 0 eip cd647181 cs 8 eflags 206 cr2 0 cpl e000ffef
  panic: trap
  Begin traceback...
  _trap() at _trap+0x1e5
  --- trap (number 0) ---
  fatal bounds check fault in supervisor mode
  trap type 11 code 0 eip cd647181 cs 8 eflags 202 cr2 0 cpl e000ffef
  panic: trap
  Begin traceback...
  _trap() at _trap+0x1e5
  --- trap (number 0) ---
  fatal bounds check fault in supervisor mode
  trap type 11 code 0 eip cd647181 cs 8 eflags 202 cr2 0 cpl e000ffef
  panic: trap
  Begin traceback...
  _trap() at _trap+0x1e5
  --- trap (number 0) ---
  
Vaguely interesting that we go from
"fatal page fault" to "fatal bounds check fault".
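
For anyone reading the trap lines above: on i386 the "code" value of a
page fault is the hardware error code, whose low three bits say whether
the page was present, whether the access was a write, and whether it
came from user mode. The sketch below decodes those bits; the macro and
function names are illustrative choices, not taken from this thread.
For "code 2" it reports a supervisor-mode write to a page that was not
present. Since cr2 holds the faulting address, "cr2 8" would be
consistent with an access through a NULL pointer, though that is only
a guess from the trace.

```c
#include <stdio.h>
#include <string.h>

/* Low three bits of the i386 page-fault error code. */
#define PGEX_P 0x01 /* set: protection violation; clear: page not present */
#define PGEX_W 0x02 /* set: write access; clear: read access */
#define PGEX_U 0x04 /* set: fault came from user mode; clear: supervisor */

/* Describe a page-fault error code in words, writing into buf. */
static void describe_pagefault(unsigned code, char *buf, size_t len)
{
    snprintf(buf, len, "%s, %s, %s mode",
        (code & PGEX_W) ? "write" : "read",
        (code & PGEX_P) ? "protection violation" : "page not present",
        (code & PGEX_U) ? "user" : "supervisor");
}
```

Calling describe_pagefault(2, buf, sizeof(buf)) fills buf with
"write, page not present, supervisor mode", which matches the
"fatal page fault in supervisor mode" message in the log.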

Someone else should comment here as to whether this is something worth
addressing...

--jhawk