Subject: Re: CVS commit: src/sys/kern
To: Darren Reed <darrenr@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: source-changes
Date: 01/28/2006 13:06:37
On Sat, Jan 28, 2006 at 11:51:19AM +0000, Darren Reed wrote:
> This needs to be documented in options(4) and panic(9) so that people
> can learn about it.

I just fixed options(4).

> 
> For me, I'm not comfortable it will be of value as it presupposes that
> the ddb parser is working and that the kernel will survive until that
> time.

If the system is in such a state that ddb can't print its prompt,
it's not clear that db_stack_trace_print() will work any better. Also, on
some platforms, ddb may need some more initialisation before
db_stack_trace_print() will work (this is only theory; I have no example
of a platform where this is the case).
Also, with your change there's no way to get the old ddb.onpanic=0
behavior: if ddb.onpanic is not 1, db_stack_trace_print() will always be
called.
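
To make the ddb.onpanic point concrete, here is a rough sketch in C. It is
not the actual kernel code: the enum and both function names are made up
for illustration, and the "old" behavior is my reading of ddb.onpanic=0
(no debugger entry and no trace).

```c
/* Hypothetical model of what happens on panic; not taken from the kernel. */
enum panic_action {
	PANIC_NOTHING     = 0,	/* neither ddb nor a stack trace */
	PANIC_ENTER_DDB   = 1,	/* drop into the ddb prompt */
	PANIC_PRINT_TRACE = 2	/* call db_stack_trace_print() */
};

/* Assumed old behavior: only ddb.onpanic=1 does anything debugger-related. */
static enum panic_action
action_before_change(int onpanic)
{
	return (onpanic == 1) ? PANIC_ENTER_DDB : PANIC_NOTHING;
}

/* Behavior described above: any value other than 1 prints the trace. */
static enum panic_action
action_after_change(int onpanic)
{
	return (onpanic == 1) ? PANIC_ENTER_DDB : PANIC_PRINT_TRACE;
}
```

In this model no value of ddb.onpanic makes action_after_change() return
PANIC_NOTHING, which is the problem: the quiet setting is gone.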

> 
> I'm quite dismayed that it is so hard to get information about a panic
> out of NetBSD.  Writing a crash dump doesn't appear to work (where the
> dump space is the same as swap)

It usually works for me.

> and nearly every time there is a panic,
> the system faults again in ddb, either locking ddb and the system up
> or just preventing you from getting what you need.

The system is seriously messed up then. Maybe something has overwritten
the page tables, or something like that. But the most common problem leading
to a non-working ddb is when the stack pointer has jumped to nowhere, or
the kernel is out of stack space (too much data allocated on the stack, or
a recursive function call). I would recommend printing the register
contents in addition to the stack trace, to see the stack pointer value
(options DDB_COMMANDONENTER="trace;show registers").

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference