Subject: Re: sparse, kernel-only crashdumps
To: Kurt J. Lidl <lidl@pix.net>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 10/07/2004 14:55:16
Yes, this sort of thing is a big win as long as it's optional.
Solaris has had this for a while, so:
>>At a guess, someone felt that when a system has just crashed it would
>>be unwise to rely on any more data from it than absolutely necessary.
>>What if the mapping tables were (part of?) the cause of the crash?
I believe the strategy used by Solaris is not to dump the kernel virtual
address space, but rather to dump the pages in use by the kernel (and I
think that approach would at least partly resolve Jason's concerns
regarding the direct-mapped segments).
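To make the distinction concrete, here is a rough sketch in C of dumping "pages in use by the kernel" by scanning a per-physical-page record instead of walking the kernel's virtual mappings. It is not Solaris's or NetBSD's actual dump code or on-disk format: `struct phys_page`, the `kernel_owned` flag, `sparse_dump()`, and the bitmap-then-pages layout are all made up for illustration. The `full_dump` argument stands in for the option of falling back to a full-physical-memory dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumed page size for this sketch */

/*
 * Hypothetical per-page record. A real kernel would consult its own
 * physical page database rather than a flat array like this one.
 */
struct phys_page {
    int     kernel_owned;        /* nonzero if the kernel is using this page */
    uint8_t data[PAGE_SIZE];     /* page contents */
};

/*
 * Write a sparse dump into `out`: first a bitmap with one bit per
 * physical page (bit set = page is present in the dump), then the
 * contents of the selected pages in ascending page order.
 *
 * If `full_dump` is nonzero, every page is selected, which models the
 * optional full-physical-memory dump.
 *
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
size_t
sparse_dump(const struct phys_page *pages, size_t npages, int full_dump,
    uint8_t *out, size_t outlen)
{
    size_t bitmap_len = (npages + 7) / 8;
    size_t off = bitmap_len;
    size_t i;

    assert(pages != NULL && out != NULL);

    if (outlen < bitmap_len)
        return 0;
    memset(out, 0, bitmap_len);

    for (i = 0; i < npages; i++) {
        /* Selection is by page ownership, not by virtual mapping. */
        if (!full_dump && !pages[i].kernel_owned)
            continue;
        if (outlen - off < PAGE_SIZE)
            return 0;
        out[i / 8] |= (uint8_t)(1u << (i % 8));
        memcpy(out + off, pages[i].data, PAGE_SIZE);
        off += PAGE_SIZE;
    }
    return off;
}
```

Because the selection is driven by the physical page records, the sketch never needs to decide how to treat a direct-mapped segment; a page is either marked as in use by the kernel or it is not.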
> I'm certain that going with sparse crash dumps will result in some
> small number of kernel crashes that are impossible to debug, because
> the spoor of the bug is in a page that isn't saved.
This has not been a problem in my work on Solaris (but then I'm not
working on the VM system); as long as you retain the option of doing a
full-physical-memory dump you're not likely to lose much debuggability.
Jason wrote:
> But crash dumps don't have to be high performance :-)
True, but if they take too long they cut into developer productivity,
and, on production systems, crash dump time cuts into your availability.