Subject: Re: sparse, kernel-only crashdumps
To: Kurt J. Lidl <lidl@pix.net>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 10/07/2004 14:55:16
Yes, this sort of thing is a big win as long as it's optional.

Solaris has had this for a while, so:

>>At a guess, someone felt that when a system has just crashed it would 
>>be unwise to rely on any more data from it than absolutely necessary.  
>>What if the mapping tables were (part of?) the cause of the crash?

I believe the strategy used by Solaris is not to dump the kernel virtual 
address space, but rather to dump the pages in use by the kernel (and I 
think that approach would at least partly resolve Jason's concerns 
regarding the direct-mapped segments).
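
To make the distinction concrete, here is a rough sketch of page selection by ownership rather than by walking the kernel's virtual mappings. It is only an illustration under stated assumptions, not how Solaris actually implements it: the per-page `page_owner` tag, the `dump_mode` switch, and both function names are hypothetical. The result is a bitmap with one bit per physical page, and the full-memory mode remains available as a fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page ownership tag; a real kernel tracks this differently. */
enum page_owner { PAGE_FREE, PAGE_USER, PAGE_KERNEL };

/* Hypothetical dump policy: kernel pages only, or all of physical memory. */
enum dump_mode { DUMP_KERNEL_ONLY, DUMP_FULL };

/* Decide whether physical page `pfn` belongs in the dump. */
static bool
dump_wants_page(const enum page_owner *owners, size_t pfn, enum dump_mode mode)
{
	if (mode == DUMP_FULL)
		return true;
	return owners[pfn] == PAGE_KERNEL;
}

/*
 * Fill `bitmap` (one bit per physical page, at least (npages + 7) / 8
 * bytes long) with the pages to be written out.  Returns the number of
 * pages selected.  No virtual-to-physical mapping is consulted here.
 */
static size_t
dump_build_bitmap(const enum page_owner *owners, size_t npages,
    enum dump_mode mode, uint8_t *bitmap)
{
	size_t pfn, selected = 0;

	assert(owners != NULL && bitmap != NULL);

	for (pfn = 0; pfn < (npages + 7) / 8; pfn++)
		bitmap[pfn] = 0;

	for (pfn = 0; pfn < npages; pfn++) {
		if (dump_wants_page(owners, pfn, mode)) {
			bitmap[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
			selected++;
		}
	}
	return selected;
}
```

The dump writer would then store the bitmap followed by only the selected pages, and the reader would use the bitmap to map a physical address back to an offset in the dump file.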

> I'm certain that going with sparse crash dumps will result in some
> small number of kernel crashes that are impossible to debug, because
> the spoor of the bug is in a page that isn't saved. 

This has not been a problem in my work on Solaris (though I'm not 
working on the VM system). As long as you retain the option of doing a 
full-physical-memory dump, you're not likely to lose much debuggability.

Jason wrote:
 > But crash dumps don't have to be high performance :-)

True, but if they take too long they cut into developer productivity, 
and, on production systems, crash dump time cuts into your availability.