Subject: Re: port-sun3/4511: savecore will save old dumps repeatedly...
To: Chris G. Demetriou <cgd@pa.dec.com>
From: Greg A. Woods <woods@most.weird.com>
List: netbsd-bugs
Date: 11/19/1997 20:58:50
[ On Tue, November 18, 1997 at 15:24:59 (-0800), Chris G. Demetriou wrote: ]
> Subject: Re: port-sun3/4511: savecore will save old dumps repeatedly... 
>
> It is supposed to.  I've looked at the replies that you made to the
> PR, and your analysis of what the -c flag does seems incorrect (if i'm
> reading it correctly 8-).
> 
> if -c is specified, savecore _only_ clears the dump's magic number (by
> calling clear_dump()).

Yes, you're right.  The manual page is, shall we say, a lot less than
clear about this fact.  My interpretation of the manual page before I
read the code was that '-c' was necessary in a normal invocation if you
ever want the dump image to be invalidated.
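
To make that distinction concrete, here is a minimal sketch in C of what
"only clears the dump's magic number" amounts to.  The header layout, the
magic value, and every example_* name are assumptions made up for
illustration; this is not savecore's actual code or on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value and header layout, for illustration only. */
#define EXAMPLE_DUMP_MAGIC 0x8fca0101u

struct example_dump_header {
	uint32_t magic;	/* marks the image as a valid, unsaved dump */
	uint32_t size;	/* size of the dump image in bytes */
};

/* A dump counts as present only while the magic number matches. */
int
example_dump_exists(const struct example_dump_header *hdr)
{
	assert(hdr != NULL);
	return hdr->magic == EXAMPLE_DUMP_MAGIC;
}

/*
 * Invalidate the dump by zeroing the magic number.  Nothing else is
 * touched, so the image itself stays on the device until overwritten.
 */
void
example_clear_dump(struct example_dump_header *hdr)
{
	assert(hdr != NULL);
	hdr->magic = 0;
}
```

In this sketch a later pass that checks example_dump_exists() would skip
the image once it has been cleared, even though the data is still there.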

As it is, I'm tempted to try setting ``savecore_flags="-vz; savecore -c"''
in order to ensure that it only runs once regardless.

> Note that on error conditions, save_core() or other functions might
> exit before the dump is cleared.  However, in the case that the core
> is successfully saved, it should be cleared by clear_dump().

Hmm...  I would suggest that very few errors should cause the
clear_dump() step to be skipped, since most would indicate permanent
errors, and if there's any chance of manually recovering the dump
after boot then the '-f' flag can be used.  Depending on what one's
interested in, a subsequent savecore is often unlikely to be of
any use once multi-user mode has started and even a few blocks of the
dump device (which is almost always the same as the primary swap device)
have been overwritten.  I suppose kernel space won't be trashed right
away, but any user space might be, depending on how the VM system
allocates swap space vs. how the dump is laid out.
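
The policy suggested above could be sketched roughly as follows, in C.
The result codes and the example_should_clear_dump() helper are
hypothetical names invented for this illustration, and treating "no
space in the crash directory" as the one transient case is likewise an
assumption rather than anything taken from savecore itself.

```c
#include <stdbool.h>

/* Hypothetical outcomes of an attempt to save the dump. */
enum example_save_result {
	EXAMPLE_SAVE_OK,	/* core saved successfully */
	EXAMPLE_SAVE_TRUNCATED,	/* e.g. EOF on the dump device */
	EXAMPLE_SAVE_CORRUPT,	/* image unusable; a retry will not help */
	EXAMPLE_SAVE_NO_SPACE	/* assumed transient; a retry may succeed */
};

/*
 * Decide whether the dump should be invalidated after a save attempt.
 * Only the error assumed to be transient leaves the dump in place.
 */
bool
example_should_clear_dump(enum example_save_result result)
{
	switch (result) {
	case EXAMPLE_SAVE_NO_SPACE:
		return false;
	case EXAMPLE_SAVE_OK:
	case EXAMPLE_SAVE_TRUNCATED:
	case EXAMPLE_SAVE_CORRUPT:
		return true;
	}
	return true;	/* unknown results are treated as permanent */
}
```

Anything that leaves the dump uncleared can still be retried by hand, and
'-f' remains available for forcing a save after the dump has been cleared.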

> If it is re-saving after a successful dump, I can't see why that would
> be (if -c works to clear the dump).

I've been unable to reproduce this problem under test conditions because
the one panic I've had the chance to repeat seems to cause corrupt dumps
that cannot be successfully saved, though I will swear that it has more
than once happened without any error or warning messages.

The dumps I've been able to create since have all resulted in:

Nov 19 14:17:20 sometimes savecore: WARNING: EOF on dump device
Nov 19 14:17:20 sometimes savecore: WARNING: core may be incomplete

This is an example where, unless /etc/rc had picked up a completely bogus
savecore binary, there's no possibility of doing any better manually, so
the dump image should be invalidated in any case.  If there is
any possibility of recovery after the fact, then 'savecore -f' is still
an available option, as I've already mentioned.

Given all this discussion I'll say that '-c' is still a useful option,
though the current documentation for it is bogus.

-- 
							Greg A. Woods

+1 416 443-1734      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>