Subject: Re: Program recovery using checkpointing
To: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
From: Antti Kantee <pooka@cs.hut.fi>
List: tech-kern
Date: 03/13/2005 22:02:54
On Fri Mar 11 2005 at 12:38:14 +0100, Ignatios Souvatzis wrote:
> On Thu, Mar 10, 2005 at 06:15:11AM -0800, Kamal R. Prasad wrote:
> > Hello, 
> > 
> >  I have written a paper (and submitted to USENIX). 
> > 
> > A copy of the paper is available at:-
> > ftp.netbsd.org:/pub/NetBSD/misc/prasad/paper_1.pdf
> > 
> > The src code based on dragonflybsd is available at:-
> > ftp.netbsd.org:/pub/NetBSD/misc/prasad/mypatch.tar.gz
> > 
> > Pl. feel free to use/port it -and if you have any
> > questions, pl. let me know.
> 
> That author will speak up himself, I guess, but: what are the
> architectural differences to kernel-assisted but application-driven
> checkpointing as described in [1]?

I fiddled around with checkpointing and process migration for my master's
thesis and ended up noticing that everything is nice and fine until you
start doing I/O (big surprise there).  So, to make a long story short, I
ended up doing checkpointing at the user level, but adding some kernel
optimizations for saving only the deltas of the memory space between
checkpoints.
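
To illustrate what "saving only the deltas" means, here is a minimal
sketch in C.  It is not the code from the thesis: it finds changed pages
purely in user space by comparing against a shadow copy, which is a
stand-in for whatever the kernel optimizations do.  The names
ckpt_save_delta, ckpt_emit_fn and CKPT_PAGE_SIZE are made up for this
example, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CKPT_PAGE_SIZE 4096	/* assumed page size, illustration only */

/* Hypothetical callback that writes one changed page to the checkpoint. */
typedef void (*ckpt_emit_fn)(size_t pageno, const unsigned char *page,
    size_t len, void *arg);

/*
 * Compare `region' against `shadow' (its contents at the previous
 * checkpoint) one page at a time.  Each page that differs is handed to
 * `emit' (if non-NULL) and copied into the shadow, so the next call
 * only sees changes made after this one.  Returns the number of pages
 * that had changed.
 */
size_t
ckpt_save_delta(const unsigned char *region, unsigned char *shadow,
    size_t len, ckpt_emit_fn emit, void *arg)
{
	size_t off, n, saved = 0;

	assert(region != NULL && shadow != NULL);

	for (off = 0; off < len; off += CKPT_PAGE_SIZE) {
		/* The last page may be shorter than a full page. */
		n = len - off < CKPT_PAGE_SIZE ? len - off : CKPT_PAGE_SIZE;
		if (memcmp(region + off, shadow + off, n) != 0) {
			if (emit != NULL)
				emit(off / CKPT_PAGE_SIZE, region + off, n, arg);
			memcpy(shadow + off, region + off, n);
			saved++;
		}
	}
	return saved;
}
```

The comparison touches every page on every checkpoint, which is exactly
the cost that kernel assistance would be expected to avoid.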

Of course I wasn't aiming for an "exact" checkpoint, only a "semantic"
checkpoint.  After all, if you're doing checkpointing from the program
code, you might as well exploit the fact that the program knows its
own behaviour.
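
A rough sketch of the "semantic" idea, again in C and again not taken
from the thesis: the program saves only the fields it needs to resume,
and rebuilds everything else after a restore.  struct worker, its
fields and the worker_* functions are hypothetical names invented for
this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical application state. */
struct worker {
	uint32_t next_job;		/* needed to resume */
	uint32_t jobs_done;		/* needed to resume */
	int sock_fd;			/* I/O state: reopened after restore */
	unsigned char scratch[256];	/* temporary data: can be recomputed */
};

#define WORKER_CKPT_SIZE 8	/* two 32-bit values, big-endian */

static void
put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

static uint32_t
get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Save only the semantically relevant fields; returns bytes written. */
size_t
worker_checkpoint(const struct worker *w, unsigned char *buf)
{
	assert(w != NULL && buf != NULL);
	put_be32(buf, w->next_job);
	put_be32(buf + 4, w->jobs_done);
	return WORKER_CKPT_SIZE;
}

/* Restore the saved fields and reset everything that was not saved. */
void
worker_restore(struct worker *w, const unsigned char *buf)
{
	assert(w != NULL && buf != NULL);
	memset(w, 0, sizeof(*w));
	w->next_job = get_be32(buf);
	w->jobs_done = get_be32(buf + 4);
	w->sock_fd = -1;	/* caller must reopen the connection */
}
```

The descriptor is deliberately left out of the checkpoint: the restored
program is expected to re-establish its own I/O, which is where an
"exact" checkpoint would run into trouble.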

But, AFAICT, the proposed method is also application-driven checkpointing,
since it provides a programming interface instead of operating
transparently in the background.

> [1] Antti Kantee: Using Application-Driven Checkpointing Logic
> for Hot Spare High Availability, proceedings of the 3rd European
> BSD Conference, Karlsruhe 2004,
> http://www.eurobsdcon2004.de/uploads/media/EBSD04_36.pdf

Should anyone be interested, the entire story containing a bit more
mumbling and rambling is available at:
http://www.cs.hut.fi/~pooka/school/thesis/

-- 
Antti Kantee <pooka@iki.fi>                     Of course he runs NetBSD
http://www.iki.fi/pooka/                          http://www.NetBSD.org/
                        "The dish washer returns!"