Subject: Re: how to bring a mounted filesystem to an almost clean state?
To: Christian Limpach <chris@pin.lu>
From: Frank van der Linden <fvdl@wasabisystems.com>
List: tech-kern
Date: 02/25/2003 12:31:45
On Fri, Feb 14, 2003 at 04:48:01PM +0100, Christian Limpach wrote:
> I'm looking for advice on how to bring a mounted filesystem to an almost 
> clean state.  I have a device driver which lets me take a snapshot at the 
> blockdevice level and doing this to a mounted filesystem I get a snapshot 
> which is in a more or less clean state.

Yes, this is not an easy problem. If you're looking for a specific
snapshot solution, picking a filesystem to which it comes naturally
(like LFS) is the easiest way to go. However, you want the most
generic solution: a snapshotting device that can be used for any
filesystem. I quite like that concept, and have thought about this
in the past, but never got around to implementing anything.

Since you want to work with any filesystem type, VFS_SYNC is the
only way to get consistent state. But, you also need to make sure
that no new writes get pushed in from userspace while doing this.

So, you need to be able to identify 'new' writes, and have them
be delayed at some point in the kernel. Determining this point
isn't easy, I think. At first thought, just intercepting all
system calls seems like a good idea, but, you're not going to
catch writes via mmap()-ed regions that way. Also, if a process
wants to do a write which is only going to copy some bytes to
a buffer, but not write to disk, you might want to allow that
as well, so that the system can be somewhat responsive in that
regard while the snapshot is being taken (though avoiding writes
altogether would definitely be easier).

Locking vnodes is not going to work: sometimes locks must be acquired
while flushing, and you'll end up locking against yourself.

You could investigate what FreeBSD has done; they needed similar
functionality for FFS snapshots, and have functions like
vn_start_write which handle the suspension. It's likely not
quite applicable to NetBSD (our VM/buffer cache integration
is different), and might not be generic enough for what you want,
but you may learn from what they've done.
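
To make the suspension idea a bit more concrete, here is a rough
userspace sketch of a "write gate", assuming POSIX threads. It only
illustrates the general shape: writers announce themselves before
writing, and the snapshot code blocks new writers and waits for the
in-flight ones to drain. The gate_* names and the struct are made up
for this example; they are not FreeBSD's or NetBSD's actual interfaces,
and a single snapshotter at a time is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical write gate; illustrative only, not a kernel API. */
struct write_gate {
	pthread_mutex_t	lock;
	pthread_cond_t	cv;
	int		active;		/* writes currently in flight */
	bool		suspended;	/* true while a snapshot is taken */
};

void
gate_init(struct write_gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->cv, NULL);
	g->active = 0;
	g->suspended = false;
}

/* Called before a new write starts; sleeps while writes are suspended. */
void
gate_start_write(struct write_gate *g)
{
	pthread_mutex_lock(&g->lock);
	while (g->suspended)
		pthread_cond_wait(&g->cv, &g->lock);
	g->active++;
	pthread_mutex_unlock(&g->lock);
}

/* Non-blocking variant: returns false instead of sleeping. */
bool
gate_try_start_write(struct write_gate *g)
{
	bool ok;

	pthread_mutex_lock(&g->lock);
	ok = !g->suspended;
	if (ok)
		g->active++;
	pthread_mutex_unlock(&g->lock);
	return ok;
}

/* Called when a write is done; wakes a waiting snapshotter if last. */
void
gate_finish_write(struct write_gate *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->active > 0);
	if (--g->active == 0)
		pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->lock);
}

/* Block new writes, then wait until in-flight writes have drained. */
void
gate_suspend(struct write_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->suspended = true;
	while (g->active > 0)
		pthread_cond_wait(&g->cv, &g->lock);
	pthread_mutex_unlock(&g->lock);
}

/* Let delayed writers proceed again after the snapshot. */
void
gate_resume(struct write_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->suspended = false;
	pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->lock);
}
```

The snapshot path would then be roughly: gate_suspend(), sync the
filesystem, take the block-level snapshot, gate_resume(). Note that
the gate has to sit above the flush path, so that writes issued by the
sync itself never go through gate_start_write(); otherwise you get the
same self-deadlock as with vnode locks. It also does nothing for
writes through mmap()-ed regions, which never call into it.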


- Frank

-- 
Frank van der Linden                                    fvdl@wasabisystems.com
==============================================================================
Quality NetBSD Development, Support & Service.   http://www.wasabisystems.com/