Subject: Re: how to bring a mounted filesystem to an almost clean state?
To: Frank van der Linden <fvdl@wasabisystems.com>
From: Christian Limpach <chris@pin.lu>
List: tech-kern
Date: 02/25/2003 16:37:22
Quoting Frank van der Linden <fvdl@wasabisystems.com>:

> Since you want to work with any filesystem type, VFS_SYNC is the
> only way to get consistent state. But, you also need to make sure
> that no new writes get pushed in from userspace while doing this.

Do you think it would be (easily) possible to get the bufs which are flushed
by VFS_SYNC tagged somehow?  I've now tried blocking all writes which don't
have B_VFLUSH set, but this doesn't quite work because VFS_SYNC itself blocks
(I haven't identified which of its writes don't have B_VFLUSH set).  When I
release the block, VFS_SYNC finishes, the block is put back and the snapshot
is taken.  The window for 'new' writes to get through is apparently small
enough that the few tests I've run produced only fsck-clean snapshots.  Does
this make sense?  How reasonable is this?

> So, you need to be able to identify 'new' writes, and have them
> be delayed at some point in the kernel. Determining this point
> isn't easy, I think. At first thought, just intercepting all
> system calls seems like a good idea, but, you're not going to
> catch writes via mmap()-ed regions that way. Also, if a process
> wants to do a write which is only going to copy some bytes to
> a buffer, but not write to disk, you might want to allow that
> as well, so that the system can be somewhat responsive in that
> regard while the snapshot is being taken (though avoiding writes
> altogether would definitely be easier).

The point where I'm delaying/blocking writes is in my driver's strategy
routine.  I have considered delaying in vn_write, but I'd either have to add
code or use whatever is there.  Locking vnodes would probably do what I want
but is not possible.  I haven't looked at FreeBSD, but from what you wrote
I'd imagine that it would also imply changes to the code.  I don't mind
making the code changes per se.  But so far there hasn't been much interest
in any of these features, and my driver is implemented as an LKM which
doesn't need any kernel changes, so I don't want to give up this
independence.
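
A minimal user-space sketch of the gating described above, under stated
assumptions: this is not the driver's code, every sk_* name is hypothetical,
and SK_B_VFLUSH / SK_B_READ are illustrative stand-ins rather than the
kernel's real flag values.  It only models the decision a strategy routine
would make: while the gate is closed, reads and writes tagged as VFS_SYNC
flushes pass, and every other write is held on a FIFO until release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for buf flags; NOT the kernel's actual values. */
#define SK_B_VFLUSH 0x1u   /* write issued as part of a sync flush */
#define SK_B_READ   0x2u   /* read request; never needs to be held */

/* Simplified model of a buf: only the fields this sketch needs. */
struct sk_buf {
    unsigned       flags;
    struct sk_buf *next;   /* link used while the buf is held */
};

/* Gate state kept by the (hypothetical) snapshot driver. */
struct sk_gate {
    bool           closed; /* true while a snapshot is being prepared */
    struct sk_buf *head;   /* FIFO of held writes */
    struct sk_buf *tail;
    size_t         passed; /* number of requests let through so far */
};

/*
 * Model of the strategy-routine decision.
 * Returns true if the request goes straight to the device,
 * false if it was queued until the gate is released.
 */
static bool
sk_strategy(struct sk_gate *g, struct sk_buf *bp)
{
    if (g->closed && (bp->flags & (SK_B_READ | SK_B_VFLUSH)) == 0) {
        /* A 'new' write: hold it in arrival order. */
        bp->next = NULL;
        if (g->tail != NULL)
            g->tail->next = bp;
        else
            g->head = bp;
        g->tail = bp;
        return false;
    }
    g->passed++;
    return true;
}

/* Open the gate and pass every held write on, oldest first. */
static size_t
sk_release(struct sk_gate *g)
{
    size_t n = 0;

    assert(g != NULL);
    g->closed = false;
    while (g->head != NULL) {
        struct sk_buf *bp = g->head;

        g->head = bp->next;
        bp->next = NULL;
        g->passed++;
        n++;
    }
    g->tail = NULL;
    return n;
}
```

The sketch leaves out the problem mentioned earlier in this message: if
VFS_SYNC itself issues a write without the flush tag, that write lands on the
held queue and the sync cannot finish until the gate is released.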

Taking a snapshot is nearly instantaneous: a pax -v writing to the device of
which a snapshot is taken pauses for maybe half a second, and most of that
pause is caused by the VFS_SYNC call.

-- 
Christian Limpach <chris@pin.lu>