Subject: Re: Which snapshot strategy to use? was: How to capture all file system writes (fwd)
To: NetBSD Kernel Technical Discussion <tech-kern@netbsd.org>
From: gabriel rosenkoetter <gr@eclipsed.net>
List: tech-kern
Date: 10/23/2003 16:02:35

On Thu, Oct 23, 2003 at 10:42:27AM -0700, Greywolf wrote:
> I was going to ask, "this is snapshotting, not journaling, right?", but
> then it occurred to me that if we're going to snapshot the filesystem,
> wouldn't a way be to freeze write access, queue up the writes in a
> journal, dump off the snapshot to wherever, release the access, and
> then keep going?  That's journaling, effectively, unless I'm sitting
> in left field waiting for the White Sox to come up to bat again (which
> is entirely possible).

Nothing personal, but you're approaching the track.

The design principles for a journaled FS and for a snapshot (of any
type of FS) are fundamentally different.

As Jason says, journals typically only have metadata, so that you
know which data blocks are valid post-crash, but not a full copy
of the data that belongs in those blocks. The point is that they're
quick to write to, and they only get written to when a given block
is consistent.

More than that, the idea of a snapshot is to provide a quiesced file
system (typically for backups or data migration) while continuing to
use the system normally, including both writes and reads. That's why
you allocate a reasonably-sized lump of disk space and COW into it
as writes happen. Reads from the usual mount point pass through the
COW filter; reads from the snapshot version come from the underlying
disk (or virtualization of disk).
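
A minimal sketch of those two read paths, assuming a toy in-memory
block device. The names SnapshotDevice, read_live and read_snapshot
are hypothetical and are not taken from Veritas, fssnap, or NetBSD;
the overlay dict merely stands in for the preallocated lump of disk
space.

```python
class SnapshotDevice:
    """Toy block device modelling the read/write paths described above."""

    def __init__(self, blocks):
        # Underlying disk: left untouched while the snapshot exists.
        self.base = list(blocks)
        # Side area that absorbs every write made after the snapshot.
        self.overlay = {}

    def write(self, blkno, data):
        # Writes land in the side area, never on the underlying disk.
        if not 0 <= blkno < len(self.base):
            raise IndexError(blkno)
        self.overlay[blkno] = data

    def read_live(self, blkno):
        # Usual mount point: pass through the COW filter first.
        return self.overlay.get(blkno, self.base[blkno])

    def read_snapshot(self, blkno):
        # Snapshot view: always the quiesced underlying block.
        return self.base[blkno]


dev = SnapshotDevice([b"a", b"b", b"c"])
dev.write(1, b"B")
print(dev.read_live(1), dev.read_snapshot(1))
```

The live view sees the new contents of block 1 while the snapshot view
still returns the block as it was when the snapshot was taken.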

I'm speaking here from Veritas VM's snapof= mount option and Sun's
fssnap. I haven't looked at Kirk's stuff. But I'm guessing it's not
too far different in functionality (though obviously implementation
details would be different). I think, especially based on how well
Veritas's version works and that it does things this way, that doing
this at the block level makes the most sense. But that's not a
particularly considered or experienced opinion when it comes to the
file systems NetBSD supports.

Jason, I see why the normal techniques for making a snapshot file
system would be pathologically wrong for LFS. But, um, what would be
*right*?

> ...or do you just keep write logs (be they PBL or whatever) and use
> those as the snapshots?

Don't think log. Think copy-on-write into a buffer that you have
every intention of flushing back to the underlying file system on a
block (whether that means disk block or virtual block) basis, not on
replaying like you do a log.
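
To illustrate the difference, here is a rough sketch under the
assumption of a dict-based buffer keyed by block number. flush_blocks
and replay_log are hypothetical helpers written for this example, not
functions from any real file system.

```python
def flush_blocks(base, overlay):
    """Copy each dirty block back to the underlying disk exactly once."""
    for blkno, data in overlay.items():
        base[blkno] = data
    count = len(overlay)
    overlay.clear()
    return count


def replay_log(base, log):
    """Apply every logged write in order, including superseded ones."""
    for blkno, data in log:
        base[blkno] = data
    return len(log)


# Three writes, two of them to the same block.
writes = [(0, b"x"), (2, b"y"), (0, b"z")]

# The COW buffer keeps only the latest contents per block.
overlay = {}
for blkno, data in writes:
    overlay[blkno] = data

disk_a = [b"a", b"b", b"c"]
disk_b = list(disk_a)
flushed = flush_blocks(disk_a, overlay)
replayed = replay_log(disk_b, writes)
print(flushed, replayed, disk_a == disk_b)
```

Both disks end up identical, but the block-based flush touches each
dirty block once and does not care about ordering, whereas the log
replay has to apply every record in sequence.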

Not directly related:

Do we have any intention of having checkpoints a la VxFS 5? (I find
those especially useful in production Oracle environments under
Solaris with Veritas VM managing the disk.)

Are we on our way to a full-on virtualized disk manager? (If so,
keen. But that'll be a massive amount of code...)

Is Kirk already doing (some of) this in FFS2? (I don't exactly pay
attention to FreeBSD.)

-- 
gabriel rosenkoetter
gr@eclipsed.net
