Subject: Re: FFS journal
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 07/03/2006 11:35:58

On Mon, Jul 03, 2006 at 01:52:28PM -0400, Thor Lancelot Simon wrote:
> On Mon, Jul 03, 2006 at 10:16:50AM -0700, Bill Studenmund wrote:
> > >
> > > * One of the main sources of confusion about journaling is what exactly a
> > >  journal contains. In the vast majority of journaling filesystems, the
> > >  journal contains only modifications made to filesystem metadata
> > >  (i.e. changes to directories, inodes, inode and block bitmaps). The
> > >  journal doesn't contain any user data stored in a file.
> >
> > You missed the difference between logical and physical journals. You
> > really want to do a physical journal as recovery is MUCH simpler.
>
> Solaris journals logical operations, doesn't it?  We asked Kirill to
> investigate using a compatible on-disk format to the Solaris
> implementation, since it is the only mature implementation of journaling
> for FFS for which source code is available.

Kirill certainly does get to choose what kind of journaling to implement.

The problem with logical journaling is that you then have to build most of
the file system into the check tool, so that the check tool can fix up a
partial operation. Further, you have to keep the tool in sync with the
on-disk file system. With a physical journal, you just play the journal
and all's done.
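
To make "just play the journal" concrete, here is a minimal sketch in Python. It is an illustration, not kernel code: the record layout (block number plus full block image), the 512-byte block size, and the name replay_physical are all assumptions of mine, and the disk is modelled as a plain dict.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch


def replay_physical(journal, disk):
    """Copy every journaled block image to its home location, in log order.

    journal: list of (block number, block image) pairs from committed
             transactions (hypothetical layout).
    disk:    dict mapping block number -> bytes, standing in for the device.
    """
    for blkno, image in journal:
        assert len(image) == BLOCK_SIZE
        disk[blkno] = image  # later records for a block overwrite earlier ones
    return disk


# Hypothetical journal: two committed updates to block 7, one to block 3.
disk = {3: b"\x00" * BLOCK_SIZE, 7: b"\x00" * BLOCK_SIZE}
journal = [
    (7, b"A" * BLOCK_SIZE),
    (3, b"B" * BLOCK_SIZE),
    (7, b"C" * BLOCK_SIZE),
]
replay_physical(journal, disk)

# Replaying a second time (say, after a crash during recovery) changes nothing.
snapshot = dict(disk)
replay_physical(journal, disk)
```

The replay loop needs no knowledge of inodes, directories or bitmaps, which is why the check tool does not have to track the on-disk format.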

While source code is nice, I personally do not think that it's a
big enough reason to choose a logical journal over a physical one. Our
kernel and Solaris's kernel are sufficiently different that I'm concerned
that differences between the two will cloud issues.

To be honest, I think what's needed will be a mostly-physical journal.
It's a physical journal, but you have the ability to flag certain inodes
as being unlinked, so that fsck knows to get rid of them. Also, it's good
to be able to flag a block as "don't write from journal". This would be
for a case where what was a metadata block (indirect block pointers) gets
reallocated as file data. At that point you shouldn't write anything from
the journal as: 1) the block is now file data, and 2) by that fact, you
know anything you might want to write is stale.
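
A rough sketch of the mostly-physical idea, again in Python and again with made-up names: BlockWrite, Revoke, UnlinkedInode and replay_mostly_physical are hypothetical, as is the use of a per-record sequence number. I am also assuming that a "don't write from journal" flag only cancels block images logged before it; a block journaled again afterwards would still be replayed.

```python
from dataclasses import dataclass


@dataclass
class BlockWrite:       # physical record: full image of a metadata block
    seq: int
    blkno: int
    image: bytes


@dataclass
class Revoke:           # "don't write from journal" flag for a block
    seq: int
    blkno: int


@dataclass
class UnlinkedInode:    # inode was unlinked; fsck should get rid of it
    seq: int
    ino: int


def replay_mostly_physical(journal, disk):
    """Replay block images, honouring revokes; return inodes for fsck to free."""
    # Pass 1: find the newest revoke for each block.
    revoked = {}
    for rec in journal:
        if isinstance(rec, Revoke):
            revoked[rec.blkno] = max(rec.seq, revoked.get(rec.blkno, -1))

    # Pass 2: replay in sequence order, skipping images older than a revoke.
    unlinked = []
    for rec in sorted(journal, key=lambda r: r.seq):
        if isinstance(rec, BlockWrite):
            if rec.seq < revoked.get(rec.blkno, -1):
                continue  # block was reallocated as file data; image is stale
            disk[rec.blkno] = rec.image
        elif isinstance(rec, UnlinkedInode):
            unlinked.append(rec.ino)
    return unlinked


# Block 10 was an indirect block, then got reallocated as file data.
disk = {10: b"user file data", 11: b"old inode block"}
journal = [
    BlockWrite(1, 10, b"stale indirect block"),
    BlockWrite(2, 11, b"new inode block"),
    Revoke(3, 10),
    UnlinkedInode(4, 42),
]
to_free = replay_mostly_physical(journal, disk)
```

In this example block 10 keeps its file data, block 11 is brought up to date, and inode 42 is handed back for fsck to clean up.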

> Obviously, he is free to decide to journal physical blocks instead, and
> there are good reasons why he might do that.  The corresponding project
> in FreeBSD is a device driver (actually, part of the GEOM layer) which
> journals physical blocks, as I understand it.

Hmm... I don't like that idea. Mainly because we can then journal only to
a device; you can't, say, journal to space in the file system.

Take care,

Bill
