tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Patches for journalling support

On Sun, Mar 02, 2008 at 11:43:04PM +1100, Simon Burge wrote:
> Simon Burge wrote:
> > There are three known issues:
> > 
> >  - "mount -u -o log" doesn't work yet, so you can't have logging enabled
> >    on the root filesystem. "mount -u -o log" is disabled in this patch
> >    (see around line 222 of sys/kern/vfs_syscalls.c).
> Related to this is also a problem when the device node for a
> logged filesystem isn't on a ffs filesystem (for example if
> you have a tmpfs /dev).  PR kern/38057 has more details.  For
> now, you need to use an ffs /dev to use journalling.

Ok, the basic problem is that when you fsync a device node, you end up 
calling the routines in the file system that contains the device, not the 
file system that is mounted on it.

The key problem I see is that there are two different kinds of syncing you 
can do to a vnode, and we lump the two together. And for everything except 
a device node, that lumping is correct.

One kind of syncing is to sync the vnode's inode to stable storage. This 
has to be done by the file system that contains the vnode.

The other kind of syncing is to flush the buffers. This has to be done by 
the file system that generates the dirty pages. This is only the file 
system that contains the vnode if we're talking root when /dev is in /.

There are two suggestions in the PR as to what to do. I think that yamt's
suggestion of having code that wants to flush buffers call something off
of the mount point (if there is one) is correct. Whatever code currently
wants to flush buffers off of a device that's mounted should have the
mounted file system do it.

I like ad's suggestion of offering abstraction hooks ("register me" vs 
"unregister me"), but I think we really want to register, "I'm mounted on 
this device." I think that if a file system is mounted on a device, it has 
to support whatever we expect file systems to support. Put another way, if
we had explicit, "I can do this," callbacks, we leave open the possibility 
for a file system to be mounted on a device and not register such a 
callback. I dislike that potential and prefer we be more explicit in 
considering that an error. :-)

I realize that my changing of ad's suggestion is effectively just adding 
wrappers to setting v_specmountpoint. :-)

I think the right thing to do is either add a new VFS op or add a new 
variant to VFS_SYNC(). And all file systems should support it.

Take care,


Attachment: pgpy7Rm1qydH6.pgp
Description: PGP signature

Home | Main Index | Thread Index | Old Index