Subject: Re: Redoing file system suspension API (update)
To: None <>
From: Juergen Hannken-Illjes <>
List: tech-kern
Date: 06/29/2006 12:08:59
On Wed, Jun 28, 2006 at 03:00:04PM -0700, Bill Studenmund wrote:
> On Wed, Jun 28, 2006 at 04:30:26PM +0200, Juergen Hannken-Illjes wrote:
> > On Mon, Jun 26, 2006 at 02:31:44PM -0700, Bill Studenmund wrote:
> > > 
> > > How is it difficult to maintain?
> > > 
> > > The idea is that we only use transaction locks above the file system if we
> > > have a real transaction.
> > 
> > Implementing the transaction locks this way leads to this situation:
> > 
> > A file system gets quiet.  Now we are sure all its operations are either
> > complete or have not started yet.  But we have locks held.  Our next
> > step is to sync the current file system state to disk.  None of the
> > operations we have (VFS_SYNC and friends) is usable because they will
> > deadlock.  So we have to add another VFS operation to sync to disk
> > ignoring locks.  Looks ugly and hard to maintain.
> What do you mean, "A file system gets quiet" ?
> I assume you mean that the file system transaction lock is exclusively 
> held by the syncing or snapshotting routine. At that point, you know that 
> there are no operations that will change file system metadata (and none 
> that will change data) in operation. You know this as you wouldn't have 
> the exclusive lock otherwise.
> I do not see how VFS_SYNC would not be usable in face of this. Sure,
> VFS_SYNC() AS IT'S IMPLEMENTED NOW would not work, but I can not fathom a
> reason as to why we would leave VFS_SYNC unchanged if we are changing the
> locking hierarchy. :-) All a vfs_sync routine'd have to do is grab an
> exclusive transaction lock (just like the snapshotting code would do),
> then sync away.
> All a new VFS_SYNC needs to do is grab the exclusive transaction lock, 
> then sync everything. It can ignore vnode locks due to a consequence of 
> putting transaction locking in each VOP; each VOP's implementation has to 
> grab at least a shared transaction lock before it can do anything, so it 
> can't do anything of note without the transaction lock.
> By adding the transaction locking, especially in its current state which 
> has a shared and an exclusive mode, we will be changing the locking 
> hierarchy. It seems reasonable to me that we will need to adjust a number 
> of other things too.
> The list of things that needs adjusting, though, does not look large.

Let's see if I got it:

- We add a transaction lock to "struct mount" behaving like a lockmgr lock
  but allowing an "already have a shared lock" query.

- Transactions are enclosed in vn_trans_lock/vn_trans_unlock pairs like

  if ((s = vn_trans_lock(mp, V_WAIT)) < 0)
	return s;

  <run the operation>

  if (s == 0)
	vn_trans_unlock(mp);

  where vn_trans_lock() returns -1 on error, 0 on success and 1 if the thread
  already owns a shared lock.

- We add vfs operations

  VFS_SUSPEND: takes the exclusive transaction lock and syncs to disk.
	       This sync-to-disk can ignore the vnode locks.

  VFS_RESUME:  releases the exclusive transaction lock.
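
To make sure we are talking about the same thing, here is a rough
single-threaded model of the three points above. It is only a sketch of
my understanding, not proposed kernel code: the layout of "struct mount",
the held_mp variable, do_op() and the lowercase vfs_suspend()/vfs_resume()
helpers are invented for illustration, and where a real lock would sleep
with V_WAIT this model simply returns -1.

```c
#include <assert.h>
#include <stddef.h>

#define V_WAIT   0	/* flag values are placeholders for this sketch */
#define V_NOWAIT 1

/* Toy stand-in for the transaction lock embedded in struct mount. */
struct mount {
	int shared_holders;	/* number of shared (transaction) holders */
	int exclusive;		/* nonzero while the file system is suspended */
};

/*
 * Assumption: in a kernel this would be per-thread state.  The model is
 * single-threaded and tracks at most one mount, so one static suffices.
 */
static struct mount *held_mp;

/* Returns -1 on error, 0 on success, 1 if we already own a shared lock. */
int
vn_trans_lock(struct mount *mp, int flags)
{
	(void)flags;		/* a real lock would sleep for V_WAIT */
	if (held_mp == mp)
		return 1;	/* nested transaction, nothing to take */
	if (mp->exclusive)
		return -1;	/* suspended; model cannot block */
	mp->shared_holders++;
	held_mp = mp;
	return 0;
}

void
vn_trans_unlock(struct mount *mp)
{
	assert(mp->shared_holders > 0);
	mp->shared_holders--;
	held_mp = NULL;
}

/* VFS_SUSPEND: take the exclusive lock, then sync to disk. */
int
vfs_suspend(struct mount *mp)
{
	if (mp->exclusive || mp->shared_holders > 0)
		return -1;	/* a real lock would wait for holders to drain */
	mp->exclusive = 1;
	/* The sync-to-disk would go here; it can ignore vnode locks. */
	return 0;
}

/* VFS_RESUME: release the exclusive lock. */
void
vfs_resume(struct mount *mp)
{
	mp->exclusive = 0;
}

/* An operation wrapped in a transaction, optionally nesting another. */
int
do_op(struct mount *mp, int nested)
{
	int s, error = 0;

	if ((s = vn_trans_lock(mp, V_WAIT)) < 0)
		return s;
	if (nested)
		error = do_op(mp, 0);	/* inner call sees s == 1 */
	if (s == 0)
		vn_trans_unlock(mp);	/* only the outermost level unlocks */
	return error;
}
```

The point of the "already have a shared lock" return value is visible in
do_op(): a nested transaction neither takes nor drops the lock, so only
the outermost level releases it.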

Juergen Hannken-Illjes - - TU Braunschweig (Germany)