Subject: Re: Redoing file system suspension API (update)
To: None <tech-kern@netbsd.org>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: tech-kern
Date: 06/29/2006 12:08:59
On Wed, Jun 28, 2006 at 03:00:04PM -0700, Bill Studenmund wrote:
> On Wed, Jun 28, 2006 at 04:30:26PM +0200, Juergen Hannken-Illjes wrote:
> > On Mon, Jun 26, 2006 at 02:31:44PM -0700, Bill Studenmund wrote:
> > >
> > > How is it difficult to maintain?
> > >
> > > The idea is that we only use transaction locks above the file system if we
> > > have a real transaction.
> >
> > Implementing the transaction locks this way leads to this situation:
> >
> > A file system gets quiet. Now we are sure all its operations are either
> > complete or have not started yet. But we have locks held. Our next
> > step is to sync the current file system state to disk. None of the
> > operations we have (VFS_SYNC and friends) is usable because they will
> > deadlock. So we have to add another VFS operation to sync to disk
> > ignoring locks. Looks ugly and hard to maintain.
>
> What do you mean, "A file system gets quiet" ?
>
> I assume you mean that the file system transaction lock is exclusively
> held by the syncing or snapshotting routine. At that point, you know that
> there are no operations that will change file system metadata (and none
> that will change data) in operation. You know this as you wouldn't have
> the exclusive lock otherwise.
>
> I do not see how VFS_SYNC would not be usable in face of this. Sure,
> VFS_SYNC() AS IT'S IMPLEMENTED NOW would not work, but I can not fathom a
> reason as to why we would leave VFS_SYNC unchanged if we are changing the
> locking hierarchy. :-) All a vfs_sync routine'd have to do is grab an
> exclusive transaction lock (just like the snapshotting code would do),
> then sync away.
>
> All a new VFS_SYNC needs to do is grab the exclusive transaction lock,
> then sync everything. It can ignore vnode locks due to a consequence of
> putting transaction locking in each VOP; each VOP's implementation has to
> grab at least a shared transaction lock before it can do anything, so it
> can't do anything of note without the transaction lock.
>
> By adding the transaction locking, especially in its current state which
> has a shared and an exclusive mode, we will be changing the locking
> hierarchy. It seems reasonable to me that we will need to adjust a number
> of other things too.
>
> The list of things that needs adjusting, though, does not look large.
Let's see if I got it:

- We add a transaction lock to "struct mount" behaving like a lockmgr lock
  but allowing an "already have a shared lock" query.
- Transactions are enclosed in vn_trans_lock/vn_trans_unlock pairs like

      if ((s = vn_trans_lock(mp, V_WAIT)) < 0)
              return s;
      <run the operation>
      if (s == 0)
              vn_trans_unlock(mp);

  where vn_trans_lock() returns -1 on error, 0 on success and 1 if the thread
  already owns a shared lock. Only the call that got 0 releases the lock.
- We add vfs operations

      VFS_SUSPEND: takes the exclusive transaction lock and syncs to disk.
                   This sync-to-disk can ignore the vnode locks.
      VFS_RESUME:  releases the exclusive transaction lock.
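
To make the three points concrete, here is a rough single-threaded userland
model of the proposal. It is only a sketch, not kernel code. The field names
in the toy "struct mount", the V_WAIT value, and the helpers inner_op() and
outer_op() are made up for illustration. Where a real lock would sleep, the
model returns -1 instead. It assumes a return value of 1 means some outer
caller owns the shared lock, so only the call that got 0 releases it.

```c
#include <assert.h>
#include <stdbool.h>

#define V_WAIT 0x0001           /* flag value assumed for this sketch */

/* Toy stand-in for struct mount: only the transaction lock state. */
struct mount {
	bool mnt_trans_shared;  /* the (single) thread holds a shared lock */
	bool mnt_trans_excl;    /* exclusive lock held: fs is suspended */
};

/* Returns -1 on error, 0 if the lock was taken, 1 if already owned. */
static int
vn_trans_lock(struct mount *mp, int flags)
{
	(void)flags;            /* a real lock would sleep with V_WAIT */
	if (mp->mnt_trans_excl)
		return -1;
	if (mp->mnt_trans_shared)
		return 1;
	mp->mnt_trans_shared = true;
	return 0;
}

static void
vn_trans_unlock(struct mount *mp)
{
	assert(mp->mnt_trans_shared);
	mp->mnt_trans_shared = false;
}

/* Hypothetical operation that may run inside another transaction. */
static int
inner_op(struct mount *mp)
{
	int s;

	if ((s = vn_trans_lock(mp, V_WAIT)) < 0)
		return s;
	/* <run the operation> */
	if (s == 0)
		vn_trans_unlock(mp);
	return 0;
}

/* Hypothetical operation that nests inner_op() in its transaction. */
static int
outer_op(struct mount *mp)
{
	int s, error;

	if ((s = vn_trans_lock(mp, V_WAIT)) < 0)
		return s;
	error = inner_op(mp);   /* sees 1 and leaves the lock alone */
	if (s == 0)
		vn_trans_unlock(mp);
	return error;
}

/* Model of VFS_SUSPEND: take the exclusive lock, then sync. */
static int
vfs_suspend(struct mount *mp)
{
	if (mp->mnt_trans_shared || mp->mnt_trans_excl)
		return -1;      /* a real lock would wait for a drain */
	mp->mnt_trans_excl = true;
	/* sync to disk here; vnode locks can be ignored */
	return 0;
}

/* Model of VFS_RESUME: release the exclusive lock. */
static void
vfs_resume(struct mount *mp)
{
	assert(mp->mnt_trans_excl);
	mp->mnt_trans_excl = false;
}
```

With this model, outer_op() on an idle mount leaves no lock behind, and
between vfs_suspend() and vfs_resume() every transaction start fails rather
than running.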
--
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)