Subject: Re: Redoing file system suspension API (update)
To: Bill Studenmund <wrstuden@netbsd.org>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: tech-kern
Date: 06/26/2006 20:30:20
On Mon, Jun 26, 2006 at 09:43:59AM -0700, Bill Studenmund wrote:
> On Sat, Jun 24, 2006 at 11:05:04AM +0200, Juergen Hannken-Illjes wrote:
> > On Fri, Jun 23, 2006 at 05:14:00PM -0700, Bill Studenmund wrote:
> > > 
> > > Especially if we go with the idea of having the file system grab and 
> > > release the mountpoint-transaction-lock (the current incarnation of the 
> > > "gate" you spoke of originally, the thing that makes us atomic w.r.t. a 
> > > snapshot), then it's not a problem. We just have the read and write code 
> > > in fifofs not take the transaction lock, and we're fine.
> 
> I'm sorry, but this is an important point. I have the feeling it was 
> missed.

Not sure I understand correctly: do you mean taking the transaction lock for
read/write/ioctl inside every file system, while taking it for all other
operations outside the file system?

Looks difficult to maintain.

> > > Not taking a lock is fine, releasing someone else's lock and retaking one 
> > > is a problem.
> > > 
> > > Such a change would limit our exposure to cases where someone is trying to 
> > > make a transaction that involves writing to or reading from a fifo. If you 
> > > try to do that, you get what you get.
> > 
> > Both specfs and fifofs need special care because they are not real file systems.
> > Their vnodes live in real file systems that may update meta data before or
> > after they call operations on specfs/fifofs.  These updates need the transaction
> > lock.  The real operations (as long as they don't go to disk devices) cannot
> > keep this lock because they may sleep forever waiting for data.
> 
> How many transactions, other than an actual write or read, will write or 
> read a fifo?

It's not only fifos; it is also non-disk VCHR and VBLK devices.  This
information is currently hidden outside VFS.
Unlocking/relocking in specfs/fifofs would be the same as it is now.  Currently
there is already a VOP_UNLOCK/VOP_LOCK in specfs/fifofs.
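
A minimal sketch of that unlock/relock pattern, applied to the transaction
lock.  All names here (mock_mount, trans_lock, trans_unlock, mock_fifo_read,
mock_vfs_read) are made up for illustration and are not existing kernel
interfaces; the "lock" is just a counter so the example runs in userland.

```c
#include <assert.h>

/* Hypothetical stand-in for a mount point: only counts lock holders. */
struct mock_mount {
	int trans_holders;	/* current holders of the transaction lock */
};

static void
trans_lock(struct mock_mount *mp)
{
	mp->trans_holders++;
}

static void
trans_unlock(struct mock_mount *mp)
{
	assert(mp->trans_holders > 0);
	mp->trans_holders--;
}

/*
 * Stand-in for an operation that may sleep forever waiting for data.
 * It records how many holders the lock had while "sleeping".
 */
static int
wait_for_data(struct mock_mount *mp, int *held_while_sleeping)
{
	*held_while_sleeping = mp->trans_holders;
	return 42;
}

/* fifofs-style read: drop the caller's lock around the blocking part. */
static int
mock_fifo_read(struct mock_mount *mp, int *held_while_sleeping)
{
	int data;

	trans_unlock(mp);
	data = wait_for_data(mp, held_while_sleeping);
	trans_lock(mp);
	return data;
}

/* Generic layer: takes and releases the lock outside the file system. */
int
mock_vfs_read(struct mock_mount *mp, int *held_while_sleeping)
{
	int data;

	trans_lock(mp);
	data = mock_fifo_read(mp, held_while_sleeping);
	trans_unlock(mp);
	return data;
}
```

The generic layer stays the only place that takes the lock; only the
operations that can block indefinitely release it temporarily.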

> Thus if we move the transaction/snapshot logging into the read or write
> call, we have fifofs and specfs skip that step, and we're fine.

See above.  I think it is easier to maintain if we try to keep the transaction
lock completely outside of file systems.

> Reads and writes do not actually update the on-disk metadata. They set 
> flags in the vnode indicating an update is needed, and an operation later 
> comes along and shoves in times & writes the node to disk. The important 
> point is that setting the "update request" flag doesn't interfere with the 
> transaction lock.
> 
> > We could move the special treatment of fifofs up by something like
> > 
> >     if (vp->v_type != VFIFO)
> > 	vn_hold(vp);
> >     VOP_XXX(vp, ...);
> >     if (vp->v_type != VFIFO)
> > 	vn_release(vp);
> 
> Good lord no! It'd be better to break transaction semantics than do this.
> 
> > We cannot do this for VCHR/VBLK devices because they may be tapes that
> > also need this special treatment on VBLK devices.
> > 
> > To me it looks better to put it into fifofs the same way it's put into specfs.
> > 
> > And I'm still looking for a better name instead of vn_hold/vn_release ...
> 
> vn_trans_lock()/vn_trans_unlock().

Thanks.

> What do other OSs do?

No OS I know of has something like this.  Do you have a special OS in mind?

-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)