tech-kern: Re: Redoing file system suspension API (update)

Subject: Re: Redoing file system suspension API (update)
To: None <tech-kern@netbsd.org>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: tech-kern
Date: 06/28/2006 17:59:10

On Wed, Jun 28, 2006 at 08:17:43AM -0700, Bill Studenmund wrote:
> On Tue, Jun 27, 2006 at 12:52:57PM +0200, Juergen Hannken-Illjes wrote:
> > On Mon, Jun 26, 2006 at 02:31:44PM -0700, Bill Studenmund wrote:
> > > On Mon, Jun 26, 2006 at 08:30:20PM +0200, Juergen Hannken-Illjes wrote:
> > > > On Mon, Jun 26, 2006 at 09:43:59AM -0700, Bill Studenmund wrote:
> > > > > I'm sorry, but this is an important point. I have the feeling it was 
> > > > > missed.
> > > > 
> > > > Not sure I get it right: you mean taking the transaction lock for
> > > > read/write/ioctl in every file system while taking it for other operations
> > > > outside?
> > > > 
> > > > Looks difficult to maintain.
> > > 
> > > How is it difficult to maintain?
> > 
> > We have to do it for all operations of all file systems.  And we need
> > thread-recursive locks as file systems call operations on other file systems.
> 
> I'm sorry. I do not understand the causality implied in this sentence. The 
> fact that a file system may call operations on other file systems (only 
> unionfs does this AFAIK) does not mean we need recursion.

File systems call VOPs, sometimes on themselves, sometimes on other file
systems.  A file system calling itself is the problem.

> We also don't need recursion in general. All we need is for the lock 
> routine to return "success", "failure", and "You already have the lock." 
> If we get a "failure" return, we exit whatever we're doing. If we get 
> success, we later release the lock. If we get "You already have the lock", 
> then we just skip the unlock later on.

I meant this type of recursion.  Using a lock counter or keeping state on
the stack should be the same.
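
To make the three-way return concrete, here is a minimal sketch in C of the
"keep state on the stack" variant.  All names (txn_enter, txn_exit, fs_op,
struct txn_lock) are hypothetical and are not the NetBSD API.  The model is
deliberately simplified: it tracks a single holder and fails instead of
sleeping, whereas a real transaction lock would admit many concurrent
transactions and block while the file system is suspended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-way result, as described in the quoted text. */
enum txn_result { TXN_OK, TXN_FAIL, TXN_ALREADY_HELD };

/* Simplified model: one holder at a time, identified by an opaque pointer. */
struct txn_lock {
	const void *owner;	/* current holder, NULL if free */
	bool suspended;		/* file system is being suspended */
};

static enum txn_result
txn_enter(struct txn_lock *tl, const void *self)
{
	if (tl->owner == self)
		return TXN_ALREADY_HELD;	/* nested call, nothing to do */
	if (tl->suspended || tl->owner != NULL)
		return TXN_FAIL;		/* real code would likely sleep */
	tl->owner = self;
	return TXN_OK;
}

static void
txn_exit(struct txn_lock *tl, const void *self)
{
	assert(tl->owner == self);
	tl->owner = NULL;
}

/*
 * A VOP-like operation that calls itself 'depth' more times, standing in
 * for a file system calling its own operations.  The enter result lives
 * on the stack; only the outermost frame (TXN_OK) releases the lock.
 */
static int
fs_op(struct txn_lock *tl, const void *self, int depth)
{
	enum txn_result r = txn_enter(tl, self);
	int error = 0;

	if (r == TXN_FAIL)
		return -1;
	if (depth > 0)
		error = fs_op(tl, self, depth - 1);
	if (r == TXN_OK)
		txn_exit(tl, self);
	return error;
}
```

In this sketch the per-frame result plays the role a recursion counter would
play inside the lock: either way, only the outermost caller releases.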

> > Once an operation has the lock we cannot deny the lock to other operations
> > called from here.  Take unionfs's `copy-up' as an example.
> 
> I don't understand what you mean by "[denying] the lock". ?? If a file 
> system decides it wants to perform a transaction, it starts then ends the 
> transaction.
> 
> Note also that while you're right that we have to add this logic to 
> specific file systems (and the implicit assessment that we may have more 
> file systems than entry points that make certain transactions), we really 
> only have to add this functionality to file systems that handle snapshots.
> 
> So only ffs needs the logic for now.

Our current implementation supports snapshots on ALL leaf file systems.

> > And I'm not sure if it can be free of deadlocks doing it (with locked vnodes)
> > inside the file system.
> 
> Yes, deadlocks are an issue. However we can work around them. We put the 
> transaction lock at a certain point in the locking hierarchy, and if we 
> need to grab a lock that's further up the chain, we release our current 
> locks, grab the one we need, then re-grab others.

I must admit I don't understand this ... see my other post for the sync-to-disk
problem.
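
For reference, the release-and-re-grab scheme described in the quoted text
can be sketched as follows.  This is only a bookkeeping model in C with
hypothetical names (struct lockset, take, drop, take_higher); it uses no
real mutexes, and the numeric ranks are an assumption, with rank 0 being
the lock furthest up the hierarchy.

```c
#include <assert.h>
#include <stdbool.h>

#define NLOCKS 4	/* rank 0 is taken first, rank NLOCKS-1 last */

struct lockset {
	bool held[NLOCKS];
};

/* Take lock 'r'; the ordering rule forbids holding any lock of rank >= r. */
static void
take(struct lockset *ls, int r)
{
	for (int i = r; i < NLOCKS; i++)
		assert(!ls->held[i]);
	ls->held[r] = true;
}

static void
drop(struct lockset *ls, int r)
{
	assert(ls->held[r]);
	ls->held[r] = false;
}

/*
 * Acquire lock 'want' (assumed not yet held) although locks further down
 * the hierarchy are held: release those, take 'want', then re-take the
 * released locks in hierarchy order.  Returns how many were dropped.
 */
static int
take_higher(struct lockset *ls, int want)
{
	bool dropped[NLOCKS] = { false };
	int ndropped = 0;

	for (int i = NLOCKS - 1; i > want; i--) {
		if (ls->held[i]) {
			drop(ls, i);
			dropped[i] = true;
			ndropped++;
		}
	}
	take(ls, want);
	for (int i = want + 1; i < NLOCKS; i++) {
		if (dropped[i])
			take(ls, i);
	}
	return ndropped;
}
```

Note that whatever the dropped locks were protecting may have changed by the
time they are re-taken, so a caller has to revalidate its state whenever
take_higher() returns non-zero.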

-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)