tech-kern: Re: Locking errors

Subject: Re: Locking errors
To: Frank van der Linden <frank@wins.uva.nl>
From: Dr. Bill Studenmund <wrstuden@loki.stanford.edu>
List: tech-kern
Date: 01/30/1999 18:09:27
On Sun, 31 Jan 1999, Frank van der Linden wrote:

> On Fri, Jan 29, 1999 at 06:52:57PM -0800, Dr. Bill Studenmund wrote:
> > vclean(vp)
> > {
> > Panic if VXLOCK1 set
> > 
> > Set VXLOCK & VXLOCK1
> > 
> > VOP_LOCK(vp, LK_DRAIN | LK_INTERLOCK | LK_CANRECURSE)
> > /* The drain above wakes up all of the processes wanting this lock. They
> > see VXLOCK set and thus either return errors or sleep intending to return
> > an error on awake */
> > 
> > Clear VXLOCK  so that commands below function fine
> > 
> > Do all of the clean up that vclean does now
> > 
> > set the vnode's vnops to the deadfs ops
> > 
> > clear VXLOCK1, unlock the node, and awaken sleepers.
> 
> After clearing VXLOCK, you're doing some things that may put the
> current process to sleep (like vinvalbuf()). Other processes may
> come in and not be VXLOCK aware, i.e. they don't know that the
> node is being cleaned up. This may produce unexpected results.

But if I have an exclusive, recursive lock, my process could sleep for a
year and no one else would get at the vnode. I thought. I mean, isn't that
the meaning of an exclusive, recursive lock?

> I think introducing a VXLOCK1 flag will just further obfuscate
> things. Maybe you should look at:

VXLOCK1 was an attempt to try to decouple the three things I think VXLOCK
is doing:

1) When we call the lockmanager to LK_DRAIN, we tell all the other waiters
that this node is dying so they should abort what they are doing.

2) We keep vclean from getting called recursivly (with the idea that if we
recursed to vclean once, we might again :-)

3) We keep any other processes from touching the node once we start
cleaning.

VXLOCK1 was to decouple 2 from 1. 3 is achievable by having an exclusive
lock.

> 	a) Do what you want to do not in VOP_CLOSE but someplace
> 	   else (I don't know what you want to do, though).
> 	b) Avoid the VOP_CLOSE or get rid of it alltogether
> 	   VOP_CLOSE is mainly used in an administrative fashion,
> 	   and also in this case VOP_INACTIVE is being called
> 	   immediately afterwards which makes you do these
> 	   administraive things twice since VOP_INACTIVE will do
> 	   these as well. I think NFS is even the only filesystem
> 	   which actually does something more in it's close vop.
> 	c) Make VOP_CLOSE aware of the locking and VXLOCK status.

The problem is that I want to do something which really needs to be
wrapped in vn_lock/VOP_UNLOCK's. According to the spec, it should be
doable to the node passed to VOP_CLOSE.

What we need is a way to make all other processes see VXLOCK but let the
process doing the cleaning NOT see it. Since in the process of cleaning,
we call routines which are supposed to get unlocked (i.e. lockable)
vnodes.

My concern with 3) is that it's clugy. Also, since the problem's with
vn_lock, we have to cluge it to...

My concern w/ 1) & 2) is that I'm doing something which really ought to be
done in a close routine, and should be protected by vn_locks. :-) 

Thought?

(Thanks for actually looking at & thinking about the problem!)

Take care,

Bill