tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: locking patches for ufs_rename

On Fri, Oct 02, 2009 at 07:20:58PM +0100, Mindaugas Rasiukevicius wrote:
 > While it may not be that big, I do not see the reason why not to put
 > ufs_rename() with ufs_parentcheck() and ufs_checkpath() helpers into
 > a separate file.  It would be easier to read and work with code.

I don't see that there's any particular reason *to* change it. If you
want to do so, go right ahead, but please wait until it's committed. ;-)

 > > We have some, and I have some others that I need to find time to
 > > atfify.
 > OK, we should have a list of tests covering as much cases as one can think
 > of i.e. let's do some quality assurance. :)

Most of the cases are race conditions and so the tests basically
involve pounding on rename until it or something else breaks. The last
time I tried mostly what broke was vnode recycling.

 > > It is what we want; there's only one rename at a time per volume. This
 > > is necessary for at least two other major reasons. (Go reread the big
 > > block comment. :-) )
 > No, not in the long term, or let me rephrase it.  The current form of heavy
 > per-mount rename lock significantly reduces concurrency and we already can
 > notice that in the lockstat.

On what workload? What real workload has a large number of concurrent

 > Without major locking revamp, I wonder if we can use hashed-locks based
 > on parent vnodes for rename serialisation?  Something for mid-term.

That will not work. I'd recommend that before you start proposing
alternatives to that lock you understand what it's does. I believe the
big block comment describes it adequately, but just in case not:

1. It is necessary to deny renames of the form rename("a",
"a/b/c/d/e/f/g/h/i/j/k/a"), where the source is an ancestor of the
destination, because these detach portions of the directory tree. In
order to do this, we search up the tree from the destination to make
sure the source directory isn't there. It is then vital that it
*stays* not there until the rename is complete. Because there's an
arbitrary number of directories in between, there are only a limited
number of ways to do this: using a global lock, using two-phase
locking, or using some kind of timestamp-baesd consistency method.
Using two-phase locking means locking all the directories in the
parent search, from the destination all the way up to (in general) the
root of the fs, until the rename is complete. This is not going to
improve concurrency over a global lock and it causes various other
problems, e.g. locking from child to parent. I'm intending to look
into some form of timestamping later on.

2. If you have these three renames running concurrently:
   rename("a/b", "c/d");
   rename("c/d", "e/f");
   rename("e/f", "a/b");

where the source and destination of each rename are incommensurate by
tree order, they can deadlock. The global lock prevents this. Some
other scheme might too, but it's not exactly trivial, you'd probably
have to do two parent checks instead of one, and I'd want to see a
specific concrete proposal and an explanation of how it covers this
and other less tidy cases.

 > P.S. Regarding big comment - please make it shorter, it is easier to me
 > to just read the code (both ufs_rename and sys_rename) right now. :)

Easier, but apparently not sufficient.

David A. Holland

Home | Main Index | Thread Index | Old Index