Subject: Re: Read-write vnode locks
To: Charles M. Hannum <root@ihack.net>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 09/11/1999 13:00:07
There are a couple of things which will need to be reworked which you
didn't mention in your message, mostly related to read-to-write
upgrade issues.

Our current lockmgr provides two ways to upgrade a shared lock to an
exclusive lock.  Both are .. problematic.

LK_UPGRADE is guaranteed to work, but involves releasing the shared
lock before reacquiring an exclusive lock; this means that other
processes may get in with an exclusive lock (and change things)
during the upgrade, which means that you can't make many assumptions
based on values you looked at while holding the read lock.
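
To make the window concrete, here is a rough userland sketch.  It is a
single-threaded toy model, not lockmgr: every toy_* name and the
generation counter are made up for illustration, and the "other
process" is just a callback run between the release and the
reacquire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a lock-protected directory; NOT the kernel's struct lock. */
struct toy_dir {
	int shared_holders;	/* number of read-lock holders */
	bool exclusive;		/* true while a writer holds the lock */
	unsigned long gen;	/* hypothetical counter, bumped on every change */
};

typedef void (*toy_intruder_fn)(struct toy_dir *);

static void toy_lock_shared(struct toy_dir *d)
{
	assert(!d->exclusive);
	d->shared_holders++;
}

static void toy_unlock_shared(struct toy_dir *d)
{
	assert(d->shared_holders > 0);
	d->shared_holders--;
}

static void toy_lock_excl(struct toy_dir *d)
{
	assert(!d->exclusive && d->shared_holders == 0);
	d->exclusive = true;
}

static void toy_unlock_excl(struct toy_dir *d)
{
	assert(d->exclusive);
	d->exclusive = false;
}

/* Another process that takes the lock exclusively and changes things. */
static void toy_other_writer(struct toy_dir *d)
{
	toy_lock_excl(d);
	d->gen++;
	toy_unlock_excl(d);
}

/* Models the LK_UPGRADE behaviour described above: release, then reacquire. */
static void toy_upgrade(struct toy_dir *d, toy_intruder_fn intruder)
{
	toy_unlock_shared(d);
	if (intruder != NULL)
		intruder(d);	/* the window where someone else can get in */
	toy_lock_excl(d);
}

/* Returns true if what was read under the shared lock is still valid. */
static bool toy_upgrade_and_check(struct toy_dir *d, toy_intruder_fn intruder)
{
	unsigned long seen;
	bool still_valid;

	toy_lock_shared(d);
	seen = d->gen;			/* value looked at under the read lock */
	toy_upgrade(d, intruder);
	still_valid = (d->gen == seen);	/* must be rechecked after upgrade */
	toy_unlock_excl(d);
	return still_valid;
}
```

With no intruder the check passes; with toy_other_writer passed in,
the value seen under the shared lock is stale by the time the
exclusive lock is held.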

LK_EXCLUPGRADE does not allow another process to get in sideways, but
can fail (if another process is already waiting for an EXCLUPGRADE on
the same lock), which means you need some recovery code which handles
this case.
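
A toy model of that failure mode, again in userland C and not the real
lockmgr code.  The toy_* names, the two-phase split, and the choice
that a failed upgrade also gives up the caller's shared hold are all
assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_lock {
	int shared;		/* current shared holders */
	bool want_upgrade;	/* someone is already waiting to upgrade */
	bool exclusive;		/* held exclusively */
};

enum toy_outcome { TOY_UPGRADE_PENDING, TOY_MUST_RESTART };

/*
 * Phase 1 of a toy exclusive upgrade; the caller must hold a shared lock.
 * Returns 0 if the caller is now the single pending upgrader, or EBUSY if
 * another upgrade is already pending.  Either way the caller's shared
 * hold is given up (an assumption of this model).
 */
static int toy_exclupgrade_request(struct toy_lock *l)
{
	assert(l->shared > 0 && !l->exclusive);
	l->shared--;
	if (l->want_upgrade)
		return EBUSY;
	l->want_upgrade = true;
	return 0;
}

/*
 * Phase 2: the pending upgrader becomes exclusive once all readers are
 * gone.  Returns false where a real implementation would keep sleeping.
 */
static bool toy_exclupgrade_complete(struct toy_lock *l)
{
	assert(l->want_upgrade);
	if (l->shared > 0)
		return false;
	l->want_upgrade = false;
	l->exclusive = true;
	return true;
}

/* The recovery decision every caller of the failable upgrade has to make. */
static enum toy_outcome toy_try_upgrade(struct toy_lock *l)
{
	if (toy_exclupgrade_request(l) == EBUSY)
		return TOY_MUST_RESTART;  /* lock lost: redo the lookup from scratch */
	return TOY_UPGRADE_PENDING;
}
```

If two processes A and B both hold the lock shared and both ask to
upgrade, A ends up exclusive and B gets TOY_MUST_RESTART; B's restart
path is the recovery code in question.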

For instance, ufs_lookup() of a nonexistent name leaves a few
breadcrumbs around in the directory inode and the cnp structure to
tell a subsequent ufs_direnter() call where to put the name.
Currently, it can get away with this because exclusive locks are used;
with only a shared lock held during lookup this would need to be
reworked.  (Look at the EJUSTRETURN return path in ufs_lookup().)

Each of the directory ops (VOP_*) would thus need to upgrade the lock,
and then revalidate the directory entries and possibly fail.  Some
sort of common routine for this revalidation would make sense; your
design would make this per-filesystem and put it under the VOP layer..
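
One possible shape for such a common routine, as a userland sketch
only.  The toy_* names, the generation stamp stored with the
breadcrumbs, and the use of EAGAIN as the "go redo the lookup" error
are all hypothetical; none of this is existing UFS or VFS code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy directory node; gen is a hypothetical change counter. */
struct toy_dirnode {
	unsigned long gen;	/* bumped on every directory modification */
	bool excl;		/* true while held exclusively */
};

/* Breadcrumbs left by lookup, stamped with the generation they refer to. */
struct toy_hint {
	long slot_offset;	/* where a later direnter should put the name */
	unsigned long gen_seen;	/* directory generation at lookup time */
};

/* Called by the toy lookup while it holds only a shared lock. */
static void toy_record_hint(const struct toy_dirnode *dp, long slot_offset,
    struct toy_hint *h)
{
	h->slot_offset = slot_offset;
	h->gen_seen = dp->gen;
}

/* Common revalidation: call with the exclusive lock already held. */
static int toy_revalidate(const struct toy_dirnode *dp,
    const struct toy_hint *h)
{
	assert(dp->excl);
	return (dp->gen == h->gen_seen) ? 0 : EAGAIN;
}

/* A directory op: upgrade, revalidate, then either modify or fail. */
static int toy_direnter(struct toy_dirnode *dp, const struct toy_hint *h)
{
	int error;

	assert(!dp->excl);
	dp->excl = true;	/* stands in for the upgrade to exclusive */
	error = toy_revalidate(dp, h);
	if (error == 0)
		dp->gen++;	/* "insert" the name at h->slot_offset */
	dp->excl = false;
	return error;
}
```

Two lookups that recorded the same slot cannot both succeed: the first
toy_direnter() bumps the generation, so the second one fails
revalidation and has to look the name up again.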

BTW, count me in as part of the set of people willing to work on
making this change..  

					- Bill