Subject: Re: Redoing file system suspension API (update)
To: None <tech-kern@netbsd.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 06/26/2006 14:31:44

On Mon, Jun 26, 2006 at 08:30:20PM +0200, Juergen Hannken-Illjes wrote:
> On Mon, Jun 26, 2006 at 09:43:59AM -0700, Bill Studenmund wrote:
> > On Sat, Jun 24, 2006 at 11:05:04AM +0200, Juergen Hannken-Illjes wrote:
> > > On Fri, Jun 23, 2006 at 05:14:00PM -0700, Bill Studenmund wrote:
> > > >
> > > > Especially if we go with the idea of having the file system grab and
> > > > release the mountpoint-transaction-lock (the current incarnation of the
> > > > "gate" you spoke of originally, the thing that makes us atomic w.r.t. a
> > > > snapshot), then it's not a problem. We just have the read and write code
> > > > in fifofs not take the transaction lock, and we're fine.
> >
> > I'm sorry, but this is an important point. I have the feeling it was
> > missed.
>
> Not sure I get it right: you mean taking the transaction lock for
> read/write/ioctl in every file system while taking it for other operations
> outside?
>
> Looks difficult to maintain.

How is it difficult to maintain?

The idea is that we only use transaction locks above the file system if we
have a real transaction.
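
A rough userspace sketch of what I mean, not kernel code. Every name in
it (mount_sketch, trans_begin, regular_write, fifo_write and so on) is
made up for illustration, and a real implementation would sleep where
this one returns a "must wait" result:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mount point; not the kernel's struct mount. */
struct mount_sketch {
	bool suspended;		/* a snapshot currently holds the "gate" */
	int active_trans;	/* transactions in progress on this mount */
};

enum io_result { IO_DONE, IO_MUST_WAIT };

/* Enter a transaction. Real code would sleep instead of returning false. */
static bool
trans_begin(struct mount_sketch *mp)
{
	if (mp->suspended)
		return false;
	mp->active_trans++;
	return true;
}

/* Leave a transaction previously entered with trans_begin(). */
static void
trans_end(struct mount_sketch *mp)
{
	assert(mp->active_trans > 0);
	mp->active_trans--;
}

/* Suspend for a snapshot; only possible with no transaction in flight. */
static bool
suspend_begin(struct mount_sketch *mp)
{
	if (mp->active_trans > 0)
		return false;
	mp->suspended = true;
	return true;
}

/* Regular file write: the file system itself takes the transaction lock. */
static enum io_result
regular_write(struct mount_sketch *mp)
{
	if (!trans_begin(mp))
		return IO_MUST_WAIT;
	/* ... update data and metadata atomically w.r.t. a snapshot ... */
	trans_end(mp);
	return IO_DONE;
}

/* fifofs-style write: may sleep forever, so it skips the transaction lock. */
static enum io_result
fifo_write(struct mount_sketch *mp)
{
	(void)mp;
	/* ... hand data to the reader; no on-disk state is touched here ... */
	return IO_DONE;
}
```

With the mount suspended, regular_write() reports that it has to wait,
while fifo_write() still completes, so a writer blocked on a fifo can
never hold up a snapshot.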

> > > > Such a change would limit our exposure to cases where someone is trying to
> > > > make a transaction that involves writing to or reading from a fifo. If you
> > > > try to do that, you get what you get.
> > >
> > > Both specfs and fifofs need special care because they are no real file systems.
> > > Their vnodes live in real file systems that may update meta data before or
> > > after they call operations on specfs/fifofs.  These updates need the transaction
> > > lock.  The real operations (as long as they dont go to disk devices) cannot
> > > keep this lock because they may sleep forever waiting for data.
> >
> > How many transactions, other than an actual write or read, will write or
> > read a fifo?
>
> Its not only fifo, it is also non-disk VCHR and VBLK devices.  This information
> is currently hidden outside VFS.
> Unlocking/relocking in specfs/fifofs would be the same it is now.  Currently
> there is already a VOP_UNLOCK/VOP_LOCK in specfs/fifofs.

vn_lock() isn't a transaction lock; we use it as an atomicity lock. So as
long as you don't unlock/relock in the middle of your atomic operation,
i.e. you unlock/lock either before or (if you're weird) after the "read"
or the "write", you're fine. I'm assuming POSIX atomicity here.

> > Thus if we move the transaction/snapshot logging into the read or write
> > call, we have fifofs and specfs skip that step, and we're fine.
>
> See above.  I think it is easier to maintain if we try to keep the transaction
> lock completely outside of file systems.

Then we need buckets of them. If I understood your earlier discussions, we
then need transaction locking around every caller into the VFS/VOP layer.
That seems messier to maintain.

> > What do other OSs do?
>
> No OS I know of has something like this.  Do you have a special OS in mind?

Yes, they don't have this. But other OSs handle snapshotting. How do they
handle the suspension? Do they bother? If so, how do they do it? If they
don't, why do we have a problem and they don't?

We're adding a new locking hierarchy. I think we should look at prior art
before we go too far. If we need the new hierarchy, we will do it. But
let's make sure we didn't overlook a cool idea somewhere else first.

Take care,

Bill
