Subject: Re: Redoing file system suspension API (update)
To: Jason Thorpe <thorpej@shagadelic.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 06/29/2006 20:54:54

On Thu, Jun 29, 2006 at 09:52:43PM +0200, Juergen Hannken-Illjes wrote:
> On Thu, Jun 29, 2006 at 11:04:08AM -0700, Jason Thorpe wrote:
> >
> > On Jun 29, 2006, at 9:54 AM, Juergen Hannken-Illjes wrote:
> >
> > >At least as a base.  Lockmgr locks lack the "do I have a shared
> > >lock" query.
> > >I need a lock where a thread already owning a shared lock succeeds
> > >when it wants another shared lock.  Last time I looked lockmgr locks
> > >had this only for exclusive locks.

Well, the other thing we'd need is that if we have the lock exclusive and
want it shared, we succeed.

> > Um, I'm pretty sure that works already with lockmgr locks (e.g.
> > recursively acquiring a shared lock you already hold shared).  Just
> > make sure the acquire/releases are paired up and you should be fine.
> >
> > Let's walk through the logic:
> >
> >         case LK_SHARED:
> >                 if (WEHOLDIT(lkp, pid, lid, cpu_num) == 0) {
> >
> > We fall into this case because the lock is held shared (WEHOLDIT()
> > returns false).
> >
> >                         /*
> >                          * If just polling, check to see if we will block.
> >                          */
> >                         if ((extflags & LK_NOWAIT) && (lkp->lk_flags &
> >                             (LK_HAVE_EXCL | LK_WANT_EXCL | LK_WANT_UPGRADE))) {
> >                                 error = EBUSY;
> >                                 break;
> >                         }
> >                         /*
> >                          * Wait for exclusive locks and upgrades to clear.
> >                          */
> >                         error = acquire(&lkp, &s, extflags, 0,
> >                             LK_HAVE_EXCL | LK_WANT_EXCL | LK_WANT_UPGRADE);
> >
> > We wait until no exclusive holders are in / waiting.
>
> And here we are.  I need shared lock recursion even if another thread wants
> an exclusive lock.  The transaction lock will work like:
>
> 	xxx_read()
> 	{
> 		vn_trans_lock()  <-- (1)
> 		...
> 		xxx_getpages()
> 		...
> 		vn_trans_unlock()
> 	}
>
> and
>
> 	xxx_getpages()
> 	{
> 		vn_trans_lock()  <-- (2)
> 		...
> 		vn_trans_unlock()
> 	}
>
> When waiting at (2) the lock at (1) will never release.

Ok. That's a mess.

We should look at what Solaris does.

I was thinking of a fix, then realized it won't work. Then I thought of
another, and it won't work either.

To really make this work, we would probably need to retool some of how the
UBC code works. We would need to differentiate between a page fault due to
a program (or some random old kernel routine) accessing memory and a page
fault due to VOP_READ() or VOP_WRITE() accessing memory it just mapped in.

The difference is that the first isn't in a transaction, and so needs to
implicitly start one. At the uvm level, however, the latter already is in
a transaction. So starting one will have the deadlock that you describe.


I'm not sure getpages is the best example: while we would LIKE to slow
down reads to let writes and a sync complete, that's more a matter of i/o
scheduling. We won't break a snapshot with it. I do, however, expect there
are other examples lurking around that need addressing, and so your point
is quite valid.


I think the thing to do is have internal and external interface calls. We
only do transaction games at the external calls, and the internal calls
assume the right barrier was taken at the exterior.
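
To make that concrete, here is a rough userland sketch of the split.
Everything in it is hypothetical: trans_enter()/trans_exit() just count
holders to stand in for the real transaction lock, and the
xxx_getpages_internal() name is mine, not existing code. The point is only
that the lock is taken once at the outer entry point, so the (1)/(2)
recursion from your example never happens.

```c
#include <assert.h>

/* Hypothetical stand-in for the transaction lock: it only counts holders. */
static int trans_holders;
static int trans_peak;		/* highest nesting depth ever observed */

static void
trans_enter(void)
{
	trans_holders++;
	if (trans_holders > trans_peak)
		trans_peak = trans_holders;
}

static void
trans_exit(void)
{
	assert(trans_holders > 0);
	trans_holders--;
}

/* Internal call: assumes the caller already took the barrier. */
static int
xxx_getpages_internal(void)
{
	assert(trans_holders > 0);
	return 0;
}

/* External call, e.g. a fault from a program outside any VOP. */
int
xxx_getpages(void)
{
	int error;

	trans_enter();
	error = xxx_getpages_internal();
	trans_exit();
	return error;
}

/* External call that needs pages: uses the internal variant directly. */
int
xxx_read(void)
{
	int error;

	trans_enter();			/* the only acquisition on this path */
	error = xxx_getpages_internal();
	trans_exit();
	return error;
}
```

Neither path ever holds the lock more than once, so a waiting exclusive
request cannot wedge a thread against itself.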

We however still need some way to mark that we've got a transaction lock
as the deadlock between starting a transaction (getting the trans lock)
and a VOP_ entry (and the external interface locking I mention above)
still remains.

I don't think lockmgr locks will do this well, as they don't track lock
ownership for shared holders.
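
Something along these lines is what I have in mind for the marking. It is
a simplified, non-blocking model with made-up names (struct trans_thread,
struct trans_lock, trans_lock_shared()); a real version would sleep
instead of returning EBUSY and would keep the depth in per-LWP state. A
thread that is already inside the transaction, or that holds the lock
exclusive, gets its shared request granted even when an exclusive waiter
is queued.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-thread state; in a kernel this would live per-LWP. */
struct trans_thread {
	int depth;			/* nested holds by this thread */
};

/* Hypothetical transaction lock; not the lockmgr structure. */
struct trans_lock {
	int sharecount;			/* threads holding it shared */
	int want_excl;			/* a suspender is waiting */
	struct trans_thread *excl_owner; /* exclusive holder, if any */
};

int
trans_lock_shared(struct trans_lock *tl, struct trans_thread *td)
{
	/* Recursion, or shared-while-exclusive: always succeeds. */
	if (td->depth > 0 || tl->excl_owner == td) {
		td->depth++;
		return 0;
	}
	/* New entrants yield to exclusive holders and waiters. */
	if (tl->want_excl || tl->excl_owner != NULL)
		return EBUSY;
	tl->sharecount++;
	td->depth = 1;
	return 0;
}

void
trans_unlock_shared(struct trans_lock *tl, struct trans_thread *td)
{
	assert(td->depth > 0);
	/* Only the outermost release of a real shared hold drops the count. */
	if (--td->depth == 0 && tl->excl_owner != td)
		tl->sharecount--;
}
```

With this, the acquisition at (2) succeeds because the thread's depth is
already nonzero from (1), while a thread arriving fresh still waits behind
the exclusive request.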

Take care,

Bill
