tech-kern: Re: v_interlock/splbio protocol violations

Subject: Re: v_interlock/splbio protocol violations
To: Chuck Silvers <chuq@chuq.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 03/03/2004 20:36:50
--zhXaljGHf11kAtnf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Mar 03, 2004 at 07:15:25PM -0800, Chuck Silvers wrote:
> On Wed, Mar 03, 2004 at 04:12:35PM -0500, Darrin B. Jewell wrote:
> > I'm looking at ways around these problems.  Some possiblities
> > include:
> >   1. always take splbio before taking v_interlock.
> >      - impractical and huge imapct on other code
> >   2. split v_interlock into two separate interlocks, one to
> >      protect stuff used inside biodone and one for everything
> >      else.
> >   3. Move some or most of the work done by biodone into a separate
> >      helper thread context, similar to aiodoned, which can safely take
> >      the v_interlock.
> >=20
> > Right now, i'm favoring possibility #3, but I'm still examining the
> > problem.  I'm also considering putting v_dirtyblkhd and v_cleanblkhd
> > under the protection of the v_interlock and adding a new interlock
> > to protect syncer_workitem_pending/v_synclist.
>=20
> aiodoned is not something I'm especially happy with, I just did it that w=
ay
> for expediency.  I don't think we want to keep aiodoned or your proposed
> #3 biodoned in the longer term.  if a fix is really important in the
> short term, it's probably ok (since aiodoned doesn't seem to cause any
> problems really).

Why? How else will we fix this? We still have the problem that biodone()=20
wants to play with things in interrupt context that need locking. Solution=
=20
#3 gets the locking bit out of interrupt context. Even if we go with MP=20
interrupts-serviced-in-threads, I think we'll still want it as it lets us=
=20
reduce the need to sleep (waiting for the interlock) while interrupts are=
=20
disabled.

> I'm hoping that eventually we get some better MP locking primitives that =
don't
> require the calling code to know about the particular IPL prerequisites f=
or
> each lock acquisition.  then we would need to convert all the MP locking =
to
> use the new primitives once, but ideally after that we could change the MP
> locking implementation without changing all the callers again.
> I don't know if that's the discussion you wanted to have right now though.

That's a separate discussion.

Would it really help? Sure, I can see how we can tell each lock, "you
should be at spl X when locking." However I see a whole host of issues pop
up. 1) We need to use splraise() (not a biggie to be true). 2) If we
aquire a number of locks in a row, we can end up making redundant calls to
splraise that could have been avoided (ok, so an expert version could help
this).

3) (the biggie) what happens if you want to release spl-grabbing locks in
an order different from the LIFO of how you grabbed them? How do you make=
=20
sure you are still at the right slp levels both (a) while you still have=20
some but not all of the locks, and (b) when you release all the locks? And=
=20
how do you do that without exposing all the stuff you wanted to hide?

Take care,

Bill

--zhXaljGHf11kAtnf
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (NetBSD)

iD8DBQFARrJiWz+3JHUci9cRAj3xAJ9J4T0emyVEjb0eBFlq4LfxExZcGwCglUD0
NU+GgZa12I+nf6UdXokfACM=
=idao
-----END PGP SIGNATURE-----

--zhXaljGHf11kAtnf--