NetBSD vs Solaris condvar semantics

To: tech-kern%NetBSD.org@localhost
Subject: NetBSD vs Solaris condvar semantics
From: Taylor R Campbell <campbell+netbsd-tech-kern%mumble.net@localhost>
Date: Sun, 14 Oct 2012 07:20:02 +0000

I'm working on fixing ZFS locking, and I ran into a difference between
NetBSD's and Solaris's interpretation of condvars.

In Solaris, it seems to be kosher to do

   cv_broadcast(cv);
   cv_destroy(cv);

at least if waiters use only cv_wait and not cv_wait_sig &c.  That
idiom makes NetBSD very unhappy, though, because cv_wait wants to
continue using cv after it is woken, at which point cv may already be
destroyed.

ZFS makes nontrivial use of this idiom, such as in dirent locks and
range locks.  One way to work around this is to do reference counting
for the condvars; that is, to change

   cv_wait(&frob->cv);

into

   frob_hold(frob);
   cv_wait(&frob->cv);
   frob_rele(frob);

and to change

   cv_destroy(&frob->cv);

into

   frob_rele(frob);

so if there are any waiters, we just let the last waiter destroy it.
This is what I started to do for dirent locks, until I saw that there
are more uses of the idiom, and I don't know how many others I'll come
across.
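
For concreteness, here is a minimal userland sketch of the
reference-counting workaround described above.  It uses POSIX threads
rather than the kernel condvar API, so it only illustrates the
hold/wait/rele pattern and is not the actual ZFS change.  Only
frob_hold and frob_rele come from the text; the layout of struct frob,
the ready flag, and frob_create, frob_wait, frob_finish and frob_demo
are hypothetical names made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object that owns a condvar; the field names are assumptions. */
struct frob {
	pthread_mutex_t lock;
	pthread_cond_t cv;
	atomic_int refcnt;	/* one reference for the owner, one per waiter */
	bool ready;		/* condition the waiters sleep on */
};

static atomic_int frobs_destroyed;	/* counts destructions, for the demo only */

static struct frob *frob_create(void)
{
	struct frob *f = malloc(sizeof(*f));
	assert(f != NULL);
	pthread_mutex_init(&f->lock, NULL);
	pthread_cond_init(&f->cv, NULL);
	atomic_init(&f->refcnt, 1);	/* the owner's reference */
	f->ready = false;
	return f;
}

static void frob_hold(struct frob *f)
{
	atomic_fetch_add(&f->refcnt, 1);
}

/* Drop a reference; whoever drops the last one destroys the condvar. */
static void frob_rele(struct frob *f)
{
	if (atomic_fetch_sub(&f->refcnt, 1) == 1) {
		pthread_cond_destroy(&f->cv);
		pthread_mutex_destroy(&f->lock);
		free(f);
		atomic_fetch_add(&frobs_destroyed, 1);
	}
}

/* Waiter side: hold, wait, rele. */
static void frob_wait(struct frob *f)
{
	pthread_mutex_lock(&f->lock);
	frob_hold(f);
	while (!f->ready)
		pthread_cond_wait(&f->cv, &f->lock);
	pthread_mutex_unlock(&f->lock);
	frob_rele(f);		/* may be the last reference */
}

/* Owner side: broadcast, then rele where the original code destroyed. */
static void frob_finish(struct frob *f)
{
	pthread_mutex_lock(&f->lock);
	f->ready = true;
	pthread_cond_broadcast(&f->cv);
	pthread_mutex_unlock(&f->lock);
	frob_rele(f);
}

static void *waiter(void *arg)
{
	frob_wait(arg);
	return NULL;
}

/* Run nwaiters waiters against one frob; return how many times it was destroyed. */
int frob_demo(int nwaiters)
{
	int before = atomic_load(&frobs_destroyed);
	struct frob *f = frob_create();
	pthread_t *tids = calloc((size_t)nwaiters + 1, sizeof(*tids));
	assert(tids != NULL);

	for (int i = 0; i < nwaiters; i++)
		pthread_create(&tids[i], NULL, waiter, f);
	/* Demo only: wait until every waiter has taken its hold. */
	while (atomic_load(&f->refcnt) < nwaiters + 1)
		sched_yield();
	frob_finish(f);
	for (int i = 0; i < nwaiters; i++)
		pthread_join(tids[i], NULL);
	free(tids);
	return atomic_load(&frobs_destroyed) - before;
}
```

Calling frob_demo(4) from a test program should report exactly one
destruction, performed by whichever thread drops the last reference.
Each thread unlocks before calling frob_rele, so nobody is still inside
the mutex or the condvar when they are torn down.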

Alternatively, we could instead allow this idiom, and leave the ZFS
use of it as is, with a minor change to cv_wait in kern_condvar.c,
since it doesn't actually need cv for anything after waking.  (The
change to rump would be a trifle less minor.)

How tasteless would it be to change cv_wait to allow this idiom?
Should I just continue converting it to reference counts?
