Source-Changes-D archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys/kern

Andrew Doran wrote:
On Mon, Apr 20, 2009 at 10:09:55PM +0000, Elad Efrat wrote:

Module Name:    src
Committed By:   elad
Date:           Mon Apr 20 22:09:55 UTC 2009

Modified Files:
        src/sys/kern: kern_verifiedexec.c

Log Message:
PR/41251: YAMAMOTO Takashi: veriexec locking seems broken

Part 1: Take the mountlist_lock before traversing the mount list.

Thanks for looking into this. However, there are some problems with your

- mountlist_lock is a `leaf'. That is to say, it is never held for long and
  is never held for heavyweight operations. It also doesn't get intermixed
  with other locks.

- For similar reasons, I suspect your change may introduce a lock ordering
  between mountlist_lock and other locks, which is undesirable.

- The mountpoints you are looking at can be unmounted while you're examining
  them. The 'struct mount' won't disappear since you hold mountlist_lock,
  but see above.

The solution to this is vfs_busy+vfs_unbusy. Have a grep in kern/ for uses
of mountlist_lock. I think there are a few loops where all three are used
together. Basically, they give you this:

- vfs_busy() drops mountlist lock, but holds a reference to the mount for
  you, so it will not go away while you are inspecting it.

- vfs_unbusy() reacquires mountlist_lock, and lets you continue traversing
  the list from where you left off. It also drpos the reference.

- vfs_busy() locks the mount to prevent unmount().

Thanks for pointing these out! I did as you suggested -- please take a
look at the attached diff.

While looking around, a few questions came up:

  - Take a look at:

    It seems that if vnalloc() fails, we're not releasing the
    mountlist_mutex and not doing vfs_unbusy() -- is this a problem?
    (this holds for the other error paths in this function too)

  - It seems that the two functions that use CIRCLEQ_FOREACH() to
    traverse the mountlist don't vfs_busy()/vfs_unbusy(). Is this okay
    because they're supposed to be "light" users? (if yes, where do we
    draw the line between light/heavy uses? :)

  - Should I also convert to an "open coded" loop, or is the diff
    attached okay?

  - I'm not sure how much of a problem it is, but since we're dropping
    the mountlist lock in the middle, doesn't it mean it's possible to
    insert elements into the mountlist "before" elements we've already
    passed? (in other words, perhaps we should make the mountlist KPI
    only append to the list, rather than exposing it as a CIRCLEQ?)


Index: kern_verifiedexec.c
RCS file: /cvsroot/src/sys/kern/kern_verifiedexec.c,v
retrieving revision 1.113
diff -u -p -r1.113 kern_verifiedexec.c
--- kern_verifiedexec.c 20 Apr 2009 22:09:54 -0000      1.113
+++ kern_verifiedexec.c 22 Apr 2009 21:10:59 -0000
@@ -1539,8 +1539,16 @@ veriexec_dump(struct lwp *l, prop_array_
        CIRCLEQ_FOREACH(mp, &mountlist, mnt_list) {
+               struct mount *nmp;
+               /* If it fails, the file-system is [being] unmounted. */
+               if (vfs_busy(mp, &mnp) != 0)
+                       continue;
                fileassoc_table_run(mp, veriexec_hook,
                    (fileassoc_cb_t)veriexec_file_dump, rarray);
+               vfs_unbusy(mp, false, &nmp);
@@ -1555,11 +1563,18 @@ veriexec_flush(struct lwp *l)
        CIRCLEQ_FOREACH(mp, &mountlist, mnt_list) {
+               struct mount *nmp;
                int lerror;
+               /* If it fails, the file-system is [being] unmounted. */
+               if (vfs_busy(mp, &nmp) != 0)
+                       continue;
                lerror = veriexec_table_delete(l, mp);
                if (lerror && lerror != ENOENT)
                        error = lerror;
+               vfs_unbusy(mp, false, &nmp);

Home | Main Index | Thread Index | Old Index