tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Anomalies while handling p_nstopchild count



    Date:        Tue, 13 Oct 2015 02:27:47 +0000
    From:        Taylor R Campbell <campbell+netbsd-tech-kern%mumble.net@localhost>
    Message-ID:  <20151013022635.578F5607B8%jupiter.mumble.net@localhost>


  | (a) The protocol is that under proc_lock but *not* p->p_lock, p_waited
  | must be written before p_stat, and p_stat must be read before
  | p_waited, as an optimization to avoid mutex/atomics.

If that's really the protocol, it is violated at least as often as not
(p_waited is probably more often set after p_stat = SSTOP than before).
[Aside: if p_stat is set to anything other than SSTOP, p_waited is
irrelevant, so case of p_stat = SSTOP; is the only interesting one.]

I suspect that the real intent is that even though p_stat is controlled
in general by p_lock, and not proc_lock, for the case of setting p_stat
to SSTOP, proc_lock must be held as well.  proc_lock is required when
setting p_waited.

After Paul's recent fixes, from what I can see, that is now true everywhere,
except in the one case in kern_exit.c where proc_lock is released
immediately before p_stat = SSTOP;   It would probably be worthwhile to
reverse the order of those two operations (and cannot possibly hurt.)

For accesses, proc_lock is always held (with 1, or perhaps 3, exception(s)
I will get to below) before p_waited is tested (and before p_stat is compared
to SSTOP, at least where p_waited will be tested as well.)

With this, I believe that everything is OK, the mutex_enter() and
mutex_exit() that must occur between writing p_stat and p_waited, and
them being read will guarantee consistent results, regardless of
which order they're actually written or read.   That is probably true
even without swapping the order of p_stat = SSTOP; and mutex_exit(&proc_lock)
in kern_exit.c

The one exception is in compat/linux/arch/*/linux_ptrace.c where p_stat
and p_waited are read (p_stat tested for SSTOP) without proc_lock held.

That is one exception, or three, depending how you count it (the '*'
expands to arm, i386, and powerpc - though all have identical code.)

I'd suggest applying one of the following two (untested) alternative patches
(after validating them) to each of those three files (or better, move the
common code someplace else, apply the patch there, once, and #include it
in the three files...)

Fix choice one ... hold proc_lock longer...

--- linux_ptrace.c	2014-11-10 14:47:40.000000000 +0700
+++ linux_ptrace.c_FIX1	2015-10-13 11:02:27.000000000 +0700
@@ -185,7 +185,6 @@
 		goto out;
 	}
 	mutex_enter(t->p_lock);
-	mutex_exit(proc_lock);
 
 	/*
 	 * You cannot do what you want to the process if:
@@ -193,6 +192,7 @@
 	 */
 	if (!ISSET(t->p_slflag, PSL_TRACED)) {
 		mutex_exit(t->p_lock);
+		mutex_exit(proc_lock);
 		error = EPERM;
 		goto out;
 	}
@@ -205,9 +205,11 @@
 	if (ISSET(t->p_slflag, PSL_FSTRACE) || t->p_pptr != p ||
 	    t->p_stat != SSTOP || !t->p_waited) {
 		mutex_exit(t->p_lock);
+		mutex_exit(proc_lock);
 		error = EBUSY;
 		goto out;
 	}
+	mutex_exit(proc_lock);
 	/* XXX: ptrace needs revamp for multi-threading support. */
 	if (t->p_nlwps > 1) {
 		mutex_exit(t->p_lock);

Fix choice two .. read p_stat and p_waited while proc_lock remains
held, but test the values read later, after it has been released...

--- linux_ptrace.c	2014-11-10 14:47:40.000000000 +0700
+++ linux_ptrace.c_FIX2	2015-10-13 11:04:13.000000000 +0700
@@ -137,6 +137,7 @@
 	struct linux_emuldata *led;
 	int request, error, addr;
 	size_t fp_size;
+	int p_stat, p_waited;
 
 	if (linux_ptrace_disabled)
 		return ENOSYS;
@@ -185,6 +186,8 @@
 		goto out;
 	}
 	mutex_enter(t->p_lock);
+	p_stat = t->p_stat;
+	p_waited = t->p_waited;
 	mutex_exit(proc_lock);
 
 	/*
@@ -203,7 +206,7 @@
 	 * 4. It is not currently stopped.
 	 */
 	if (ISSET(t->p_slflag, PSL_FSTRACE) || t->p_pptr != p ||
-	    t->p_stat != SSTOP || !t->p_waited) {
+	    p_stat != SSTOP || !p_waited) {
 		mutex_exit(t->p_lock);
 		error = EBUSY;
 		goto out;


I doubt that anyone will suffer if this is not done however...

kre



Home | Main Index | Thread Index | Old Index